===============================
MCJIT Design and Implementation
===============================

Introduction
============

This document describes the internal workings of the MCJIT execution
engine and the RuntimeDyld component.  It is intended as a high level
overview of the implementation, showing the flow and interactions of
objects throughout the code generation and dynamic loading process.

Engine Creation
===============

In most cases, an EngineBuilder object is used to create an instance of
the MCJIT execution engine.  The EngineBuilder takes an llvm::Module
object as an argument to its constructor.  The client may then set various
options that will later be passed along to the MCJIT engine, including
the selection of MCJIT as the engine type to be created.
Of particular interest is the EngineBuilder::setMCJITMemoryManager
function.  If the client does not explicitly create a memory manager at
this time, a default memory manager (specifically SectionMemoryManager)
will be created when the MCJIT engine is instantiated.

Once the options have been set, a client calls EngineBuilder::create to
create an instance of the MCJIT engine.  If the client does not use the
form of this function that takes a TargetMachine as a parameter, a new
TargetMachine will be created based on the target triple associated with
the Module that was used to create the EngineBuilder.

.. image:: MCJIT-engine-builder.png

EngineBuilder::create will call the static MCJIT::createJIT function,
passing in its pointers to the module, memory manager and target machine
objects, all of which will subsequently be owned by the MCJIT object.

The MCJIT class has a member variable, Dyld, which contains an instance of
the RuntimeDyld wrapper class.  This member will be used for
communications between MCJIT and the actual RuntimeDyldImpl object that
gets created when an object is loaded.

.. image:: MCJIT-creation.png

Upon creation, MCJIT holds a pointer to the Module object that it received
from EngineBuilder but it does not immediately generate code for this
module.  Code generation is deferred until either the
MCJIT::finalizeObject method is called explicitly or a function such as
MCJIT::getPointerToFunction is called which requires the code to have been
generated.

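The deferral described above can be illustrated with a small toy model.
This is only a sketch under stated assumptions: LazyEngine, FnPtr,
generateCodeIfNeeded and timesGenerated are hypothetical names invented
for this example, and "code generation" is reduced to registering one
prebuilt function.  It does not use or reproduce the LLVM classes.

.. code-block:: c++

   #include <map>
   #include <string>

   // Hypothetical stand-in for an execution engine.  This is not the LLVM
   // MCJIT class; it only models the "generate on first need" behaviour.
   class LazyEngine {
   public:
     using FnPtr = int (*)();

     // Explicit trigger, in the spirit of MCJIT::finalizeObject.
     void finalizeObject() { generateCodeIfNeeded(); }

     // Implicit trigger, in the spirit of MCJIT::getPointerToFunction.
     FnPtr getPointerToFunction(const std::string &Name) {
       generateCodeIfNeeded();
       auto It = Functions.find(Name);
       return It == Functions.end() ? nullptr : It->second;
     }

     // Number of times code generation actually ran (0 or 1).
     int timesGenerated() const { return GenerationCount; }

   private:
     void generateCodeIfNeeded() {
       if (GenerationCount > 0)
         return; // code already exists; nothing to do
       // "Code generation" here just registers one prebuilt function.
       Functions["answer"] = [] { return 42; };
       ++GenerationCount;
     }

     std::map<std::string, FnPtr> Functions;
     int GenerationCount = 0;
   };

Constructing a LazyEngine does no work.  The first call to either
finalizeObject or getPointerToFunction runs the generation step, and later
calls reuse the result.
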
Code Generation
===============

When code generation is triggered, as described above, MCJIT will first
attempt to retrieve an object image from its ObjectCache member, if one
has been set.  If a cached object image cannot be retrieved, MCJIT will
call its emitObject method.  MCJIT::emitObject uses a local PassManager
instance and creates a new ObjectBufferStream instance, both of which it
passes to TargetMachine::addPassesToEmitMC before calling PassManager::run
on the Module with which it was created.

.. image:: MCJIT-load.png

The PassManager::run call causes the MC code generation mechanisms to emit
a complete relocatable binary object image (in either ELF or MachO
format, depending on the target) into the ObjectBufferStream object, which
is flushed to complete the process.  If an ObjectCache is being used, the
image will be passed to the ObjectCache here.

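The cache-then-compile flow above can be sketched as follows.  All names
here (ToyObjectCache, ToyJIT, ObjectBytes, lookup, store, getObject,
CompileCount) are hypothetical and chosen for illustration, and the object
image is modelled as a plain string instead of a real ELF or MachO image.

.. code-block:: c++

   #include <map>
   #include <string>

   // Hypothetical: the raw bytes of an object image, modelled as a string.
   using ObjectBytes = std::string;

   // Toy cache keyed by a module identifier.
   class ToyObjectCache {
   public:
     // Returns true and fills Out if an image for ModuleId is cached.
     bool lookup(const std::string &ModuleId, ObjectBytes &Out) const {
       auto It = Store.find(ModuleId);
       if (It == Store.end())
         return false;
       Out = It->second;
       return true;
     }

     // Remembers a freshly emitted image for later runs.
     void store(const std::string &ModuleId, const ObjectBytes &Obj) {
       Store[ModuleId] = Obj;
     }

   private:
     std::map<std::string, ObjectBytes> Store;
   };

   struct ToyJIT {
     ToyObjectCache *Cache = nullptr; // optional, may stay null
     int CompileCount = 0;

     // Stand-in for the expensive code generation step.
     ObjectBytes emitObject(const std::string &ModuleId) {
       ++CompileCount;
       return "obj:" + ModuleId;
     }

     // Try the cache first, fall back to emitObject, then feed the cache.
     ObjectBytes getObject(const std::string &ModuleId) {
       ObjectBytes Obj;
       if (Cache && Cache->lookup(ModuleId, Obj))
         return Obj;
       Obj = emitObject(ModuleId);
       if (Cache)
         Cache->store(ModuleId, Obj);
       return Obj;
     }
   };

With a cache attached, a second request for the same module returns the
stored image and CompileCount stays at 1.  Without a cache, every request
goes through emitObject.
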
At this point, the ObjectBufferStream contains the raw object image.
Before the code can be executed, the code and data sections from this
image must be loaded into suitable memory, relocations must be applied,
memory permissions must be set, and the code cache must be invalidated
(if required).

Object Loading
==============

Once an object image has been obtained, either through code generation or
having been retrieved from an ObjectCache, it is passed to RuntimeDyld to
be loaded.  The RuntimeDyld wrapper class examines the object to determine
its file format and creates an instance of either RuntimeDyldELF or
RuntimeDyldMachO (both of which derive from the RuntimeDyldImpl base
class) and calls the RuntimeDyldImpl::loadObject method to perform the
actual loading.

.. image:: MCJIT-dyld-load.png

RuntimeDyldImpl::loadObject begins by creating an ObjectImage instance
from the ObjectBuffer it received.  ObjectImage, which wraps the
ObjectFile class, is a helper class which parses the binary object image
and provides access to the information contained in the format-specific
headers, including section, symbol and relocation information.

RuntimeDyldImpl::loadObject then iterates through the symbols in the
image.  Information about common symbols is collected for later use.  For
each function or data symbol, the associated section is loaded into memory
and the symbol is stored in a symbol table map data structure.  When the
iteration is complete, a section is emitted for the common symbols.

Next, RuntimeDyldImpl::loadObject iterates through the sections in the
object image and for each section iterates through the relocations for
that section.  For each relocation, it calls the format-specific
processRelocationRef method, which will examine the relocation and store
it in one of two data structures, a section-based relocation list map and
an external symbol relocation map.

.. image:: MCJIT-load-object.png

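The two relocation containers mentioned above can be pictured with a
small sketch.  It assumes a heavily simplified relocation record, and the
names RelocEntry, RelocTables, BySection, ByExternalSymbol and
recordRelocation are hypothetical.  The real processRelocationRef
implementations are format-specific and considerably more involved.

.. code-block:: c++

   #include <cstdint>
   #include <map>
   #include <string>
   #include <vector>

   // Hypothetical, simplified relocation record: where the fix-up is
   // written (section and offset) plus a constant addend.
   struct RelocEntry {
     unsigned SectionID;
     std::uint64_t Offset;
     std::int64_t Addend;
   };

   struct RelocTables {
     // Keyed by the section that contains the referenced symbol.
     std::map<unsigned, std::vector<RelocEntry>> BySection;
     // Keyed by the name of a symbol that this object does not define.
     std::map<std::string, std::vector<RelocEntry>> ByExternalSymbol;
   };

   // LocalSymbols maps each symbol defined in this object to the ID of
   // the section that holds it.  Anything else is treated as external.
   void recordRelocation(RelocTables &Tables,
                         const std::map<std::string, unsigned> &LocalSymbols,
                         const std::string &TargetSymbol, RelocEntry RE) {
     auto It = LocalSymbols.find(TargetSymbol);
     if (It != LocalSymbols.end())
       Tables.BySection[It->second].push_back(RE);
     else
       Tables.ByExternalSymbol[TargetSymbol].push_back(RE);
   }

In this sketch a relocation against a locally defined symbol is filed
under the section that holds the symbol, while a relocation against an
unknown name is held back until the external symbol can be looked up.
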
When RuntimeDyldImpl::loadObject returns, all of the code and data
sections for the object will have been loaded into memory allocated by the
memory manager and relocation information will have been prepared, but the
relocations have not yet been applied and the generated code is still not
ready to be executed.

[Currently (as of August 2013) the MCJIT engine will immediately apply
relocations when loadObject completes.  However, this shouldn't be
happening.  Because the code may have been generated for a remote target,
the client should be given a chance to re-map the section addresses before
relocations are applied.  It is possible to apply relocations multiple
times, but in the case where addresses are to be re-mapped, this first
application is wasted effort.]

Address Remapping
=================

At any time after initial code has been generated and before
finalizeObject is called, the client can remap the address of sections in
the object.  Typically this is done because the code was generated for an
external process and is being mapped into that process' address space.
The client remaps the section address by calling MCJIT::mapSectionAddress.
This should happen before the section memory is copied to its new
location.

When MCJIT::mapSectionAddress is called, MCJIT passes the call on to
RuntimeDyldImpl (via its Dyld member).  RuntimeDyldImpl stores the new
address in an internal data structure but does not update the code at this
time, since other sections are likely to change.

When the client is finished remapping section addresses, it will call
MCJIT::finalizeObject to complete the remapping process.

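The "record now, patch later" behaviour can be modelled with a toy loader.
This is a sketch under stated assumptions: ToySection, ToyLoader,
addSection, loadAddress and bytes are hypothetical names, and sections are
identified here by an index chosen for the example, which is not
necessarily how the real mapSectionAddress identifies them.

.. code-block:: c++

   #include <cstddef>
   #include <cstdint>
   #include <vector>

   // Hypothetical section record: the host-side copy of the bytes plus
   // the address at which the section is expected to execute.
   struct ToySection {
     std::vector<std::uint8_t> Bytes;
     std::uint64_t LoadAddress = 0;
   };

   class ToyLoader {
   public:
     // Allocates a zero-filled section; by default it runs where it lives.
     unsigned addSection(std::size_t Size) {
       Sections.emplace_back();
       ToySection &S = Sections.back();
       S.Bytes.assign(Size, 0);
       S.LoadAddress = static_cast<std::uint64_t>(
           reinterpret_cast<std::uintptr_t>(S.Bytes.data()));
       return static_cast<unsigned>(Sections.size() - 1);
     }

     // Only records the new address.  The section bytes are not touched,
     // because other sections may still be remapped afterwards.
     void mapSectionAddress(unsigned ID, std::uint64_t TargetAddress) {
       Sections[ID].LoadAddress = TargetAddress;
     }

     std::uint64_t loadAddress(unsigned ID) const {
       return Sections[ID].LoadAddress;
     }

     const std::vector<std::uint8_t> &bytes(unsigned ID) const {
       return Sections[ID].Bytes;
     }

   private:
     std::vector<ToySection> Sections;
   };

After a call such as mapSectionAddress(Text, 0x400000), loadAddress
reports the new value while the bytes of the section stay exactly as they
were.  Any patching is left to a later relocation step.
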
Final Preparations
==================

When MCJIT::finalizeObject is called, MCJIT calls
RuntimeDyld::resolveRelocations.  This function will attempt to locate any
external symbols and then apply all relocations for the object.

External symbols are resolved by calling the memory manager's
getPointerToNamedFunction method.  The memory manager will return the
address of the requested symbol in the target address space.  (Note, this
may not be a valid pointer in the host process.)  RuntimeDyld will then
iterate through the list of relocations it has stored which are associated
with this symbol and invoke the resolveRelocation method which, through a
format-specific implementation, will apply the relocation to the loaded
section memory.

Next, RuntimeDyld::resolveRelocations iterates through the list of
sections and for each section iterates through the list of saved
relocations that reference that section, calling resolveRelocation for
each entry in this list.  The relocation list here is a list of
relocations for which the symbol associated with the relocation is located
in the section associated with the list.  Each of these relocations will
have a target location at which the relocation will be applied that is
likely located in a different section.

.. image:: MCJIT-resolve-relocations.png

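Applying a relocation amounts to writing a computed address into section
memory.  The sketch below assumes the simplest possible case, an absolute
64-bit fix-up stored in host byte order.  ToyReloc, resolveRelocation (as
a free function), resolveAll and readAddress are hypothetical names for
this example; real relocation types vary by object format and target.

.. code-block:: c++

   #include <cassert>
   #include <cstddef>
   #include <cstdint>
   #include <cstring>
   #include <vector>

   // Hypothetical fix-up record: byte offset inside the section that is
   // being patched, plus a constant addend.
   struct ToyReloc {
     std::size_t Offset;
     std::int64_t Addend;
   };

   // Writes SymbolAddress + Addend as an absolute 64-bit value.
   void resolveRelocation(std::vector<std::uint8_t> &SectionBytes,
                          const ToyReloc &RE, std::uint64_t SymbolAddress) {
     assert(RE.Offset + sizeof(std::uint64_t) <= SectionBytes.size());
     std::uint64_t Value =
         SymbolAddress + static_cast<std::uint64_t>(RE.Addend);
     std::memcpy(SectionBytes.data() + RE.Offset, &Value, sizeof(Value));
   }

   // Applies every saved fix-up that refers to the same symbol.
   void resolveAll(std::vector<std::uint8_t> &SectionBytes,
                   const std::vector<ToyReloc> &Relocs,
                   std::uint64_t SymbolAddress) {
     for (const ToyReloc &RE : Relocs)
       resolveRelocation(SectionBytes, RE, SymbolAddress);
   }

   // Reads back a patched 64-bit value, for inspection.
   std::uint64_t readAddress(const std::vector<std::uint8_t> &SectionBytes,
                             std::size_t Offset) {
     std::uint64_t Value = 0;
     std::memcpy(&Value, SectionBytes.data() + Offset, sizeof(Value));
     return Value;
   }

Because SymbolAddress is just a number, it can be an address in another
process.  That is why, in this model, remapping only has to change the
value passed in and the fix-ups can simply be applied again.
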
Once relocations have been applied as described above, MCJIT calls
RuntimeDyld::getEHFrameSection, and if a non-zero result is returned
passes the section data to the memory manager's registerEHFrames method.
This allows the memory manager to call any desired target-specific
functions, such as registering the EH frame information with a debugger.

Finally, MCJIT calls the memory manager's finalizeMemory method.  In this
method, the memory manager will invalidate the target code cache, if
necessary, and apply final permissions to the memory pages it has
allocated for code and data memory.