123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180 |
- ===============================
- MCJIT Design and Implementation
- ===============================
- Introduction
- ============
- This document describes the internal workings of the MCJIT execution
- engine and the RuntimeDyld component. It is intended as a high level
- overview of the implementation, showing the flow and interactions of
- objects throughout the code generation and dynamic loading process.
- Engine Creation
- ===============
- In most cases, an EngineBuilder object is used to create an instance of
- the MCJIT execution engine. The EngineBuilder takes an llvm::Module
- object as an argument to its constructor. The client may then set various
- options that we control the later be passed along to the MCJIT engine,
- including the selection of MCJIT as the engine type to be created.
- Of particular interest is the EngineBuilder::setMCJITMemoryManager
- function. If the client does not explicitly create a memory manager at
- this time, a default memory manager (specifically SectionMemoryManager)
- will be created when the MCJIT engine is instantiated.
- Once the options have been set, a client calls EngineBuilder::create to
- create an instance of the MCJIT engine. If the client does not use the
- form of this function that takes a TargetMachine as a parameter, a new
- TargetMachine will be created based on the target triple associated with
- the Module that was used to create the EngineBuilder.
- .. image:: MCJIT-engine-builder.png
-
- EngineBuilder::create will call the static MCJIT::createJIT function,
- passing in its pointers to the module, memory manager and target machine
- objects, all of which will subsequently be owned by the MCJIT object.
- The MCJIT class has a member variable, Dyld, which contains an instance of
- the RuntimeDyld wrapper class. This member will be used for
- communications between MCJIT and the actual RuntimeDyldImpl object that
- gets created when an object is loaded.
- .. image:: MCJIT-creation.png
-
- Upon creation, MCJIT holds a pointer to the Module object that it received
- from EngineBuilder but it does not immediately generate code for this
- module. Code generation is deferred until either the
- MCJIT::finalizeObject method is called explicitly or a function such as
- MCJIT::getPointerToFunction is called which requires the code to have been
- generated.
- Code Generation
- ===============
- When code generation is triggered, as described above, MCJIT will first
- attempt to retrieve an object image from its ObjectCache member, if one
- has been set. If a cached object image cannot be retrieved, MCJIT will
- call its emitObject method. MCJIT::emitObject uses a local PassManager
- instance and creates a new ObjectBufferStream instance, both of which it
- passes to TargetMachine::addPassesToEmitMC before calling PassManager::run
- on the Module with which it was created.
- .. image:: MCJIT-load.png
-
- The PassManager::run call causes the MC code generation mechanisms to emit
- a complete relocatable binary object image (either in either ELF or MachO
- format, depending on the target) into the ObjectBufferStream object, which
- is flushed to complete the process. If an ObjectCache is being used, the
- image will be passed to the ObjectCache here.
- At this point, the ObjectBufferStream contains the raw object image.
- Before the code can be executed, the code and data sections from this
- image must be loaded into suitable memory, relocations must be applied and
- memory permission and code cache invalidation (if required) must be completed.
- Object Loading
- ==============
- Once an object image has been obtained, either through code generation or
- having been retrieved from an ObjectCache, it is passed to RuntimeDyld to
- be loaded. The RuntimeDyld wrapper class examines the object to determine
- its file format and creates an instance of either RuntimeDyldELF or
- RuntimeDyldMachO (both of which derive from the RuntimeDyldImpl base
- class) and calls the RuntimeDyldImpl::loadObject method to perform that
- actual loading.
- .. image:: MCJIT-dyld-load.png
-
- RuntimeDyldImpl::loadObject begins by creating an ObjectImage instance
- from the ObjectBuffer it received. ObjectImage, which wraps the
- ObjectFile class, is a helper class which parses the binary object image
- and provides access to the information contained in the format-specific
- headers, including section, symbol and relocation information.
- RuntimeDyldImpl::loadObject then iterates through the symbols in the
- image. Information about common symbols is collected for later use. For
- each function or data symbol, the associated section is loaded into memory
- and the symbol is stored in a symbol table map data structure. When the
- iteration is complete, a section is emitted for the common symbols.
- Next, RuntimeDyldImpl::loadObject iterates through the sections in the
- object image and for each section iterates through the relocations for
- that sections. For each relocation, it calls the format-specific
- processRelocationRef method, which will examine the relocation and store
- it in one of two data structures, a section-based relocation list map and
- an external symbol relocation map.
- .. image:: MCJIT-load-object.png
-
- When RuntimeDyldImpl::loadObject returns, all of the code and data
- sections for the object will have been loaded into memory allocated by the
- memory manager and relocation information will have been prepared, but the
- relocations have not yet been applied and the generated code is still not
- ready to be executed.
- [Currently (as of August 2013) the MCJIT engine will immediately apply
- relocations when loadObject completes. However, this shouldn't be
- happening. Because the code may have been generated for a remote target,
- the client should be given a chance to re-map the section addresses before
- relocations are applied. It is possible to apply relocations multiple
- times, but in the case where addresses are to be re-mapped, this first
- application is wasted effort.]
- Address Remapping
- =================
- At any time after initial code has been generated and before
- finalizeObject is called, the client can remap the address of sections in
- the object. Typically this is done because the code was generated for an
- external process and is being mapped into that process' address space.
- The client remaps the section address by calling MCJIT::mapSectionAddress.
- This should happen before the section memory is copied to its new
- location.
- When MCJIT::mapSectionAddress is called, MCJIT passes the call on to
- RuntimeDyldImpl (via its Dyld member). RuntimeDyldImpl stores the new
- address in an internal data structure but does not update the code at this
- time, since other sections are likely to change.
- When the client is finished remapping section addresses, it will call
- MCJIT::finalizeObject to complete the remapping process.
- Final Preparations
- ==================
- When MCJIT::finalizeObject is called, MCJIT calls
- RuntimeDyld::resolveRelocations. This function will attempt to locate any
- external symbols and then apply all relocations for the object.
- External symbols are resolved by calling the memory manager's
- getPointerToNamedFunction method. The memory manager will return the
- address of the requested symbol in the target address space. (Note, this
- may not be a valid pointer in the host process.) RuntimeDyld will then
- iterate through the list of relocations it has stored which are associated
- with this symbol and invoke the resolveRelocation method which, through an
- format-specific implementation, will apply the relocation to the loaded
- section memory.
- Next, RuntimeDyld::resolveRelocations iterates through the list of
- sections and for each section iterates through a list of relocations that
- have been saved which reference that symbol and call resolveRelocation for
- each entry in this list. The relocation list here is a list of
- relocations for which the symbol associated with the relocation is located
- in the section associated with the list. Each of these locations will
- have a target location at which the relocation will be applied that is
- likely located in a different section.
- .. image:: MCJIT-resolve-relocations.png
-
- Once relocations have been applied as described above, MCJIT calls
- RuntimeDyld::getEHFrameSection, and if a non-zero result is returned
- passes the section data to the memory manager's registerEHFrames method.
- This allows the memory manager to call any desired target-specific
- functions, such as registering the EH frame information with a debugger.
- Finally, MCJIT calls the memory manager's finalizeMemory method. In this
- method, the memory manager will invalidate the target code cache, if
- necessary, and apply final permissions to the memory pages it has
- allocated for code and data memory.
|