|
- ============================================================
- Extending LLVM: Adding instructions, intrinsics, types, etc.
- ============================================================
- Introduction and Warning
- ========================
- During the course of using LLVM, you may wish to customize it for your research
- project or for experimentation. At this point, you may realize that you need to
- add something to LLVM, whether it be a new fundamental type, a new intrinsic
- function, or a whole new instruction.
- When you come to this realization, stop and think. Do you really need to extend
- LLVM? Is it a new fundamental capability that LLVM does not support in its
- current incarnation, or can it be synthesized from existing LLVM
- elements? If you are not sure, ask on the `LLVM-dev
- <http://lists.llvm.org/mailman/listinfo/llvm-dev>`_ list. The reason is that
- extending LLVM gets involved: you need to update every pass
- that you intend to use with your extension, and there are *many* LLVM analyses
- and transformations, so it may be quite a bit of work.
- Adding an `intrinsic function`_ is far easier than adding an
- instruction, and is transparent to optimization passes. If your added
- functionality can be expressed as a function call, an intrinsic function is the
- method of choice for LLVM extension.
- Before you invest a significant amount of effort into a non-trivial extension,
- **ask on the list** if what you are looking to do can be done with
- already-existing infrastructure, or if maybe someone else is already working on
- it. You will save yourself a lot of time and effort by doing so.
- .. _intrinsic function:
- Adding a new intrinsic function
- ===============================
- Adding a new intrinsic function to LLVM is much easier than adding a new
- instruction. Almost all extensions to LLVM should start as an intrinsic
- function and then be turned into an instruction if warranted.
- #. ``llvm/docs/LangRef.rst``:
- Document the intrinsic. Decide whether it is code generator specific and
- what the restrictions are. Talk to other people about it so that you are
- sure it's a good idea.
- #. ``llvm/include/llvm/IR/Intrinsics*.td``:
- Add an entry for your intrinsic. Describe its memory access
- characteristics for optimization (this controls whether it will be
- DCE'd, CSE'd, etc.). If any arguments need to be immediates, these
- must be indicated with the ``ImmArg`` property. Note that any intrinsic
- using one of the ``llvm_any*_ty`` types for an argument or return
- type will be deemed by ``tblgen`` as overloaded and the
- corresponding suffix will be required on the intrinsic's name.
- #. ``llvm/lib/Analysis/ConstantFolding.cpp``:
- If it is possible to constant fold your intrinsic, add support for it in the
- ``canConstantFoldCallTo`` and ``ConstantFoldCall`` functions.
- #. ``llvm/test/*``:
- Add test cases for your intrinsic to the test suite.
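- The constant-folding step above pairs a predicate ("can this call be folded
- at all?") with a function that computes the folded value. The following is a
- minimal, self-contained sketch of that split for a byte-swap operation. It
- does not use the LLVM API: ``ToyIntrinsic``, ``canFoldToy``, and ``foldToy``
- are hypothetical names chosen for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an intrinsic ID; this is not an LLVM enum.
enum class ToyIntrinsic { Bswap32, ReadCycleCounter };

// Predicate half: reports whether a call to the intrinsic can be folded at
// all. An operation whose result is only known at run time returns false.
bool canFoldToy(ToyIntrinsic ID) { return ID == ToyIntrinsic::Bswap32; }

// Folding half: computes the result for a constant operand. Returns true and
// sets Result on success; returns false and leaves Result untouched otherwise.
bool foldToy(ToyIntrinsic ID, uint32_t Operand, uint32_t &Result) {
  if (!canFoldToy(ID))
    return false;
  switch (ID) {
  case ToyIntrinsic::Bswap32:
    // Reverse the order of the four bytes.
    Result = ((Operand & 0x000000FFu) << 24) | ((Operand & 0x0000FF00u) << 8) |
             ((Operand & 0x00FF0000u) >> 8) | ((Operand & 0xFF000000u) >> 24);
    return true;
  default:
    return false;
  }
}
```

- Keeping the predicate separate lets a caller check foldability cheaply
- before it gathers constant operands.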
- Once the intrinsic has been added to the system, you must add code generator
- support for it. Generally, this means the following:
- Add support to the .td file for the target(s) of your choice in
- ``lib/Target/*/*.td``.
- This is usually a matter of adding a pattern to the .td file that matches the
- intrinsic, though it may also require adding the instructions you want to
- generate. There are lots of examples in the PowerPC and X86 backends
- to follow.
- Adding a new SelectionDAG node
- ==============================
- As with intrinsics, adding a new SelectionDAG node to LLVM is much easier than
- adding a new instruction. New nodes are often added to help represent
- instructions common to many targets. These nodes often map to an LLVM
- instruction (add, sub) or intrinsic (byteswap, population count). In other
- cases, new nodes have been added to allow many targets to perform a common task
- (converting between floating point and integer representation) or capture more
- complicated behavior in a single node (rotate).
- #. ``include/llvm/CodeGen/ISDOpcodes.h``:
- Add an enum value for the new SelectionDAG node.
- #. ``lib/CodeGen/SelectionDAG/SelectionDAG.cpp``:
- Add code to print the node to ``getOperationName``. If your new node can be
- evaluated at compile time when given constant arguments (such as an add of a
- constant with another constant), find the ``getNode`` method that takes the
- appropriate number of arguments, and add a case for your node to the switch
- statement that performs constant folding for nodes that take the same number
- of arguments as your new node.
- #. ``lib/CodeGen/SelectionDAG/LegalizeDAG.cpp``:
- Add code to `legalize, promote, and expand
- <CodeGenerator.html#selectiondag_legalize>`_ the node as necessary. At a
- minimum, you will need to add a case statement for your node in
- ``LegalizeOp`` that calls ``LegalizeOp`` on the node's operands and returns a
- new node if any of the operands changed as a result of being legalized. It
- is likely that not all targets supported by the SelectionDAG framework will
- natively support the new node. In this case, you must also add code in your
- node's case statement in ``LegalizeOp`` to Expand your node into simpler,
- legal operations. The case for ``ISD::UREM`` for expanding a remainder into
- a divide, multiply, and a subtract is a good example.
- #. ``lib/CodeGen/SelectionDAG/LegalizeDAG.cpp``:
- If targets may support the new node being added only at certain sizes, you
- will also need to add code to your node's case statement in ``LegalizeOp``
- to Promote your node's operands to a larger size, and perform the correct
- operation. You will also need to add code to ``PromoteOp`` to do this as
- well. For a good example, see ``ISD::BSWAP``, which promotes its operand to
- a wider size, performs the byteswap, and then shifts the correct bytes right
- to emulate the narrower byteswap in the wider type.
- #. ``lib/CodeGen/SelectionDAG/LegalizeDAG.cpp``:
- Add a case for your node in ``ExpandOp`` to teach the legalizer how to
- perform the action represented by the new node on a value that has been split
- into high and low halves. This case will be used to support your node with a
- 64-bit operand on a 32-bit target.
- #. ``lib/CodeGen/SelectionDAG/DAGCombiner.cpp``:
- If your node can be combined with itself, or with other existing nodes, in a
- peephole-like fashion, add a visit function for it, and call that function
- from the combiner's main dispatch code. There are several good examples of
- simple combines; ``visitFABS`` and ``visitSRL`` are good starting places.
- #. ``lib/Target/PowerPC/PPCISelLowering.cpp``:
- Each target has an implementation of the ``TargetLowering`` class, usually in
- its own file (although some targets include it in the same file as the
- DAGToDAGISel). The default behavior for a target is to assume that your new
- node is legal for all types that are legal for that target. If this target
- does not natively support your node, then tell the target to either Promote
- it (if it is supported at a larger type) or Expand it. This will cause the
- code you wrote in ``LegalizeOp`` above to decompose your new node into other
- legal nodes for this target.
- #. ``lib/Target/TargetSelectionDAG.td``:
- Most current targets supported by LLVM generate code using the DAGToDAG
- method, where SelectionDAG nodes are pattern matched to target-specific
- nodes, which represent individual instructions. In order for the targets to
- match an instruction to your new node, you must add a def for that node to
- the list in this file, with the appropriate type constraints. Look at
- ``add``, ``bswap``, and ``fadd`` for examples.
- #. ``lib/Target/PowerPC/PPCInstrInfo.td``:
- Each target has a tablegen file that describes the target's instruction set.
- For targets that use the DAGToDAG instruction selection framework, add a
- pattern for your new node that uses one or more target nodes. Documentation
- for this is a bit sparse right now, but there are several decent examples.
- See the patterns for ``rotl`` in ``PPCInstrInfo.td``.
- #. TODO: document complex patterns.
- #. ``llvm/test/CodeGen/*``:
- Add test cases for your new node to the test suite.
- ``llvm/test/CodeGen/X86/bswap.ll`` is a good example.
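- The three legalization strategies described in the steps above (Expand,
- Promote, and splitting a value into halves) can be illustrated with plain
- integer arithmetic. The sketch below is not SelectionDAG code; the function
- names and the ``Halves`` struct are hypothetical and only show the arithmetic
- that such legalization code has to reproduce.

```cpp
#include <cassert>
#include <cstdint>

// Expand: rewrite an unsigned remainder as a divide, a multiply, and a
// subtract, for a target that has no native remainder operation.
uint32_t expandURem(uint32_t A, uint32_t B) {
  assert(B != 0 && "remainder by zero is undefined");
  return A - (A / B) * B;
}

// A 32-bit byte swap, standing in for the wider operation a target supports.
uint32_t bswap32(uint32_t X) {
  return ((X & 0x000000FFu) << 24) | ((X & 0x0000FF00u) << 8) |
         ((X & 0x00FF0000u) >> 8) | ((X & 0xFF000000u) >> 24);
}

// Promote: widen the 16-bit operand to 32 bits, swap in the wider type, then
// shift the interesting bytes back down into the low 16 bits.
uint16_t promoteBswap16(uint16_t X) {
  uint32_t Wide = X; // zero-extend
  return static_cast<uint16_t>(bswap32(Wide) >> 16);
}

// A 64-bit value split into 32-bit low and high halves.
struct Halves {
  uint32_t Lo;
  uint32_t Hi;
};

// Split: perform a 64-bit add using only 32-bit operations by propagating
// the carry from the low half into the high half.
Halves expandAdd64(Halves A, Halves B) {
  Halves R;
  R.Lo = A.Lo + B.Lo;
  uint32_t Carry = R.Lo < A.Lo ? 1u : 0u; // unsigned wraparound means carry
  R.Hi = A.Hi + B.Hi + Carry;
  return R;
}
```

- For instance, ``promoteBswap16(0x1234)`` widens to ``0x00001234``, swaps to
- ``0x34120000``, and shifts right by 16 to give ``0x3412``.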
- Adding a new instruction
- ========================
- .. warning::
- Adding instructions changes the bitcode format, and it will take some effort
- to maintain compatibility with the previous version. Only add an instruction
- if it is absolutely necessary.
- #. ``llvm/include/llvm/IR/Instruction.def``:
- add a number for your instruction and an enum name
- #. ``llvm/include/llvm/IR/Instructions.h``:
- add a definition for the class that will represent your instruction
- #. ``llvm/include/llvm/IR/InstVisitor.h``:
- add a prototype for a visitor to your new instruction type
- #. ``llvm/lib/AsmParser/LLLexer.cpp``:
- add a new token to parse your instruction from an assembly text file
- #. ``llvm/lib/AsmParser/LLParser.cpp``:
- add the grammar describing how your instruction is read and what it
- constructs as a result
- #. ``llvm/lib/Bitcode/Reader/BitcodeReader.cpp``:
- add a case for your instruction and how it will be parsed from bitcode
- #. ``llvm/lib/Bitcode/Writer/BitcodeWriter.cpp``:
- add a case for your instruction and how it will be written to bitcode
- #. ``llvm/lib/IR/Instruction.cpp``:
- add a case for how your instruction will be printed out to assembly
- #. ``llvm/lib/IR/Instructions.cpp``:
- implement the class you defined in ``llvm/include/llvm/IR/Instructions.h``
- #. Test your instruction
- #. ``llvm/lib/Target/*``:
- add support for your instruction to code generators, or add a lowering pass.
- #. ``llvm/test/*``:
- add your test cases to the test suite.
- Also, you need to implement (or modify) any analyses or passes that you want to
- understand this new instruction.
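- Several of the steps above (numbering the instruction, naming it, printing
- it, and visiting it) revolve around keeping one opcode list in sync across
- many places. The sketch below shows one common way to do that with a macro
- table. It is a toy model under stated assumptions, not LLVM's actual
- definitions: ``TOY_INSTRUCTIONS``, ``ToyOpcode``, ``ToyVisitor``, and
- ``MyNewOp`` are all hypothetical names.

```cpp
#include <cassert>
#include <cstring>

// Toy opcode table; each entry is (number, name). The last entry plays the
// role of a newly added instruction.
#define TOY_INSTRUCTIONS(HANDLE)                                               \
  HANDLE(1, Add)                                                               \
  HANDLE(2, Sub)                                                               \
  HANDLE(3, MyNewOp)

// Generate the opcode enum from the table.
enum ToyOpcode {
#define TOY_ENUM(NUM, NAME) NAME = NUM,
  TOY_INSTRUCTIONS(TOY_ENUM)
#undef TOY_ENUM
};

// Generate the opcode printer from the same table.
const char *getToyOpcodeName(unsigned Opcode) {
  switch (Opcode) {
#define TOY_CASE(NUM, NAME) case NUM: return #NAME;
    TOY_INSTRUCTIONS(TOY_CASE)
#undef TOY_CASE
  default:
    return "<invalid>";
  }
}

// A CRTP visitor: every opcode gets a do-nothing default handler, and a
// subclass overrides only the handlers it cares about.
template <typename Derived> struct ToyVisitor {
#define TOY_DEFAULT(NUM, NAME) void visit##NAME() {}
  TOY_INSTRUCTIONS(TOY_DEFAULT)
#undef TOY_DEFAULT

  // Dispatch on the opcode to the most-derived handler.
  void visit(ToyOpcode Op) {
    Derived *Self = static_cast<Derived *>(this);
    switch (Op) {
#define TOY_DISPATCH(NUM, NAME) case NAME: Self->visit##NAME(); break;
      TOY_INSTRUCTIONS(TOY_DISPATCH)
#undef TOY_DISPATCH
    }
  }
};

// Example pass that only understands the new instruction.
struct CountNewOps : ToyVisitor<CountNewOps> {
  int Count = 0;
  void visitMyNewOp() { ++Count; }
};
```

- With this layout, adding one line to the table updates the enum, the
- printer, and the visitor dispatch together, while existing visitors keep
- compiling because the new handler has a default.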
- Adding a new type
- =================
- .. warning::
- Adding new types changes the bitcode format, and will break compatibility with
- currently-existing LLVM installations. Only add new types if it is absolutely
- necessary.
- Adding a fundamental type
- -------------------------
- #. ``llvm/include/llvm/IR/Type.h``:
- add enum for the new type; add static ``Type*`` for this type
- #. ``llvm/lib/IR/Type.cpp`` and ``llvm/lib/IR/ValueTypes.cpp``:
- add mapping from ``TypeID`` => ``Type*``; initialize the static ``Type*``
- #. ``llvm/include/llvm-c/Core.h`` and ``llvm/lib/IR/Core.cpp``:
- add enum ``LLVMTypeKind`` and modify
- ``LLVMTypeKind LLVMGetTypeKind(LLVMTypeRef Ty)`` for the new type
- #. ``llvm/lib/AsmParser/LLLexer.cpp``:
- add ability to parse in the type from text assembly
- #. ``llvm/lib/AsmParser/LLParser.cpp``:
- add a token for that type
- #. ``llvm/lib/Bitcode/Writer/BitcodeWriter.cpp``:
- modify ``static void WriteTypeTable(const ValueEnumerator &VE,
- BitstreamWriter &Stream)`` to serialize your type
- #. ``llvm/lib/Bitcode/Reader/BitcodeReader.cpp``:
- modify ``bool BitcodeReader::ParseTypeTable()`` to read your data type
- #. ``include/llvm/Bitcode/LLVMBitCodes.h``:
- add enum ``TypeCodes`` for the new type
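- The steps above touch four things that must stay consistent: the in-memory
- type enum, the shared ``Type*`` object, the serialized type code, and the
- reader and writer that translate between them. The following self-contained
- sketch models that relationship. All names and numeric codes in it
- (``ToyTypeID``, ``ToyTypeCode``, ``CODE_NEW``, and so on) are hypothetical
- and are not taken from the LLVM sources.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// In-memory type IDs; ToyNewTyID stands for the newly added type.
enum ToyTypeID { ToyVoidTyID, ToyFloatTyID, ToyNewTyID };

// Serialized codes, kept separate from ToyTypeID so the in-memory enum can
// change without changing the file format. The values are made up.
enum ToyTypeCode : uint8_t { CODE_VOID = 2, CODE_FLOAT = 3, CODE_NEW = 40 };

struct ToyType {
  ToyTypeID ID;
};

// Mapping from type ID to the single shared object for that type.
const ToyType *getToyType(ToyTypeID ID) {
  static const ToyType VoidTy{ToyVoidTyID};
  static const ToyType FloatTy{ToyFloatTyID};
  static const ToyType NewTy{ToyNewTyID};
  switch (ID) {
  case ToyVoidTyID:
    return &VoidTy;
  case ToyFloatTyID:
    return &FloatTy;
  case ToyNewTyID:
    return &NewTy;
  }
  return nullptr;
}

// Writer: appends the serialized code for the type to the output buffer.
void writeToyType(const ToyType *Ty, std::vector<uint8_t> &Out) {
  switch (Ty->ID) {
  case ToyVoidTyID:
    Out.push_back(CODE_VOID);
    break;
  case ToyFloatTyID:
    Out.push_back(CODE_FLOAT);
    break;
  case ToyNewTyID:
    Out.push_back(CODE_NEW);
    break;
  }
}

// Reader: maps a serialized code back to the shared type object, or returns
// nullptr for a code it does not know (as an older reader would for CODE_NEW).
const ToyType *readToyType(uint8_t Code) {
  switch (Code) {
  case CODE_VOID:
    return getToyType(ToyVoidTyID);
  case CODE_FLOAT:
    return getToyType(ToyFloatTyID);
  case CODE_NEW:
    return getToyType(ToyNewTyID);
  default:
    return nullptr;
  }
}
```

- A reader built without the ``CODE_NEW`` case would reject the file, which is
- the compatibility break the warning at the top of this section describes.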
- Adding a derived type
- ---------------------
- #. ``llvm/include/llvm/IR/Type.h``:
- add enum for the new type; add a forward declaration of the type also
- #. ``llvm/include/llvm/IR/DerivedTypes.h``:
- add a new class to represent the new type in the hierarchy; add a forward
- declaration to the TypeMap value type
- #. ``llvm/lib/IR/Type.cpp`` and ``llvm/lib/IR/ValueTypes.cpp``:
- add support for the derived type, notably ``enum TypeID`` and the ``is`` and ``get`` methods.
- #. ``llvm/include/llvm-c/Core.h`` and ``llvm/lib/IR/Core.cpp``:
- add enum ``LLVMTypeKind`` and modify
- ``LLVMTypeKind LLVMGetTypeKind(LLVMTypeRef Ty)`` for the new type
- #. ``llvm/lib/AsmParser/LLLexer.cpp``:
- modify ``lltok::Kind LLLexer::LexIdentifier()`` to add ability to
- parse in the type from text assembly
- #. ``llvm/lib/Bitcode/Writer/BitcodeWriter.cpp``:
- modify ``static void WriteTypeTable(const ValueEnumerator &VE,
- BitstreamWriter &Stream)`` to serialize your type
- #. ``llvm/lib/Bitcode/Reader/BitcodeReader.cpp``:
- modify ``bool BitcodeReader::ParseTypeTable()`` to read your data type
- #. ``include/llvm/Bitcode/LLVMBitCodes.h``:
- add enum ``TypeCodes`` for the new type
- #. ``llvm/lib/IR/AsmWriter.cpp``:
- modify ``void TypePrinting::print(Type *Ty, raw_ostream &OS)``
- to output the new derived type
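- A derived type differs from a fundamental one in that it carries
- parameters, so it needs a uniquing ``get`` method, an ``is`` query, and a
- printer case. The sketch below models those three pieces with a vector-like
- type. It is an illustration only: ``ToyTy``, ``ToyVectorTy``, and
- ``printToyTy`` are hypothetical names, and the uniquing map is an assumption
- about one reasonable design, not a description of LLVM's implementation.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <utility>

// Kinds of types; ToyVectorKind stands for the newly added derived type.
enum ToyKind { ToyOtherKind, ToyVectorKind };

class ToyTy {
  ToyKind Kind;

protected:
  explicit ToyTy(ToyKind K) : Kind(K) {}

public:
  // The "is" query that other code uses to recognize the new type.
  bool isVectorTy() const { return Kind == ToyVectorKind; }
};

class ToyVectorTy : public ToyTy {
  unsigned NumElts;
  unsigned EltBits;

  ToyVectorTy(unsigned N, unsigned Bits)
      : ToyTy(ToyVectorKind), NumElts(N), EltBits(Bits) {}

public:
  unsigned getNumElements() const { return NumElts; }
  unsigned getElementBits() const { return EltBits; }

  // Uniquing "get": the same parameters always yield the same object, so
  // types can be compared by pointer.
  static const ToyVectorTy *get(unsigned N, unsigned Bits) {
    static std::map<std::pair<unsigned, unsigned>,
                    std::unique_ptr<ToyVectorTy>> Pool;
    std::unique_ptr<ToyVectorTy> &Slot = Pool[std::make_pair(N, Bits)];
    if (!Slot)
      Slot.reset(new ToyVectorTy(N, Bits));
    return Slot.get();
  }
};

// Printer: needs a case for every derived type it should be able to output.
std::string printToyTy(const ToyTy *Ty) {
  std::ostringstream OS;
  if (Ty->isVectorTy()) {
    const ToyVectorTy *VT = static_cast<const ToyVectorTy *>(Ty);
    OS << '<' << VT->getNumElements() << " x i" << VT->getElementBits() << '>';
  } else {
    OS << "<unknown>";
  }
  return OS.str();
}
```

- Because ``get`` returns one object per parameter combination, two requests
- for a 4-element vector of 32-bit integers compare equal by pointer.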
|