===============
LLVM Extensions
===============

.. contents::
   :local:

.. toctree::
   :hidden:

Introduction
============

This document describes extensions to tools and formats LLVM seeks compatibility
with.

General Assembly Syntax
=======================

C99-style Hexadecimal Floating-point Constants
----------------------------------------------

LLVM's assemblers allow floating-point constants to be written in C99's
hexadecimal format instead of decimal if desired.

.. code-block:: gas

  .section .data
  .float 0x1c2.2ap3

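
As a quick check on the notation, the sketch below decodes the constant from the
example above. Python's ``float.fromhex`` is used purely for illustration; it is
not part of the assembler. The ``p3`` suffix is a binary exponent, so the
hexadecimal mantissa is scaled by ``2**3``.

.. code-block:: python

  # Decode the C99 hexadecimal constant used in the example above.
  value = float.fromhex("0x1c2.2ap3")

  # The same value computed by hand: two fractional hex digits are worth
  # 1/256 each, and the "p3" exponent multiplies the mantissa by 2**3.
  by_hand = (0x1C2 + 0x2A / 256) * 2**3

  print(value)  # 3601.3125
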
Machine-specific Assembly Syntax
================================

X86/COFF-Dependent
------------------

Relocations
^^^^^^^^^^^

The following additional relocation types are supported:

**@IMGREL** (AT&T syntax only) generates an image-relative relocation that
corresponds to the COFF relocation types ``IMAGE_REL_I386_DIR32NB`` (32-bit) or
``IMAGE_REL_AMD64_ADDR32NB`` (64-bit).

.. code-block:: text

  .text
  fun:
    mov foo@IMGREL(%ebx, %ecx, 4), %eax

  .section .pdata
    .long fun@IMGREL
    .long (fun@imgrel + 0x3F)
    .long $unwind$fun@imgrel

**.secrel32** generates a relocation that corresponds to the COFF relocation
types ``IMAGE_REL_I386_SECREL`` (32-bit) or ``IMAGE_REL_AMD64_SECREL`` (64-bit).

**.secidx** generates a relocation that holds the index of the section that
contains the target. It corresponds to the COFF relocation types
``IMAGE_REL_I386_SECTION`` (32-bit) or ``IMAGE_REL_AMD64_SECTION`` (64-bit).

.. code-block:: none

  .section .debug$S,"rn"
    .long 4
    .long 242
    .long 40
    .secrel32 _function_name + 0
    .secidx   _function_name
    ...

``.linkonce`` Directive
^^^^^^^^^^^^^^^^^^^^^^^

Syntax:

   ``.linkonce [ comdat type ]``

Supported COMDAT types:

``discard``
   Discards duplicate sections with the same COMDAT symbol. This is the default
   if no type is specified.

``one_only``
   If the symbol is defined multiple times, the linker issues an error.

``same_size``
   Duplicates are discarded, but the linker issues an error if any have
   different sizes.

``same_contents``
   Duplicates are discarded, but the linker issues an error if any duplicates
   do not have exactly the same content.

``largest``
   Links the largest section from among the duplicates.

``newest``
   Links the newest section from among the duplicates.

.. code-block:: gas

  .section .text$foo
  .linkonce
    ...

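
The selection rules above can be sketched as a small resolver. This is only an
illustration of the semantics listed in this section, not how any particular
linker is implemented: ``Section`` and ``select_comdat`` are hypothetical names,
``discard`` is modelled as "keep the first candidate seen", and ``newest`` is
left out because the sketch does not model timestamps.

.. code-block:: python

  from dataclasses import dataclass


  @dataclass
  class Section:
      """Hypothetical stand-in for one candidate section of a COMDAT symbol."""
      contents: bytes


  def select_comdat(kind: str, candidates: list[Section]) -> Section:
      """Pick the surviving section for one COMDAT symbol, or raise on conflict."""
      if not candidates:
          raise ValueError("no candidates")
      first = candidates[0]
      if kind == "discard":
          # Assumption: any duplicate may be kept; this sketch keeps the first.
          return first
      if kind == "one_only":
          if len(candidates) > 1:
              raise ValueError("symbol defined multiple times")
          return first
      if kind == "same_size":
          if any(len(c.contents) != len(first.contents) for c in candidates):
              raise ValueError("duplicate sections differ in size")
          return first
      if kind == "same_contents":
          if any(c.contents != first.contents for c in candidates):
              raise ValueError("duplicate sections differ in content")
          return first
      if kind == "largest":
          return max(candidates, key=lambda c: len(c.contents))
      raise ValueError(f"unsupported selection type: {kind}")

For two candidates of different sizes, ``discard`` keeps the first one,
``largest`` keeps the bigger one, and ``same_size`` reports an error.
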
``.section`` Directive
^^^^^^^^^^^^^^^^^^^^^^

MC supports passing the information in ``.linkonce`` at the end of
``.section``. For example, these two code fragments are equivalent:

.. code-block:: gas

  .section secName, "dr", discard, "Symbol1"
  .globl Symbol1
  Symbol1:
  .long 1

.. code-block:: gas

  .section secName, "dr"
  .linkonce discard
  .globl Symbol1
  Symbol1:
  .long 1

Note that in the combined form the COMDAT symbol is explicit. This
extension exists to support multiple sections with the same name in
different COMDATs:

.. code-block:: gas

  .section secName, "dr", discard, "Symbol1"
  .globl Symbol1
  Symbol1:
  .long 1

  .section secName, "dr", discard, "Symbol2"
  .globl Symbol2
  Symbol2:
  .long 1

In addition to the types allowed with ``.linkonce``, ``.section`` also accepts
``associative``. An associative section is linked only if a certain other
COMDAT section is linked. This other section is indicated by the COMDAT symbol
in this directive. It can be any symbol defined in the associated section, but
is usually the associated section's COMDAT symbol.

The following restrictions apply to the associated section:

1. It must be a COMDAT section.
2. It cannot be another associative COMDAT section.

In the following example the symbol ``sym`` is the COMDAT symbol of ``.foo``
and ``.bar`` is associated to ``.foo``.

.. code-block:: gas

  .section .foo,"bw",discard, "sym"
  .section .bar,"rd",associative, "sym"

MC supports these flags in the COFF ``.section`` directive:

- ``b``: BSS section (``IMAGE_SCN_CNT_UNINITIALIZED_DATA``)
- ``d``: Data section (``IMAGE_SCN_CNT_INITIALIZED_DATA``)
- ``n``: Section is not loaded (``IMAGE_SCN_LNK_REMOVE``)
- ``r``: Read-only
- ``s``: Shared section
- ``w``: Writable
- ``x``: Executable section
- ``y``: Not readable
- ``D``: Discardable (``IMAGE_SCN_MEM_DISCARDABLE``)

These flags are all compatible with gas, with the exception of the ``D`` flag,
which GNU as does not support. For gas compatibility, sections with a name
starting with ".debug" are implicitly discardable.

ARM64/COFF-Dependent
--------------------

Relocations
^^^^^^^^^^^

The following additional symbol variants are supported:

**:secrel_lo12:** generates a relocation that corresponds to the COFF relocation
types ``IMAGE_REL_ARM64_SECREL_LOW12A`` or ``IMAGE_REL_ARM64_SECREL_LOW12L``.

**:secrel_hi12:** generates a relocation that corresponds to the COFF relocation
type ``IMAGE_REL_ARM64_SECREL_HIGH12A``.

.. code-block:: gas

    add x0, x0, :secrel_hi12:symbol
    ldr x0, [x0, :secrel_lo12:symbol]

    add x1, x1, :secrel_hi12:symbol
    add x1, x1, :secrel_lo12:symbol
    ...

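
To illustrate how the two variants combine, the sketch below splits a
section-relative offset into the two 12-bit immediates. It assumes that
``:secrel_hi12:`` covers bits 12 to 23 of the offset and ``:secrel_lo12:``
covers bits 0 to 11; ``split_secrel`` is a hypothetical helper, not an LLVM API.

.. code-block:: python

  def split_secrel(offset: int) -> tuple[int, int]:
      """Split a section-relative offset into (hi12, lo12) immediates."""
      if not 0 <= offset < (1 << 24):
          raise ValueError("offset does not fit in 24 bits")
      hi12 = (offset >> 12) & 0xFFF  # assumed: bits 12..23
      lo12 = offset & 0xFFF          # assumed: bits 0..11
      return hi12, lo12


  hi12, lo12 = split_secrel(0x12345)
  # Adding the shifted high half and the low half restores the offset, which
  # is what the add/add and add/ldr pairs in the example rely on.
  recombined = (hi12 << 12) + lo12
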
ELF-Dependent
-------------

``.section`` Directive
^^^^^^^^^^^^^^^^^^^^^^

In order to support creating multiple sections with the same name and comdat,
it is possible to add a unique number at the end of the ``.section`` directive.
For example, the following code creates two sections named ``.text``.

.. code-block:: gas

  .section .text,"ax",@progbits,unique,1
  nop

  .section .text,"ax",@progbits,unique,2
  nop

The unique number is not present in the resulting object at all. It is just used
in the assembler to differentiate the sections.

The ``o`` flag is mapped to ``SHF_LINK_ORDER``. If it is present, a symbol
must be given that identifies the section to be placed in the ``sh_link``
field.

.. code-block:: gas

  .section .foo,"a",@progbits
  .Ltmp:
  .section .bar,"ao",@progbits,.Ltmp

which is equivalent to just

.. code-block:: gas

  .section .foo,"a",@progbits
  .section .bar,"ao",@progbits,.foo

``.linker-options`` Section (linker options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In order to support passing linker options from the frontend to the linker, a
special section of type ``SHT_LLVM_LINKER_OPTIONS`` is used (usually named
``.linker-options``, though the name is not significant as the section is
identified by its type). The contents of this section are a simple pair-wise
encoding of directives for consideration by the linker. The strings are encoded
as standard null-terminated UTF-8 strings. They are emitted inline to avoid
having the linker traverse the object file to retrieve the value. The linker is
permitted to not honour the option and instead provide a warning/error to the
user that the requested option was not honoured.

The section has type ``SHT_LLVM_LINKER_OPTIONS`` and has the ``SHF_EXCLUDE``
flag to ensure that the section is treated as opaque by linkers which do not
support the feature and will not be emitted into the final linked binary.

This is equivalent to the following raw assembly:

.. code-block:: gas

  .section ".linker-options","e",@llvm_linker_options
  .asciz "option 1"
  .asciz "value 1"
  .asciz "option 2"
  .asciz "value 2"

The following directives are specified:

- lib

  The parameter identifies a library to be linked against. The library will
  be looked up in the default and any specified library search paths
  (specified to this point).

- libpath

  The parameter identifies an additional library search path to be considered
  when looking up libraries after the inclusion of this option.

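
The pair-wise encoding described above can be sketched as follows. This is a
minimal illustration of the byte layout only; ``encode_linker_options`` and
``decode_linker_options`` are hypothetical helpers, and the library name and
search path used in the usage note are made-up example values.

.. code-block:: python

  def encode_linker_options(pairs):
      """Encode (directive, value) pairs as consecutive NUL-terminated UTF-8 strings."""
      out = bytearray()
      for directive, value in pairs:
          out += directive.encode("utf-8") + b"\0"
          out += value.encode("utf-8") + b"\0"
      return bytes(out)


  def decode_linker_options(data):
      """Decode section contents back into a list of (directive, value) pairs."""
      if data and not data.endswith(b"\0"):
          raise ValueError("section contents must end with a NUL terminator")
      # The final split element is the empty remainder after the last NUL.
      strings = [s.decode("utf-8") for s in data.split(b"\0")[:-1]]
      if len(strings) % 2 != 0:
          raise ValueError("directives and values must come in pairs")
      return list(zip(strings[0::2], strings[1::2]))

For instance, encoding ``[("lib", "m"), ("libpath", "/opt/lib")]`` produces the
same bytes as four ``.asciz`` directives in the order ``lib``, ``m``,
``libpath``, ``/opt/lib``, and decoding those bytes returns the original pairs.
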
``SHT_LLVM_DEPENDENT_LIBRARIES`` Section (Dependent Libraries)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This section contains strings specifying libraries to be added to the link by
the linker.

The section should be consumed by the linker and not written to the output.

The strings are encoded as standard null-terminated UTF-8 strings.

For example:

.. code-block:: gas

  .section ".deplibs","MS",@llvm_dependent_libraries,1
  .asciz "library specifier 1"
  .asciz "library specifier 2"

The interpretation of the library specifiers is defined by the consuming linker.

``SHT_LLVM_CALL_GRAPH_PROFILE`` Section (Call Graph Profile)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This section is used to pass a call graph profile to the linker which can be
used to optimize the placement of sections. It contains a sequence of
(from symbol, to symbol, weight) tuples.

It shall have a type of ``SHT_LLVM_CALL_GRAPH_PROFILE`` (0x6fff4c02) and shall
have the ``SHF_EXCLUDE`` flag set. Its ``sh_link`` member shall hold the section
header index of the associated symbol table, and its ``sh_entsize`` shall be
16. It should be named ``.llvm.call-graph-profile``.

The contents of the section shall be a sequence of ``Elf_CGProfile`` entries.

.. code-block:: c

  typedef struct {
    Elf_Word cgp_from;
    Elf_Word cgp_to;
    Elf_Xword cgp_weight;
  } Elf_CGProfile;

cgp_from
  The symbol index of the source of the edge.

cgp_to
  The symbol index of the destination of the edge.

cgp_weight
  The weight of the edge.

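
The entry layout can be checked with a short sketch. It assumes ``Elf_Word`` is
a 32-bit and ``Elf_Xword`` a 64-bit unsigned integer, and it assumes a
little-endian target; the helper names and the symbol indexes in the usage note
are illustrative only.

.. code-block:: python

  import struct

  # cgp_from (Elf_Word), cgp_to (Elf_Word), cgp_weight (Elf_Xword).
  # "<" selects little-endian byte order (an assumption) with no padding.
  CG_PROFILE_ENTRY = struct.Struct("<IIQ")


  def pack_cg_profile(edges):
      """Serialize (from_index, to_index, weight) tuples into section contents."""
      return b"".join(CG_PROFILE_ENTRY.pack(f, t, w) for f, t, w in edges)


  def unpack_cg_profile(data):
      """Parse section contents back into (from_index, to_index, weight) tuples."""
      return list(CG_PROFILE_ENTRY.iter_unpack(data))

``CG_PROFILE_ENTRY.size`` is 16, which matches the required ``sh_entsize``.
Packing the edges ``[(1, 2, 42), (2, 3, 7)]`` yields 32 bytes, and unpacking
those bytes returns the same two tuples.
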
This is represented in assembly as:

.. code-block:: gas

  .cg_profile from, to, 42

``.cg_profile`` directives are processed at the end of the file. It is an error
if either ``from`` or ``to`` is an undefined temporary symbol. If either symbol
is a temporary symbol, then the section symbol is used instead. If either
symbol is undefined, then that symbol is defined as if ``.weak symbol`` had been
written at the end of the file. This forces the symbol to show up in the symbol
table.

``SHT_LLVM_ADDRSIG`` Section (address-significance table)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This section is used to mark symbols as address-significant, i.e. the address
of the symbol is used in a comparison or leaks outside the translation unit. It
has the same meaning as the absence of the LLVM attributes ``unnamed_addr``
and ``local_unnamed_addr``.

Any sections referred to by symbols that are not marked as address-significant
in any object file may be safely merged by a linker without breaking the
address uniqueness guarantee provided by the C and C++ language standards.

The contents of the section are a sequence of ULEB128-encoded integers
referring to the symbol table indexes of the address-significant symbols.

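
As an illustration of that encoding, the sketch below produces the ULEB128
bytes for a list of symbol table indexes. ``encode_uleb128`` and
``encode_addrsig`` are hypothetical helper names, and the indexes in the usage
note are made-up values.

.. code-block:: python

  def encode_uleb128(value: int) -> bytes:
      """Encode a non-negative integer as ULEB128 (7 bits per byte, low bits first)."""
      if value < 0:
          raise ValueError("ULEB128 encodes non-negative integers only")
      out = bytearray()
      while True:
          byte = value & 0x7F
          value >>= 7
          if value:
              out.append(byte | 0x80)  # more bytes follow
          else:
              out.append(byte)         # final byte has the high bit clear
              return bytes(out)


  def encode_addrsig(symbol_indexes):
      """Concatenate the ULEB128 encodings of the given symbol table indexes."""
      return b"".join(encode_uleb128(i) for i in symbol_indexes)

Indexes below 128 take one byte each, so ``encode_addrsig([1, 300])`` is the
three bytes ``01 ac 02``.
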
There are two associated assembly directives:

.. code-block:: gas

  .addrsig

This instructs the assembler to emit an address-significance table. Without
this directive, all symbols are considered address-significant.

.. code-block:: gas

  .addrsig_sym sym

This marks ``sym`` as address-significant.

``SHT_LLVM_SYMPART`` Section (symbol partition specification)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This section is used to mark symbols with the `partition`_ that they
belong to. An ``.llvm_sympart`` section consists of a null-terminated string
specifying the name of the partition followed by a relocation referring to
the symbol that belongs to the partition. It may be constructed as follows:

.. code-block:: gas

  .section ".llvm_sympart","",@llvm_sympart
  .asciz "libpartition.so"
  .word symbol_in_partition

.. _partition: https://lld.llvm.org/Partitions.html

CodeView-Dependent
------------------

``.cv_file`` Directive
^^^^^^^^^^^^^^^^^^^^^^

Syntax:
  ``.cv_file`` *FileNumber FileName* [ *checksum* ] [ *checksumkind* ]

``.cv_func_id`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^

Introduces a function ID that can be used with ``.cv_loc``.

Syntax:
  ``.cv_func_id`` *FunctionId*

``.cv_inline_site_id`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Introduces a function ID that can be used with ``.cv_loc``. Includes
``inlined at`` source location information for use in the line table of the
caller, whether the caller is a real function or another inlined call site.

Syntax:
  ``.cv_inline_site_id`` *FunctionId* ``within`` *Function* ``inlined_at`` *FileNumber Line* [ *Column* ]

``.cv_loc`` Directive
^^^^^^^^^^^^^^^^^^^^^

The first number is a file number, which must have been previously assigned
with a ``.file`` directive. The second number is the line number and,
optionally, the third number is a column position (zero if not specified). The
remaining optional items are ``.loc`` sub-directives.

Syntax:
  ``.cv_loc`` *FunctionId FileNumber* [ *Line* ] [ *Column* ] [ *prologue_end* ] [ ``is_stmt`` *value* ]

``.cv_linetable`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
  ``.cv_linetable`` *FunctionId* ``,`` *FunctionStart* ``,`` *FunctionEnd*

``.cv_inline_linetable`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
  ``.cv_inline_linetable`` *PrimaryFunctionId* ``,`` *FileNumber Line FunctionStart FunctionEnd*

``.cv_def_range`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^

The *GapStart* and *GapEnd* options may be repeated as needed.

Syntax:
  ``.cv_def_range`` *RangeStart RangeEnd* [ *GapStart GapEnd* ] ``,`` *bytes*

``.cv_stringtable`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``.cv_filechecksums`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``.cv_filechecksumoffset`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
  ``.cv_filechecksumoffset`` *FileNumber*

``.cv_fpo_data`` Directive
^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
  ``.cv_fpo_data`` *procsym*

Target Specific Behaviour
=========================

X86
---

Relocations
^^^^^^^^^^^

``@ABS8`` can be applied to symbols which appear as immediate operands to
instructions that have an 8-bit immediate form for that operand. It causes
the assembler to use the 8-bit form and an 8-bit relocation (e.g. ``R_386_8``
or ``R_X86_64_8``) for the symbol.

For example:

.. code-block:: gas

  cmpq $foo@ABS8, %rdi

This causes the assembler to select the form of the 64-bit ``cmpq`` instruction
that takes an 8-bit immediate operand that is sign extended to 64 bits, as
opposed to ``cmpq $foo, %rdi`` which takes a 32-bit immediate operand. This
is also not the same as ``cmpb $foo, %dil``, which is an 8-bit comparison.

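
The sign extension mentioned above determines which symbol values are usable
with ``@ABS8``. The sketch below models it in Python as an illustration only;
``sign_extend_imm8`` and ``fits_abs8`` are hypothetical helpers, and the check
assumes the symbol value is interpreted as a signed integer.

.. code-block:: python

  def sign_extend_imm8(byte: int, bits: int = 64) -> int:
      """Return the bits-wide unsigned value obtained by sign-extending an 8-bit immediate."""
      if not 0 <= byte <= 0xFF:
          raise ValueError("immediate must fit in one byte")
      signed = byte - 0x100 if byte & 0x80 else byte
      return signed & ((1 << bits) - 1)


  def fits_abs8(value: int) -> bool:
      """True if a signed value survives encoding as a sign-extended 8-bit immediate."""
      return -128 <= value <= 127

An immediate byte of ``0x7f`` compares against 127, while ``0x80`` is extended
to ``0xffffffffffffff80`` (that is, -128), so a symbol whose value is 128 would
not compare as expected.
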
Windows on ARM
--------------

Stack Probe Emission
^^^^^^^^^^^^^^^^^^^^

The reference implementation (Microsoft Visual Studio 2012) emits stack probes
in the following fashion:

.. code-block:: gas

  movw r4, #constant
  bl __chkstk
  sub.w sp, sp, r4

However, the direct branch limits the reach of the call to 32 MiB (±16 MiB). In
order to accommodate larger binaries, LLVM supports the use of
``-mcode-model=large`` to allow a 4 GiB range via a slight deviation. It will
generate an indirect jump as follows:

.. code-block:: gas

  movw r4, #constant
  movw r12, :lower16:__chkstk
  movt r12, :upper16:__chkstk
  blx r12
  sub.w sp, sp, r4

Variable Length Arrays
^^^^^^^^^^^^^^^^^^^^^^

The reference implementation (Microsoft Visual Studio 2012) does not permit the
emission of Variable Length Arrays (VLAs).

The Windows ARM Itanium ABI extends the base ABI by adding support for emitting
a dynamic stack allocation. When emitting a variable stack allocation, a call
to ``__chkstk`` is emitted unconditionally to ensure that guard pages are set up
properly. The emission of this stack probe is handled similarly to the
standard stack probe emission.

The MSVC environment does not emit code for VLAs currently.

Windows on ARM64
----------------

Stack Probe Emission
^^^^^^^^^^^^^^^^^^^^

The reference implementation (Microsoft Visual Studio 2017) emits stack probes
in the following fashion:

.. code-block:: gas

  mov x15, #constant
  bl __chkstk
  sub sp, sp, x15, lsl #4

However, the direct branch limits the reach of the call to 256 MiB (±128 MiB).
In order to accommodate larger binaries, LLVM supports the use of
``-mcode-model=large`` to allow an 8 GiB (±4 GiB) range via a slight deviation.
It will generate an indirect jump as follows:

.. code-block:: gas

  mov x15, #constant
  adrp x16, __chkstk
  add x16, x16, :lo12:__chkstk
  blr x16
  sub sp, sp, x15, lsl #4

  387. sub sp, sp, x15, lsl #4