ExtendedIntegerResults.txt 4.9 KB

//===----------------------------------------------------------------------===//
// Representing sign/zero extension of function results
//===----------------------------------------------------------------------===//

Mar 25, 2009 - Initial Revision

Most ABIs specify that functions which return small integers do so in a
specific integer GPR. This is an efficient way to go, but raises the question:
if the returned value is smaller than the register, what do the high bits hold?
There are three (interesting) possible answers: undefined, zero extended, or
sign extended. The number of bits in question depends on the data type that
the front-end is referencing (typically i1/i8/i16/i32).

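To make the three answers concrete, here is a small C sketch that models a
32-bit return register holding an 8-bit result. The 32-bit register width and
the helper names (zext8_to_32, sext8_to_32, undef_high8) are assumptions made
up for illustration; they do not come from any ABI document or from LLVM.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only: model a 32-bit return register holding an i8 result.
   None of these names come from an ABI document or from LLVM. */

/* Zero extended: the high 24 bits are all zero. */
uint32_t zext8_to_32(uint8_t v) {
    return (uint32_t)v;
}

/* Sign extended: the high 24 bits are copies of bit 7. */
uint32_t sext8_to_32(uint8_t v) {
    return (v & 0x80u) ? (0xFFFFFF00u | v) : (uint32_t)v;
}

/* Undefined: the high 24 bits are whatever was left in the register
   (passed in here as `stale`), so only the low 8 bits are meaningful. */
uint32_t undef_high8(uint8_t v, uint32_t stale) {
    return (stale & 0xFFFFFF00u) | v;
}

/* The same i8 bit pattern 0x80 under each convention. */
void demo_high_bits(void) {
    assert(zext8_to_32(0x80) == 0x00000080u);
    assert(sext8_to_32(0x80) == 0xFFFFFF80u);
    assert((undef_high8(0x80, 0xDEADBEEFu) & 0xFFu) == 0x80u);
}
```

A consumer that reads the whole register sees three different values for the
same i8, which is why it matters which convention the ABI picks.
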
Knowing the answer to this is important for two reasons: 1) we want to be able
to implement the ABI correctly. If we need to sign extend the result according
to the ABI, we really really do need to do this to preserve correctness. 2)
this information is often useful for optimization purposes, and we want the
mid-level optimizers to be able to process this (e.g. eliminate redundant
extensions).

For example, let's pretend that X86 requires the callee to properly extend the
result of a return (I'm not sure this is the case, but the argument doesn't
depend on this). Given this, we should compile this:

int a();
short b() { return a(); }

into:

_b:
        subl    $12, %esp
        call    L_a$stub
        addl    $12, %esp
        cwtl
        ret

An optimization example is that we should be able to eliminate the explicit
sign extension in this example:

short y();
int z() {
  return ((int)y() << 16) >> 16;
}

_z:
        subl    $12, %esp
        call    _y
        ;;  movswl %ax, %eax   -> not needed because eax is already sext'd
        addl    $12, %esp
        ret

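The reason the movswl can go away is that the shift pair in z does nothing
more than sign extend the low 16 bits, and doing that to a value that is
already sign extended changes nothing. A minimal C sketch of that reasoning
follows; sext_low16 is a hypothetical helper (not an LLVM function) and it
assumes a 32-bit int.

```c
#include <assert.h>
#include <stdint.h>

/* Portable stand-in for ((int)x << 16) >> 16 on a 32-bit int: keep the low
   16 bits of x and replicate bit 15 into bits 16..31. Written without
   shifting negative values, which C does not fully define. */
int32_t sext_low16(int32_t x) {
    uint32_t low = (uint32_t)x & 0xFFFFu;
    /* Flip bit 15, then subtract its weight: 0x8000..0xFFFF become
       -32768..-1 and 0x0000..0x7FFF stay as they are. */
    return (int32_t)(low ^ 0x8000u) - 0x8000;
}

void demo_redundant_sext(void) {
    /* Values that already look like a sign-extended short are unchanged,
       so re-extending them is wasted work. */
    assert(sext_low16(-1) == -1);
    assert(sext_low16(-32768) == -32768);
    assert(sext_low16(32767) == 32767);
    /* A value with junk in the high half is changed, so the extension can
       only be dropped when the callee's guarantee is actually known. */
    assert(sext_low16(0x12348000) == -32768);
}
```
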
//===----------------------------------------------------------------------===//
// What we have right now.
//===----------------------------------------------------------------------===//

Currently, these sorts of things are modelled by compiling a function to return
the small type, with a signext/zeroext marker attached. For example, we compile
z into:

define i32 @z() nounwind {
entry:
        %0 = tail call signext i16 (...)* @y() nounwind
        %1 = sext i16 %0 to i32
        ret i32 %1
}

and b into:

define signext i16 @b() nounwind {
entry:
        %0 = tail call i32 (...)* @a() nounwind         ; <i32> [#uses=1]
        %retval12 = trunc i32 %0 to i16                 ; <i16> [#uses=1]
        ret i16 %retval12
}

This has some problems: 1) the actual precise semantics are really poorly
defined (see PR3779). 2) some targets might want the caller to extend, some
might want the callee to extend. 3) the mid-level optimizer doesn't know the
size of the GPR, so it doesn't know that %0 is sign extended up to 32 bits
here, and even if it did, it could not eliminate the sext. 4) the code
generator has historically assumed that the result is extended to i32, which is
a problem on PIC16 (and is also probably wrong on Alpha and other 64-bit
targets).

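To illustrate why "extended to i32" is not enough on a 64-bit target, here is
a rough C sketch. It assumes a hypothetical 64-bit return register and models
the unspecified upper half as zero; reg_sext_to_32 and reg_sext_to_64 are
made-up names, and no claim is made about what Alpha or any other real target
does.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 64-bit return register where an i16 result was sign extended
   only to 32 bits. The upper 32 bits are modelled as zero here, though a
   real target could leave anything in them. */
uint64_t reg_sext_to_32(int16_t v) {
    return (uint64_t)(uint32_t)(int32_t)v;
}

/* The same result sign extended across the full 64-bit register. */
uint64_t reg_sext_to_64(int16_t v) {
    return (uint64_t)(int64_t)v;
}

void demo_width_mismatch(void) {
    /* Non-negative values look the same either way... */
    assert(reg_sext_to_32(5) == reg_sext_to_64(5));
    /* ...but a negative value does not, so a consumer that reads the full
       register gets a different answer depending on the assumed width. */
    assert(reg_sext_to_32(-1) == 0x00000000FFFFFFFFull);
    assert(reg_sext_to_64(-1) == 0xFFFFFFFFFFFFFFFFull);
}
```
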
//===----------------------------------------------------------------------===//
// The proposal
//===----------------------------------------------------------------------===//

I suggest that we have the front-end fully lower out the ABI issues here to
LLVM IR. This makes it 100% explicit what is going on and means that there is
no cause for confusion. For example, the cases above should compile into:

define i32 @z() nounwind {
entry:
        %0 = tail call i32 (...)* @y() nounwind
        %1 = trunc i32 %0 to i16
        %2 = sext i16 %1 to i32
        ret i32 %2
}

define i32 @b() nounwind {
entry:
        %0 = tail call i32 (...)* @a() nounwind
        %retval12 = trunc i32 %0 to i16
        %tmp = sext i16 %retval12 to i32
        ret i32 %tmp
}

In this model, no functions will return an i1/i8/i16 (and on an x86-64 target
that extends results to i64, no i32). This solves the ambiguity issue, allows us
to fully describe all possible ABIs, and now allows the optimizers to reason
about and eliminate these extensions.

The one thing that is missing is the ability for the front-end and optimizer to
specify/infer the guarantees provided by the ABI to allow other optimizations.
For example, in the y/z case, since y is known to return a sign extended value,
the trunc/sext in z should be eliminable.

This can be done by introducing new sext/zext attributes which mean "I know
that the result of the function is sign extended at least N bits". Given this,
and given that it is stuck on the y function, the mid-level optimizer could
easily eliminate the extensions etc. with existing functionality.

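One way to picture how the optimizer could use such an attribute is as a count
of known sign bits. The sketch below is only an assumption about how that
reasoning might go, not a description of an existing LLVM interface:
num_sign_bits and trunc_sext_is_noop are hypothetical names, and the code works
on concrete 32-bit values where a real optimizer would work on facts known at
compile time.

```c
#include <assert.h>
#include <stdint.h>

/* Number of high-order bits of x that equal its sign bit, counting the sign
   bit itself, so the result is in the range 1..32. */
unsigned num_sign_bits(int32_t x) {
    uint32_t u = (uint32_t)x;
    uint32_t sign = u >> 31;
    unsigned n = 1;
    for (int bit = 30; bit >= 0 && ((u >> bit) & 1u) == sign; --bit)
        ++n;
    return n;
}

/* A trunc to `narrow` bits followed by a sext back to 32 bits leaves x
   unchanged exactly when bits narrow-1..31 all equal the sign bit.
   `narrow` is expected to be in 1..32. */
int trunc_sext_is_noop(int32_t x, unsigned narrow) {
    return num_sign_bits(x) >= 32u - narrow + 1u;
}

void demo_sign_bit_count(void) {
    /* A value sign extended from i16 has at least 17 sign bits. */
    assert(num_sign_bits(-32768) == 17);
    assert(trunc_sext_is_noop(-32768, 16));
    /* 32768 does not fit in an i16, so the pair would change it. */
    assert(num_sign_bits(32768) == 16);
    assert(!trunc_sext_is_noop(32768, 16));
}
```

If the attribute on y promises enough sign bits, the check succeeds for every
possible return value, and the trunc/sext pair in z can be deleted outright.
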
The major disadvantage of doing this sort of thing is that it makes the ABI
lowering stuff even more explicit in the front-end, and that we would like to
eventually move to having the code generator do more of this work. However,
the sad truth of the matter is that a) this is unlikely to happen anytime in
the near future, and b) this is no worse than what we have now with the
existing attributes.

C compilers fundamentally have to reason about the target in many ways.
This is ugly and horrible, but a fact of life.