=============================================================
How To Build Clang and LLVM with Profile-Guided Optimizations
=============================================================

Introduction
============

PGO (Profile-Guided Optimization) allows your compiler to better optimize code
for how it actually runs. Users report that applying this to Clang and LLVM can
decrease overall compile time by 20%.

This guide walks you through how to build Clang with PGO, though it also applies
to other subprojects, such as LLD.

Using the script
================

We have a script at ``utils/collect_and_build_with_pgo.py``. This script
requires a checkout of LLVM, Clang, and compiler-rt. Despite the name, it
performs four clean builds of Clang, so it can take a while to run to
completion. Please see the script's ``--help`` for more information on how to
run it, and the different options available to you. If you want to get the most
out of PGO for a particular use-case (e.g. compiling a specific large piece of
software), please do read the section below on 'benchmark' selection.

Please note that this script is only tested on a few Linux distros. Patches to
add support for other platforms, as always, are highly appreciated. :)

This script also supports a ``--dry-run`` option, which causes it to print
important commands instead of running them.

Selecting 'benchmarks'
======================

PGO does best when the profiles gathered represent how the user plans to use the
compiler. Notably, highly accurate profiles of llc building x86_64 code aren't
incredibly helpful if you're going to be targeting ARM.

By default, the script above does two things to get solid coverage. It:

- runs all of Clang and LLVM's lit tests, and
- uses the instrumented Clang to build Clang, LLVM, and all of the other
  LLVM subprojects available to it.

Together, these should give you:

- solid coverage of building C++,
- good coverage of building C,
- great coverage of running optimizations,
- great coverage of the backend for your host's architecture, and
- some coverage of other architectures (if other arches are supported backends).

Altogether, this should cover a diverse set of uses for Clang and LLVM. If you
have very specific needs (e.g. your compiler is meant to compile a large browser
for four different platforms, or similar), you may want to do something else.
This is configurable in the script itself.

Building Clang with PGO
=======================

If you prefer not to use the script, this section briefly goes over how to
build Clang/LLVM with PGO by hand.

First, you should have at least LLVM, Clang, and compiler-rt checked out
locally.

Next, at a high level, you're going to need to do the following:

1. Build a standard Release Clang and the relevant libclang_rt.profile library.

2. Build Clang using the Clang you built above, but with instrumentation.

3. Use the instrumented Clang to generate profiles, which consists of two steps:

   - Running the instrumented Clang/LLVM/lld/etc. on tasks that represent how
     users will use said tools.

   - Using a tool to convert the "raw" profiles generated above into a single,
     final PGO profile.

4. Build a final release Clang (along with whatever other binaries you need)
   using the profile collected from your benchmark.

In more detailed steps:

1. Configure a Clang build as you normally would. It's highly recommended that
   you use the Release configuration for this, since it will be used to build
   another Clang. Because you need Clang and supporting libraries, you'll want
   to build the ``all`` target (e.g. ``ninja all`` or ``make -j4 all``).

2. Configure a Clang build as above, but add the following CMake args:

   - ``-DLLVM_BUILD_INSTRUMENTED=IR`` -- This causes us to build everything
     with instrumentation.
   - ``-DLLVM_BUILD_RUNTIME=No`` -- A few projects have bad interactions when
     built with profiling, and aren't necessary to build. This flag turns them
     off.
   - ``-DCMAKE_C_COMPILER=/path/to/stage1/clang`` -- Use the Clang we built in
     step 1.
   - ``-DCMAKE_CXX_COMPILER=/path/to/stage1/clang++`` -- Same as above.

   In this build directory, you simply need to build the ``clang`` target (and
   whatever supporting tooling your benchmark requires).

3. As mentioned above, this has two steps: gathering profile data, and then
   massaging it into a useful form:

   a. Build your benchmark using the Clang generated in step 2. The 'standard'
      benchmark recommended is to run ``check-clang`` and ``check-llvm`` in your
      instrumented Clang's build directory, and to do a full build of Clang/LLVM
      using your instrumented Clang. So, create yet another build directory,
      with the following CMake arguments:

      - ``-DCMAKE_C_COMPILER=/path/to/stage2/clang`` -- Use the Clang we built in
        step 2.
      - ``-DCMAKE_CXX_COMPILER=/path/to/stage2/clang++`` -- Same as above.

      If your users are fans of debug info, you may want to consider using
      ``-DCMAKE_BUILD_TYPE=RelWithDebInfo`` instead of
      ``-DCMAKE_BUILD_TYPE=Release``. This will grant better coverage of
      debug info pieces of clang, but will take longer to complete and will
      result in a much larger build directory.

      It's recommended to build the ``all`` target with your instrumented Clang,
      since more coverage is often better.

   b. You should now have a few ``*.profraw`` files in
      ``/path/to/stage2/profiles/``. You need to merge these using
      ``llvm-profdata``, even if you only have one, because the merge step also
      transforms the raw profile into actual profile data. This can be done with
      ``/path/to/stage1/llvm-profdata merge
      -output=/path/to/output/profdata.prof /path/to/stage2/profiles/*.profraw``.

4. Now, build your final, PGO-optimized Clang. To do this, you'll want to pass
   the following additional arguments to CMake.

   - ``-DLLVM_PROFDATA_FILE=/path/to/output/profdata.prof`` -- Use the PGO
     profile from the previous step.
   - ``-DCMAKE_C_COMPILER=/path/to/stage1/clang`` -- Use the Clang we built in
     step 1.
   - ``-DCMAKE_CXX_COMPILER=/path/to/stage1/clang++`` -- Same as above.

   From here, you can build whatever targets you need.

   .. note::

      You may see warnings about a mismatched profile in the build output. These
      are generally harmless. To silence them, you can add
      ``-DCMAKE_C_FLAGS='-Wno-backend-plugin'
      -DCMAKE_CXX_FLAGS='-Wno-backend-plugin'`` to your CMake invocation.
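
The four steps above can be strung together roughly as follows. This is only a
sketch under stated assumptions, not the project's official procedure: the
``LLVM_SRC`` and ``OUT`` variables, the build directory names, the ``run`` and
``configure`` helpers, the use of the Ninja generator, and the assumption that
each build tree places its tools under ``bin/`` are all illustrative. You will
likely also need to pass whatever options you normally use to enable Clang and
compiler-rt. By default the sketch only prints each command; set ``DRY_RUN=0``
to actually execute them.

.. code-block:: bash

   #!/usr/bin/env bash
   set -euo pipefail

   # All paths are placeholders (assumptions); point them at your own tree.
   LLVM_SRC=${LLVM_SRC:-/path/to/llvm}
   OUT=${OUT:-/path/to/builds}
   STAGE1=$OUT/stage1        # step 1: plain Release build
   STAGE2=$OUT/stage2        # step 2: instrumented build
   BENCH=$OUT/benchmark      # step 3a: build driven by the instrumented Clang
   FINAL=$OUT/final          # step 4: PGO-optimized build
   PROFDATA=$OUT/profdata.prof

   # Hypothetical helper: print commands unless DRY_RUN=0 is set.
   run() {
     if [ "${DRY_RUN:-1}" = 1 ]; then
       printf '%s\n' "$*"
     else
       "$@"
     fi
   }

   # Hypothetical helper: configure <build-dir> [extra CMake args...]
   configure() {
     local dir=$1
     shift
     run cmake -G Ninja -S "$LLVM_SRC" -B "$dir" -DCMAKE_BUILD_TYPE=Release "$@"
   }

   # Step 1: standard Release build of everything.
   configure "$STAGE1"
   run ninja -C "$STAGE1" all

   # Step 2: instrumented Clang, compiled with the step 1 Clang.
   configure "$STAGE2" \
     -DLLVM_BUILD_INSTRUMENTED=IR \
     -DLLVM_BUILD_RUNTIME=No \
     -DCMAKE_C_COMPILER="$STAGE1/bin/clang" \
     -DCMAKE_CXX_COMPILER="$STAGE1/bin/clang++"
   run ninja -C "$STAGE2" clang

   # Step 3a: gather raw profiles from the tests and from a full build.
   run ninja -C "$STAGE2" check-clang check-llvm
   configure "$BENCH" \
     -DCMAKE_C_COMPILER="$STAGE2/bin/clang" \
     -DCMAKE_CXX_COMPILER="$STAGE2/bin/clang++"
   run ninja -C "$BENCH" all

   # Step 3b: merge the raw profiles into a single profile.
   run "$STAGE1/bin/llvm-profdata" merge -output="$PROFDATA" \
     "$STAGE2"/profiles/*.profraw

   # Step 4: final build that consumes the merged profile.
   configure "$FINAL" \
     -DLLVM_PROFDATA_FILE="$PROFDATA" \
     -DCMAKE_C_COMPILER="$STAGE1/bin/clang" \
     -DCMAKE_CXX_COMPILER="$STAGE1/bin/clang++"
   run ninja -C "$FINAL" clang

Printing by default makes it easy to review the exact CMake invocations before
committing to several full builds; once they look right, rerun with
``DRY_RUN=0`` and your real ``LLVM_SRC`` and ``OUT`` values.
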

Congrats! You now have a Clang built with profile-guided optimizations, and you
can delete all but the final build directory if you'd like.

If this worked well for you and you plan on doing it often, there's a slight
optimization that can be made: LLVM and Clang have a tool called tblgen that's
built and run during the build process. While it's potentially nice to build
this for coverage as part of step 3, none of your other builds should benefit
from building it. You can pass the CMake options
``-DCLANG_TABLEGEN=/path/to/stage1/bin/clang-tblgen
-DLLVM_TABLEGEN=/path/to/stage1/bin/llvm-tblgen`` to steps 2 and onward to avoid
these useless rebuilds.
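
One way to avoid retyping the two tblgen options is to keep them in a shell
array and append it to each later CMake invocation. The ``STAGE1`` variable and
the ``TBLGEN_ARGS`` array name below are illustrative assumptions, and the
sketch only prints the arguments rather than running CMake.

.. code-block:: bash

   #!/usr/bin/env bash
   set -euo pipefail

   # Placeholder path (assumption): the build directory from step 1.
   STAGE1=${STAGE1:-/path/to/stage1}

   # Reuse the tblgen binaries that step 1 already built.
   TBLGEN_ARGS=(
     -DCLANG_TABLEGEN="$STAGE1/bin/clang-tblgen"
     -DLLVM_TABLEGEN="$STAGE1/bin/llvm-tblgen"
   )

   # Print the extra arguments, one per line. To use them, append
   # "${TBLGEN_ARGS[@]}" to the cmake command line of steps 2 and onward.
   printf '%s\n' "${TBLGEN_ARGS[@]}"
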