- ================================
- Fuzzing LLVM libraries and tools
- ================================
- .. contents::
- :local:
- :depth: 2
- Introduction
- ============
- The LLVM tree includes a number of fuzzers for various components. These are
- built on top of :doc:`LibFuzzer <LibFuzzer>`. To build and run these fuzzers,
- see :ref:`building-fuzzers`.
- Available Fuzzers
- =================
- clang-fuzzer
- ------------
- A |generic fuzzer| that tries to compile textual input as C++ code. Some of the
- bugs this fuzzer has reported are `on bugzilla`__ and `on OSS Fuzz's
- tracker`__.
- __ https://llvm.org/pr23057
- __ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+clang-fuzzer
- clang-proto-fuzzer
- ------------------
- A |protobuf fuzzer| that compiles valid C++ programs generated from a protobuf
- class that describes a subset of the C++ language.
- This fuzzer accepts clang command line options after ``-ignore_remaining_args=1``.
- For example, the following command will fuzz clang with a higher optimization
- level:
- .. code-block:: shell
- % bin/clang-proto-fuzzer <corpus-dir> -ignore_remaining_args=1 -O3
- clang-format-fuzzer
- -------------------
- A |generic fuzzer| that runs clang-format_ on C++ text fragments. Some of the
- bugs this fuzzer has reported are `on bugzilla`__
- and `on OSS Fuzz's tracker`__.
- .. _clang-format: https://clang.llvm.org/docs/ClangFormat.html
- __ https://llvm.org/pr23052
- __ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+clang-format-fuzzer
- llvm-as-fuzzer
- --------------
- A |generic fuzzer| that tries to parse text as :doc:`LLVM assembly <LangRef>`.
- Some of the bugs this fuzzer has reported are `on bugzilla`__.
- __ https://llvm.org/pr24639
- llvm-dwarfdump-fuzzer
- ---------------------
- A |generic fuzzer| that interprets inputs as object files and runs
- :doc:`llvm-dwarfdump <CommandGuide/llvm-dwarfdump>` on them. Some of the bugs
- this fuzzer has reported are `on OSS Fuzz's tracker`__.
- __ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+llvm-dwarfdump-fuzzer
- llvm-demangle-fuzzer
- ---------------------
- A |generic fuzzer| for the Itanium demangler used in various LLVM tools.
- ``__cxa_demangle`` has already been fuzzed extensively, so this fuzzer targets
- LLVM's implementation of the same function.
- llvm-isel-fuzzer
- ----------------
- A |LLVM IR fuzzer| aimed at finding bugs in instruction selection.
- This fuzzer accepts flags after ``-ignore_remaining_args=1``. The flags match
- those of :doc:`llc <CommandGuide/llc>` and the triple is required. For example,
- the following command would fuzz AArch64 with :doc:`GlobalISel`:
- .. code-block:: shell
- % bin/llvm-isel-fuzzer <corpus-dir> -ignore_remaining_args=1 -mtriple aarch64 -global-isel -O0
- Some flags can also be specified in the binary name itself in order to support
- OSS Fuzz, which has trouble with required arguments. To do this, you can copy
- or move ``llvm-isel-fuzzer`` to ``llvm-isel-fuzzer--x-y-z``, separating options
- from the binary name using "--". The valid options are architecture names
- (``aarch64``, ``x86_64``), optimization levels (``O0``, ``O2``), or specific
- keywords, like ``gisel`` for enabling global instruction selection. In this
- mode, the same example could be run like so:
- .. code-block:: shell
- % bin/llvm-isel-fuzzer--aarch64-O0-gisel <corpus-dir>
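The naming scheme described above can be illustrated with a short sketch. This
is not the parsing code LLVM actually uses; ``embeddedOptions`` is a
hypothetical helper, and it assumes it is given only the file name of the
binary, not a full path.

.. code-block:: c++

   #include <string>
   #include <vector>

   // Hypothetical helper, not LLVM's implementation: splits a binary name such
   // as "llvm-isel-fuzzer--aarch64-O0-gisel" into its embedded options.
   std::vector<std::string> embeddedOptions(const std::string &ExecName) {
     std::vector<std::string> Opts;
     // Everything after the first "--" is treated as the option list.
     std::string::size_type Pos = ExecName.find("--");
     if (Pos == std::string::npos)
       return Opts;
     std::string Rest = ExecName.substr(Pos + 2);
     // Individual options are separated by single dashes.
     std::string::size_type Start = 0;
     while (Start <= Rest.size()) {
       std::string::size_type End = Rest.find('-', Start);
       if (End == std::string::npos)
         End = Rest.size();
       if (End > Start) // Skip empty pieces, e.g. from a trailing dash.
         Opts.push_back(Rest.substr(Start, End - Start));
       Start = End + 1;
     }
     return Opts;
   }

For ``llvm-isel-fuzzer--aarch64-O0-gisel`` this yields the three options
``aarch64``, ``O0``, and ``gisel``, which a fuzzer could then map to the
corresponding ``llc``-style flags.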
- llvm-opt-fuzzer
- ---------------
- A |LLVM IR fuzzer| aimed at finding bugs in optimization passes.
- It receives an optimization pipeline and runs it on each fuzzer input.
- The interface of this fuzzer almost directly mirrors ``llvm-isel-fuzzer``. Both
- ``mtriple`` and ``passes`` arguments are required. Passes are specified in a
- format suitable for the new pass manager. You can find some documentation about
- this format in the doxygen for ``PassBuilder::parsePassPipeline``.
- .. code-block:: shell
- % bin/llvm-opt-fuzzer <corpus-dir> -ignore_remaining_args=1 -mtriple x86_64 -passes instcombine
- As with ``llvm-isel-fuzzer``, the arguments for some predefined configurations
- can be embedded directly in the binary file name:
- .. code-block:: shell
- % bin/llvm-opt-fuzzer--x86_64-instcombine <corpus-dir>
- llvm-mc-assemble-fuzzer
- -----------------------
- A |generic fuzzer| that fuzzes the MC layer's assemblers by treating inputs as
- target-specific assembly.
- Note that this fuzzer has an unusual command line interface which is not fully
- compatible with all of libFuzzer's features. Fuzzer arguments must be passed
- after ``--fuzzer-args``, and any ``llc`` flags must use two dashes. For
- example, to fuzz the AArch64 assembler you might use the following command:
- .. code-block:: console
- llvm-mc-assemble-fuzzer --triple=aarch64-linux-gnu --fuzzer-args -max_len=4
- This scheme will likely change in the future.
- llvm-mc-disassemble-fuzzer
- --------------------------
- A |generic fuzzer| that fuzzes the MC layer's disassemblers by treating inputs
- as assembled binary data.
- Note that this fuzzer has an unusual command line interface which is not fully
- compatible with all of libFuzzer's features. See the notes above about
- ``llvm-mc-assemble-fuzzer`` for details.
- .. |generic fuzzer| replace:: :ref:`generic fuzzer <fuzzing-llvm-generic>`
- .. |protobuf fuzzer|
- replace:: :ref:`libprotobuf-mutator based fuzzer <fuzzing-llvm-protobuf>`
- .. |LLVM IR fuzzer|
- replace:: :ref:`structured LLVM IR fuzzer <fuzzing-llvm-ir>`
- Mutators and Input Generators
- =============================
- The inputs for a fuzz target are generated via random mutations of a
- :ref:`corpus <libfuzzer-corpus>`. There are a few options for the kinds of
- mutations that a fuzzer in LLVM might want.
- .. _fuzzing-llvm-generic:
- Generic Random Fuzzing
- ----------------------
- The most basic form of input mutation is to use the built-in mutators of
- libFuzzer. These simply treat the input corpus as a bag of bits and make random
- mutations. This type of fuzzer is good at stressing the surface layers of a
- program, such as lexers, parsers, or binary protocols.
- Some of the in-tree fuzzers that use this type of mutator are `clang-fuzzer`_,
- `clang-format-fuzzer`_, `llvm-as-fuzzer`_, `llvm-dwarfdump-fuzzer`_,
- `llvm-mc-assemble-fuzzer`_, and `llvm-mc-disassemble-fuzzer`_.
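As a rough illustration of what a generic fuzz target looks like, the sketch
below uses the standard libFuzzer entry point ``LLVMFuzzerTestOneInput``. The
``parseKeyValue`` function is a hypothetical stand-in for the lexer or parser
under test and is not taken from any in-tree fuzzer.

.. code-block:: c++

   #include <cstddef>
   #include <cstdint>
   #include <string>

   // Hypothetical code under test: accepts "key=value" with a non-empty key.
   static bool parseKeyValue(const std::string &Text) {
     std::string::size_type Eq = Text.find('=');
     return Eq != std::string::npos && Eq > 0;
   }

   // libFuzzer calls this entry point once for every generated input.
   extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
     // Treat the input as an opaque byte string; it is not NUL-terminated.
     std::string Text(reinterpret_cast<const char *>(Data), Size);
     // The result is ignored: a crash or sanitizer report is the failure signal.
     parseKeyValue(Text);
     return 0;
   }

The target never interprets the structure of the input itself, which is why
this style mostly exercises the early, input-validating layers of a program.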
- .. _fuzzing-llvm-protobuf:
- Structured Fuzzing using ``libprotobuf-mutator``
- ------------------------------------------------
- We can use libprotobuf-mutator_ in order to perform structured fuzzing and
- stress deeper layers of programs. This works by defining a protobuf class that
- translates arbitrary data into structurally interesting input. Specifically, we
- use this to work with a subset of the C++ language and perform mutations that
- produce valid C++ programs in order to exercise parts of clang that are more
- interesting than parser error handling.
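The idea can be sketched without protobuf at all. In the example below, the
``BinaryOpProgram`` struct and the ``toSource`` function are hypothetical and
much simpler than the real protobuf class, but they show the principle: the
mutator changes a structured description, and a converter renders it as source
text that always parses.

.. code-block:: c++

   #include <string>

   // Hypothetical structured description of a tiny program: one function that
   // returns the result of a binary operation on two integer constants.
   struct BinaryOpProgram {
     enum OpKind { Add, Sub, Mul } Op;
     int Lhs;
     int Rhs;
   };

   // Render the description as C++ source. Any value of the struct yields
   // syntactically valid code, so mutations never end in a parse error.
   std::string toSource(const BinaryOpProgram &P) {
     const char *OpText = P.Op == BinaryOpProgram::Add   ? "+"
                          : P.Op == BinaryOpProgram::Sub ? "-"
                                                         : "*";
     return "int f() { return (" + std::to_string(P.Lhs) + ") " + OpText +
            " (" + std::to_string(P.Rhs) + "); }";
   }

A fuzzer built this way mutates the fields of the description rather than the
text, so the compiler spends its time past the parser.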
- To build this kind of fuzzer you need `protobuf`_ and its dependencies
- installed, and you need to specify some extra flags when configuring the build
- with :doc:`CMake <CMake>`. For example, `clang-proto-fuzzer`_ can be enabled by
- adding ``-DCLANG_ENABLE_PROTO_FUZZER=ON`` to the flags described in
- :ref:`building-fuzzers`.
- The only in-tree fuzzer that uses ``libprotobuf-mutator`` today is
- `clang-proto-fuzzer`_.
- .. _libprotobuf-mutator: https://github.com/google/libprotobuf-mutator
- .. _protobuf: https://github.com/google/protobuf
- .. _fuzzing-llvm-ir:
- Structured Fuzzing of LLVM IR
- -----------------------------
- We also use a more direct form of structured fuzzing for fuzzers that take
- :doc:`LLVM IR <LangRef>` as input. This is achieved through the ``FuzzMutate``
- library, which was `discussed at EuroLLVM 2017`_.
- The ``FuzzMutate`` library is used to structurally fuzz backends in
- `llvm-isel-fuzzer`_ and optimization passes in `llvm-opt-fuzzer`_.
- .. _discussed at EuroLLVM 2017: https://www.youtube.com/watch?v=UBbQ_s6hNgg
- Building and Running
- ====================
- .. _building-fuzzers:
- Configuring LLVM to Build Fuzzers
- ---------------------------------
- Fuzzers will be built and linked to libFuzzer by default as long as you build
- LLVM with sanitizer coverage enabled. You would typically also enable at least
- one sanitizer to find bugs faster. The most common way to build the fuzzers is
- by adding the following two flags to your CMake invocation:
- ``-DLLVM_USE_SANITIZER=Address -DLLVM_USE_SANITIZE_COVERAGE=On``.
- .. note:: If you have ``compiler-rt`` checked out in an LLVM tree when building
- with sanitizers, you'll want to specify ``-DLLVM_BUILD_RUNTIME=Off``
- to avoid building the sanitizers themselves with sanitizers enabled.
- .. note:: You may run into issues if you build with BFD ld, which is the
- default linker on many Unix systems. These issues are being tracked
- in https://llvm.org/PR34636.
- Continuously Running and Finding Bugs
- -------------------------------------
- There used to be a public buildbot running LLVM fuzzers continuously. While
- it did find issues, it had no good way to report them in an actionable form.
- Because of this, we're moving towards relying on `OSS Fuzz`_ instead.
- You can browse the `LLVM project issue list`_ for the bugs found by
- `LLVM on OSS Fuzz`_. These are also mailed to the `llvm-bugs mailing
- list`_.
- .. _OSS Fuzz: https://github.com/google/oss-fuzz
- .. _LLVM project issue list:
- https://bugs.chromium.org/p/oss-fuzz/issues/list?q=Proj-llvm
- .. _LLVM on OSS Fuzz:
- https://github.com/google/oss-fuzz/blob/master/projects/llvm
- .. _llvm-bugs mailing list:
- http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
- Utilities for Writing Fuzzers
- =============================
- There are some utilities available for writing fuzzers in LLVM.
- Some helpers for handling the command line interface are available in
- ``include/llvm/FuzzMutate/FuzzerCLI.h``, including functions to parse command
- line options in a consistent way and to implement standalone main functions so
- your fuzzer can be built and tested when not built against libFuzzer.
- There is also some handling of the CMake config for fuzzers, where you should
- use the ``add_llvm_fuzzer`` function to set up fuzzer targets. This function
- works similarly to functions such as ``add_llvm_tool``, but it takes care of
- linking to libFuzzer when appropriate and can be passed the ``DUMMY_MAIN``
- argument to enable standalone testing.
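To make the idea of standalone testing concrete, here is a minimal sketch of
what a driver without libFuzzer might do. It is not the code from
``FuzzerCLI.h``; ``runOnFiles`` is a hypothetical function, and the fuzz target
is a trivial placeholder so that the example is self-contained.

.. code-block:: c++

   #include <cstddef>
   #include <cstdint>
   #include <fstream>
   #include <iostream>
   #include <iterator>
   #include <vector>

   // Placeholder fuzz target; a real fuzzer would exercise the code under test.
   extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
     (void)Data;
     (void)Size;
     return 0;
   }

   // Hypothetical stand-in for a standalone main: runs the target once on each
   // file named on the command line. A real driver would call it from main().
   int runOnFiles(int ArgC, char *ArgV[]) {
     for (int I = 1; I < ArgC; ++I) {
       std::ifstream File(ArgV[I], std::ios::binary);
       if (!File) {
         std::cerr << "cannot open " << ArgV[I] << "\n";
         return 1;
       }
       // Read the whole file into memory as raw bytes.
       std::vector<uint8_t> Bytes((std::istreambuf_iterator<char>(File)),
                                  std::istreambuf_iterator<char>());
       std::cerr << "Running: " << ArgV[I] << " (" << Bytes.size()
                 << " bytes)\n";
       LLVMFuzzerTestOneInput(Bytes.data(), Bytes.size());
     }
     return 0;
   }

A driver like this lets a crashing input be replayed under a debugger, or the
fuzzer be smoke-tested in a build that does not link libFuzzer.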