- ==========================
- UndefinedBehaviorSanitizer
- ==========================
- .. contents::
- :local:
- Introduction
- ============
- UndefinedBehaviorSanitizer (UBSan) is a fast undefined behavior detector.
- UBSan modifies the program at compile-time to catch various kinds of undefined
- behavior during program execution, for example:
- * Using a misaligned or null pointer
- * Signed integer overflow
- * Conversion to, from, or between floating-point types which would
- overflow the destination
- See the full list of available :ref:`checks <ubsan-checks>` below.
- UBSan has an optional run-time library which provides better error reporting.
- The checks have a small runtime cost and no impact on address space layout or ABI.
- How to build
- ============
- Build LLVM/Clang with `CMake <https://llvm.org/docs/CMake.html>`_.
- Usage
- =====
- Use ``clang++`` to compile and link your program with the ``-fsanitize=undefined``
- flag. Make sure to use ``clang++`` (not ``ld``) as the linker, so that your
- executable is linked with the proper UBSan runtime libraries. You can use ``clang``
- instead of ``clang++`` if you're compiling/linking C code.
- .. code-block:: console
- % cat test.cc
- int main(int argc, char **argv) {
-   int k = 0x7fffffff;
-   k += argc;
-   return 0;
- }
- % clang++ -fsanitize=undefined test.cc
- % ./a.out
- test.cc:3:5: runtime error: signed integer overflow: 2147483647 + 1 cannot be represented in type 'int'
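As a hedged illustration of how the overflow reported above might be avoided, the sketch below performs the addition through ``__builtin_add_overflow``, a builtin provided by Clang and GCC. The helper name ``checked_add`` is hypothetical and is not part of UBSan.

```cpp
#include <cassert>

// Hypothetical helper (not part of UBSan): adds two ints without ever
// executing an overflowing signed addition, so there is nothing for
// -fsanitize=signed-integer-overflow to report.
// Returns true and stores the sum in *out on success, false on overflow.
bool checked_add(int a, int b, int *out) {
  assert(out != nullptr);
  // __builtin_add_overflow returns true when the mathematically exact
  // result does not fit in *out.
  return !__builtin_add_overflow(a, b, out);
}
```

With this helper, ``checked_add(0x7fffffff, 1, &k)`` returns ``false`` instead of overflowing, and the caller decides how to handle the failure.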
- You can enable only a subset of :ref:`checks <ubsan-checks>` offered by UBSan,
- and define the desired behavior for each kind of check:
- * ``-fsanitize=...``: print a verbose error report and continue execution (default);
- * ``-fno-sanitize-recover=...``: print a verbose error report and exit the program;
- * ``-fsanitize-trap=...``: execute a trap instruction (doesn't require UBSan run-time support).
- For example, if you compile/link your program as:
- .. code-block:: console
- % clang++ -fsanitize=signed-integer-overflow,null,alignment -fno-sanitize-recover=null -fsanitize-trap=alignment
- the program will continue execution after signed integer overflows, exit after
- the first invalid use of a null pointer, and trap after the first use of a misaligned
- pointer.
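To make the three behaviors concrete, here is a minimal sketch of code that each of those checks would instrument. The function names are hypothetical. Each function has defined behavior for ordinary inputs, and the comments describe which inputs would be reported and what the flags in the command above would then do.

```cpp
#include <cassert>

// Reported by signed-integer-overflow when k == INT_MAX. With the flags
// above, the program prints a report and continues.
int increment(int k) { return k + 1; }

// Reported by null when p is a null pointer. With
// -fno-sanitize-recover=null, the program prints a report and exits.
int deref(const int *p) { return *p; }

// Reported by alignment when p is not suitably aligned for int. With
// -fsanitize-trap=alignment, the program executes a trap instruction.
int load_int(const void *p) { return *static_cast<const int *>(p); }

// Calls each function with inputs for which the behavior is defined,
// so no check fires here.
void run_defined_cases() {
  int x = 7;
  assert(increment(41) == 42);
  assert(deref(&x) == 7);
  assert(load_int(&x) == 7);
}
```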
- .. _ubsan-checks:
- Available checks
- ================
- Available checks are:
- - ``-fsanitize=alignment``: Use of a misaligned pointer or creation
- of a misaligned reference. Also sanitizes assume_aligned-like attributes.
- - ``-fsanitize=bool``: Load of a ``bool`` value which is neither
- ``true`` nor ``false``.
- - ``-fsanitize=builtin``: Passing invalid values to compiler builtins.
- - ``-fsanitize=bounds``: Out of bounds array indexing, in cases
- where the array bound can be statically determined.
- - ``-fsanitize=enum``: Load of a value of an enumerated type which
- is not in the range of representable values for that enumerated
- type.
- - ``-fsanitize=float-cast-overflow``: Conversion to, from, or
- between floating-point types which would overflow the
- destination. Because the range of representable values for all
- floating-point types supported by Clang is [-inf, +inf], the only
- cases detected are conversions from floating point to integer types.
- - ``-fsanitize=float-divide-by-zero``: Floating point division by
- zero. This is undefined per the C and C++ standards, but is defined
- by Clang (and by ISO/IEC/IEEE 60559 / IEEE 754) as producing either an
- infinity or NaN value, so is not included in ``-fsanitize=undefined``.
- - ``-fsanitize=function``: Indirect call of a function through a
- function pointer of the wrong type (Darwin/Linux, C++ and x86/x86_64
- only).
- - ``-fsanitize=implicit-unsigned-integer-truncation``,
- ``-fsanitize=implicit-signed-integer-truncation``: Implicit conversion from
- integer of larger bit width to smaller bit width, if that results in data
- loss. That is, if the demoted value, after casting back to the original
- width, is not equal to the original value before the downcast.
- The ``-fsanitize=implicit-unsigned-integer-truncation`` handles conversions
- between two ``unsigned`` types, while
- ``-fsanitize=implicit-signed-integer-truncation`` handles the rest of the
- conversions - when either one, or both of the types are signed.
- Issues caught by these sanitizers are not undefined behavior,
- but are often unintentional.
- - ``-fsanitize=implicit-integer-sign-change``: Implicit conversion between
- integer types, if that changes the sign of the value. That is, if the
- original value was negative and the new value is positive (or zero),
- or the original value was positive, and the new value is negative.
- Issues caught by this sanitizer are not undefined behavior,
- but are often unintentional.
- - ``-fsanitize=integer-divide-by-zero``: Integer division by zero.
- - ``-fsanitize=nonnull-attribute``: Passing a null pointer as a function
- parameter which is declared to never be null.
- - ``-fsanitize=null``: Use of a null pointer or creation of a null
- reference.
- - ``-fsanitize=nullability-arg``: Passing null as a function parameter
- which is annotated with ``_Nonnull``.
- - ``-fsanitize=nullability-assign``: Assigning null to an lvalue which
- is annotated with ``_Nonnull``.
- - ``-fsanitize=nullability-return``: Returning null from a function with
- a return type annotated with ``_Nonnull``.
- - ``-fsanitize=object-size``: An attempt to potentially use bytes which
- the optimizer can determine are not part of the object being accessed.
- This will also detect some types of undefined behavior that may not
- directly access memory, but are provably incorrect given the size of
- the objects involved, such as invalid downcasts and calling methods on
- invalid pointers. These checks are made in terms of
- ``__builtin_object_size``, and consequently may be able to detect more
- problems at higher optimization levels.
- - ``-fsanitize=pointer-overflow``: Performing pointer arithmetic which
- overflows, or where either the old or new pointer value is a null pointer
- (or in C, when they both are).
- - ``-fsanitize=return``: In C++, reaching the end of a
- value-returning function without returning a value.
- - ``-fsanitize=returns-nonnull-attribute``: Returning a null pointer
- from a function which is declared to never return null.
- - ``-fsanitize=shift``: Shift operators where the amount shifted is
- greater than or equal to the promoted bit-width of the left hand side
- or less than zero, or where the left hand side is negative. For a
- signed left shift, also checks for signed overflow in C, and for
- unsigned overflow in C++. You can use ``-fsanitize=shift-base`` or
- ``-fsanitize=shift-exponent`` to check only left-hand side or
- right-hand side of shift operation, respectively.
- - ``-fsanitize=signed-integer-overflow``: Signed integer overflow, where the
- result of a signed integer computation cannot be represented in its type.
- This includes all the checks covered by ``-ftrapv``, as well as checks for
- signed division overflow (``INT_MIN/-1``), but not checks for
- lossy implicit conversions performed before the computation. Those
- conversions are handled by the ``-fsanitize=implicit-conversion`` group of
- checks.
- - ``-fsanitize=unreachable``: If control flow reaches an unreachable
- program point.
- - ``-fsanitize=unsigned-integer-overflow``: Unsigned integer overflow, where
- the result of an unsigned integer computation cannot be represented in its
- type. Unlike signed integer overflow, this is not undefined behavior, but
- it is often unintentional. This sanitizer does not check for lossy implicit
- conversions performed before such a computation
- (see ``-fsanitize=implicit-conversion``).
- - ``-fsanitize=vla-bound``: A variable-length array whose bound
- does not evaluate to a positive value.
- - ``-fsanitize=vptr``: Use of an object whose vptr indicates that it is of
- the wrong dynamic type, or that its lifetime has not begun or has ended.
- Incompatible with ``-fno-rtti``. Link must be performed by ``clang++``, not
- ``clang``, to make sure C++-specific parts of the runtime library and C++
- standard libraries are present.
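As one example of how a listed check maps onto code, the sketch below guards a left shift against the shift-amount condition described for ``-fsanitize=shift``. The helper name ``checked_shl`` is hypothetical, and the sketch assumes a platform where ``std::uint32_t`` is not promoted to a wider signed type.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: left-shifts a 32-bit unsigned value only when the
// shift amount is non-negative and smaller than the bit width.
// Returns true and stores the result in *out on success, false otherwise.
bool checked_shl(std::uint32_t value, int amount, std::uint32_t *out) {
  assert(out != nullptr);
  const int width = std::numeric_limits<std::uint32_t>::digits;  // 32 bits
  if (amount < 0 || amount >= width)
    return false;  // the situation -fsanitize=shift-exponent reports
  *out = value << amount;
  return true;
}
```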
- You can also use the following check groups:
- - ``-fsanitize=undefined``: All of the checks listed above other than
- ``float-divide-by-zero``, ``unsigned-integer-overflow``,
- ``implicit-conversion``, and the ``nullability-*`` group of checks.
- - ``-fsanitize=undefined-trap``: Deprecated alias of
- ``-fsanitize=undefined``.
- - ``-fsanitize=implicit-integer-truncation``: Catches lossy integral
- conversions. Enables ``implicit-signed-integer-truncation`` and
- ``implicit-unsigned-integer-truncation``.
- - ``-fsanitize=implicit-integer-arithmetic-value-change``: Catches implicit
- conversions that change the arithmetic value of the integer. Enables
- ``implicit-signed-integer-truncation`` and ``implicit-integer-sign-change``.
- - ``-fsanitize=implicit-conversion``: Checks for suspicious
- behavior of implicit conversions. Enables
- ``implicit-unsigned-integer-truncation``,
- ``implicit-signed-integer-truncation``, and
- ``implicit-integer-sign-change``.
- - ``-fsanitize=integer``: Checks for undefined or suspicious integer
- behavior (e.g. unsigned integer overflow).
- Enables ``signed-integer-overflow``, ``unsigned-integer-overflow``,
- ``shift``, ``integer-divide-by-zero``,
- ``implicit-unsigned-integer-truncation``,
- ``implicit-signed-integer-truncation``, and
- ``implicit-integer-sign-change``.
- - ``-fsanitize=nullability``: Enables ``nullability-arg``,
- ``nullability-assign``, and ``nullability-return``. While violating
- nullability does not have undefined behavior, it is often unintentional,
- so UBSan offers to catch it.
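The conditions behind the ``implicit-conversion`` group can be written out as plain predicates. The following is a sketch with hypothetical function names, fixed to 32-bit and 8-bit types for brevity. It uses explicit casts, which these checks do not instrument, to express when an implicit conversion of the same value would be reported.

```cpp
#include <cstdint>
#include <limits>

// True when converting `value` to an 8-bit unsigned type loses data: the
// narrowed value, widened back, differs from the original. An implicit
// conversion of such a value is what
// -fsanitize=implicit-unsigned-integer-truncation reports.
constexpr bool truncation_loses_data(std::uint32_t value) {
  return static_cast<std::uint32_t>(static_cast<std::uint8_t>(value)) != value;
}

// True when converting a signed 32-bit value to unsigned 32-bit changes
// its sign: a negative value becomes non-negative.
constexpr bool sign_changes_to_unsigned(std::int32_t value) {
  return value < 0;
}

// True when converting an unsigned 32-bit value to signed 32-bit changes
// its sign: a value above INT32_MAX becomes negative.
constexpr bool sign_changes_to_signed(std::uint32_t value) {
  return value >
         static_cast<std::uint32_t>(std::numeric_limits<std::int32_t>::max());
}

static_assert(!truncation_loses_data(255), "255 fits in 8 bits");
static_assert(truncation_loses_data(256), "256 does not fit in 8 bits");
```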
- Volatile
- --------
- The ``null``, ``alignment``, ``object-size``, and ``vptr`` checks do not apply
- to pointers to types with the ``volatile`` qualifier.
- Minimal Runtime
- ===============
- There is a minimal UBSan runtime available suitable for use in production
- environments. This runtime has a small attack surface. It only provides very
- basic issue logging and deduplication, and does not support
- ``-fsanitize=function`` and ``-fsanitize=vptr`` checking.
- To use the minimal runtime, add ``-fsanitize-minimal-runtime`` to the clang
- command line options. For example, if you're used to compiling with
- ``-fsanitize=undefined``, you could enable the minimal runtime with
- ``-fsanitize=undefined -fsanitize-minimal-runtime``.
- Stack traces and report symbolization
- =====================================
- If you want UBSan to print a symbolized stack trace for each error report, you
- will need to:
- #. Compile with ``-g`` and ``-fno-omit-frame-pointer`` to get proper debug
- information in your binary.
- #. Run your program with the environment variable
- ``UBSAN_OPTIONS=print_stacktrace=1``.
- #. Make sure the ``llvm-symbolizer`` binary is in ``PATH``.
- Logging
- =======
- The default log file for diagnostics is "stderr". To log diagnostics to another
- file, you can set ``UBSAN_OPTIONS=log_path=...``.
- Silencing Unsigned Integer Overflow
- ===================================
- To silence reports from unsigned integer overflow, you can set
- ``UBSAN_OPTIONS=silence_unsigned_overflow=1``. This feature, combined with
- ``-fsanitize-recover=unsigned-integer-overflow``, is particularly useful for
- providing fuzzing signal without blowing up logs.
- Issue Suppression
- =================
- UndefinedBehaviorSanitizer is not expected to produce false positives.
- If you see one, look again; most likely it is a true positive!
- Disabling Instrumentation with ``__attribute__((no_sanitize("undefined")))``
- ----------------------------------------------------------------------------
- You can disable UBSan checks for particular functions with
- ``__attribute__((no_sanitize("undefined")))``. You can use all values of the
- ``-fsanitize=`` flag in this attribute, e.g. if your function deliberately
- contains possible signed integer overflow, you can use
- ``__attribute__((no_sanitize("signed-integer-overflow")))``.
- This attribute may not be
- supported by other compilers, so consider using it together with
- ``#if defined(__clang__)``.
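A minimal sketch of that pattern follows. The macro name ``NO_SANITIZE_UNSIGNED_OVERFLOW`` and the function ``fnv1a`` are illustrative choices, not something UBSan provides. The example uses ``unsigned-integer-overflow`` because the wraparound in a hash function is intentional and well defined, which makes it a typical candidate for this attribute.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Expand to the attribute only on Clang; other compilers may not accept it.
#if defined(__clang__)
#define NO_SANITIZE_UNSIGNED_OVERFLOW \
  __attribute__((no_sanitize("unsigned-integer-overflow")))
#else
#define NO_SANITIZE_UNSIGNED_OVERFLOW
#endif

// 32-bit FNV-1a hash. The multiplication wraps around on purpose, so the
// unsigned-integer-overflow check is disabled for this function only.
NO_SANITIZE_UNSIGNED_OVERFLOW
std::uint32_t fnv1a(const char *data, std::size_t len) {
  assert(data != nullptr || len == 0);
  std::uint32_t hash = 2166136261u;  // FNV offset basis
  for (std::size_t i = 0; i < len; ++i) {
    hash ^= static_cast<unsigned char>(data[i]);
    hash *= 16777619u;  // FNV prime; wraps modulo 2^32
  }
  return hash;
}
```

Scoping the attribute to a single small function keeps the check active for the rest of the translation unit.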
- Suppressing Errors in Recompiled Code (Blacklist)
- -------------------------------------------------
- UndefinedBehaviorSanitizer supports ``src`` and ``fun`` entity types in
- :doc:`SanitizerSpecialCaseList`, that can be used to suppress error reports
- in the specified source files or functions.
- Runtime suppressions
- --------------------
- Sometimes you can suppress UBSan error reports for specific files, functions,
- or libraries without recompiling the code. You need to pass the path to a
- suppression file in the ``UBSAN_OPTIONS`` environment variable.
- .. code-block:: bash
- UBSAN_OPTIONS=suppressions=MyUBSan.supp
- You need to specify a :ref:`check <ubsan-checks>` you are suppressing and the
- bug location. For example:
- .. code-block:: bash
- signed-integer-overflow:file-with-known-overflow.cpp
- alignment:function_doing_unaligned_access
- vptr:shared_object_with_vptr_failures.so
- There are several limitations:
- * Sometimes your binary must have enough debug info and/or symbol table, so
- that the runtime can figure out the source file or function name to match
- against the suppression.
- * It is only possible to suppress recoverable checks. For the example above,
- you can additionally pass
- ``-fsanitize-recover=signed-integer-overflow,alignment,vptr``, although
- most UBSan checks are recoverable by default.
- * Check groups (like ``undefined``) can't be used in a suppressions file, only
- fine-grained checks are supported.
- Supported Platforms
- ===================
- UndefinedBehaviorSanitizer is supported on the following operating systems:
- * Android
- * Linux
- * NetBSD
- * FreeBSD
- * OpenBSD
- * macOS
- * Windows
- The runtime library is relatively portable and platform independent. If the OS
- you need is not listed above, UndefinedBehaviorSanitizer may already work for
- it, or could be made to work with a minor porting effort.
- Current Status
- ==============
- UndefinedBehaviorSanitizer is available on selected platforms starting from LLVM
- 3.3. The test suite is integrated into the CMake build and can be run with the
- ``check-ubsan`` command.
- Additional Configuration
- ========================
- UndefinedBehaviorSanitizer adds static check data for each check unless it is
- in trap mode. This check data includes the full file name. The option
- ``-fsanitize-undefined-strip-path-components=N`` can be used to trim this
- information. If ``N`` is positive, file information emitted by
- UndefinedBehaviorSanitizer will drop the first ``N`` components from the file
- path. If ``N`` is negative, only the last ``|N|`` components will be kept.
- Example
- -------
- For a file called ``/code/library/file.cpp``, here is what would be emitted:
- * Default (No flag, or ``-fsanitize-undefined-strip-path-components=0``): ``/code/library/file.cpp``
- * ``-fsanitize-undefined-strip-path-components=1``: ``code/library/file.cpp``
- * ``-fsanitize-undefined-strip-path-components=2``: ``library/file.cpp``
- * ``-fsanitize-undefined-strip-path-components=-1``: ``file.cpp``
- * ``-fsanitize-undefined-strip-path-components=-2``: ``library/file.cpp``
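The table above can be reproduced with a small model of the trimming rule. This is only an illustrative sketch, not Clang's implementation. The function name ``strip_path_components`` is hypothetical, the leading ``/`` is counted as one component to match the examples, and falling back to the file name when ``N`` exceeds the number of components is an assumption of the sketch.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <string>
#include <vector>

// Illustrative model of -fsanitize-undefined-strip-path-components=N for
// '/'-separated paths. Not Clang's code; edge cases are assumptions.
std::string strip_path_components(const std::string &path, int n) {
  assert(n > INT_MIN);  // -n must be representable
  // Split into components; a leading '/' counts as the root component.
  std::vector<std::string> parts;
  std::size_t i = 0;
  if (!path.empty() && path[0] == '/') {
    parts.push_back("/");
    i = 1;
  }
  while (i < path.size()) {
    std::size_t j = path.find('/', i);
    if (j == std::string::npos) j = path.size();
    if (j > i) parts.push_back(path.substr(i, j - i));
    i = j + 1;
  }
  if (n == 0 || parts.empty()) return path;

  // Index of the first component that is kept.
  const std::size_t count = parts.size();
  std::size_t first = 0;
  if (n > 0) {
    // Drop the first n components, but always keep the file name (assumed).
    first = static_cast<std::size_t>(n) < count ? static_cast<std::size_t>(n)
                                                : count - 1;
  } else {
    // Keep only the last -n components.
    const std::size_t keep = static_cast<std::size_t>(-n);
    first = keep < count ? count - keep : 0;
  }

  // Re-join the kept components.
  std::string out;
  for (std::size_t k = first; k < count; ++k) {
    if (parts[k] == "/") { out += '/'; continue; }
    if (!out.empty() && out.back() != '/') out += '/';
    out += parts[k];
  }
  return out;
}
```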
- More Information
- ================
- * From the LLVM project blog:
- `What Every C Programmer Should Know About Undefined Behavior
- <http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html>`_
- * From John Regehr's *Embedded in Academia* blog:
- `A Guide to Undefined Behavior in C and C++
- <https://blog.regehr.org/archives/213>`_