123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394 |
- =================
- SanitizerCoverage
- =================
- .. contents::
- :local:
- Introduction
- ============
- LLVM has a simple code coverage instrumentation built in (SanitizerCoverage).
- It inserts calls to user-defined functions on function-, basic-block-, and edge- levels.
- Default implementations of those callbacks are provided and implement
- simple coverage reporting and visualization,
- however if you need *just* coverage visualization you may want to use
- :doc:`SourceBasedCodeCoverage <SourceBasedCodeCoverage>` instead.
- Tracing PCs with guards
- =======================
- With ``-fsanitize-coverage=trace-pc-guard`` the compiler will insert the following code
- on every edge:
- .. code-block:: none
- __sanitizer_cov_trace_pc_guard(&guard_variable)
- Every edge will have its own `guard_variable` (uint32_t).
- The compler will also insert calls to a module constructor:
- .. code-block:: c++
- // The guards are [start, stop).
- // This function will be called at least once per DSO and may be called
- // more than once with the same values of start/stop.
- __sanitizer_cov_trace_pc_guard_init(uint32_t *start, uint32_t *stop);
- With an additional ``...=trace-pc,indirect-calls`` flag
- ``__sanitizer_cov_trace_pc_indirect(void *callee)`` will be inserted on every indirect call.
- The functions `__sanitizer_cov_trace_pc_*` should be defined by the user.
- Example:
- .. code-block:: c++
- // trace-pc-guard-cb.cc
- #include <stdint.h>
- #include <stdio.h>
- #include <sanitizer/coverage_interface.h>
- // This callback is inserted by the compiler as a module constructor
- // into every DSO. 'start' and 'stop' correspond to the
- // beginning and end of the section with the guards for the entire
- // binary (executable or DSO). The callback will be called at least
- // once per DSO and may be called multiple times with the same parameters.
- extern "C" void __sanitizer_cov_trace_pc_guard_init(uint32_t *start,
- uint32_t *stop) {
- static uint64_t N; // Counter for the guards.
- if (start == stop || *start) return; // Initialize only once.
- printf("INIT: %p %p\n", start, stop);
- for (uint32_t *x = start; x < stop; x++)
- *x = ++N; // Guards should start from 1.
- }
- // This callback is inserted by the compiler on every edge in the
- // control flow (some optimizations apply).
- // Typically, the compiler will emit the code like this:
- // if(*guard)
- // __sanitizer_cov_trace_pc_guard(guard);
- // But for large functions it will emit a simple call:
- // __sanitizer_cov_trace_pc_guard(guard);
- extern "C" void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
- if (!*guard) return; // Duplicate the guard check.
- // If you set *guard to 0 this code will not be called again for this edge.
- // Now you can get the PC and do whatever you want:
- // store it somewhere or symbolize it and print right away.
- // The values of `*guard` are as you set them in
- // __sanitizer_cov_trace_pc_guard_init and so you can make them consecutive
- // and use them to dereference an array or a bit vector.
- void *PC = __builtin_return_address(0);
- char PcDescr[1024];
- // This function is a part of the sanitizer run-time.
- // To use it, link with AddressSanitizer or other sanitizer.
- __sanitizer_symbolize_pc(PC, "%p %F %L", PcDescr, sizeof(PcDescr));
- printf("guard: %p %x PC %s\n", guard, *guard, PcDescr);
- }
- .. code-block:: c++
- // trace-pc-guard-example.cc
- void foo() { }
- int main(int argc, char **argv) {
- if (argc > 1) foo();
- }
- .. code-block:: console
-
- clang++ -g -fsanitize-coverage=trace-pc-guard trace-pc-guard-example.cc -c
- clang++ trace-pc-guard-cb.cc trace-pc-guard-example.o -fsanitize=address
- ASAN_OPTIONS=strip_path_prefix=`pwd`/ ./a.out
- .. code-block:: console
- INIT: 0x71bcd0 0x71bce0
- guard: 0x71bcd4 2 PC 0x4ecd5b in main trace-pc-guard-example.cc:2
- guard: 0x71bcd8 3 PC 0x4ecd9e in main trace-pc-guard-example.cc:3:7
- .. code-block:: console
- ASAN_OPTIONS=strip_path_prefix=`pwd`/ ./a.out with-foo
- .. code-block:: console
- INIT: 0x71bcd0 0x71bce0
- guard: 0x71bcd4 2 PC 0x4ecd5b in main trace-pc-guard-example.cc:3
- guard: 0x71bcdc 4 PC 0x4ecdc7 in main trace-pc-guard-example.cc:4:17
- guard: 0x71bcd0 1 PC 0x4ecd20 in foo() trace-pc-guard-example.cc:2:14
- Inline 8bit-counters
- ====================
- **Experimental, may change or disappear in future**
- With ``-fsanitize-coverage=inline-8bit-counters`` the compiler will insert
- inline counter increments on every edge.
- This is similar to ``-fsanitize-coverage=trace-pc-guard`` but instead of a
- callback the instrumentation simply increments a counter.
- Users need to implement a single function to capture the counters at startup.
- .. code-block:: c++
- extern "C"
- void __sanitizer_cov_8bit_counters_init(char *start, char *end) {
- // [start,end) is the array of 8-bit counters created for the current DSO.
- // Capture this array in order to read/modify the counters.
- }
- PC-Table
- ========
- **Experimental, may change or disappear in future**
- **Note:** this instrumentation might be incompatible with dead code stripping
- (``-Wl,-gc-sections``) for linkers other than LLD, thus resulting in a
- significant binary size overhead. For more information, see
- `Bug 34636 <https://bugs.llvm.org/show_bug.cgi?id=34636>`_.
- With ``-fsanitize-coverage=pc-table`` the compiler will create a table of
- instrumented PCs. Requires either ``-fsanitize-coverage=inline-8bit-counters`` or
- ``-fsanitize-coverage=trace-pc-guard``.
- Users need to implement a single function to capture the PC table at startup:
- .. code-block:: c++
- extern "C"
- void __sanitizer_cov_pcs_init(const uintptr_t *pcs_beg,
- const uintptr_t *pcs_end) {
- // [pcs_beg,pcs_end) is the array of ptr-sized integers representing
- // pairs [PC,PCFlags] for every instrumented block in the current DSO.
- // Capture this array in order to read the PCs and their Flags.
- // The number of PCs and PCFlags for a given DSO is the same as the number
- // of 8-bit counters (-fsanitize-coverage=inline-8bit-counters) or
- // trace_pc_guard callbacks (-fsanitize-coverage=trace-pc-guard)
- // A PCFlags describes the basic block:
- // * bit0: 1 if the block is the function entry block, 0 otherwise.
- }
- Tracing PCs
- ===========
- With ``-fsanitize-coverage=trace-pc`` the compiler will insert
- ``__sanitizer_cov_trace_pc()`` on every edge.
- With an additional ``...=trace-pc,indirect-calls`` flag
- ``__sanitizer_cov_trace_pc_indirect(void *callee)`` will be inserted on every indirect call.
- These callbacks are not implemented in the Sanitizer run-time and should be defined
- by the user.
- This mechanism is used for fuzzing the Linux kernel
- (https://github.com/google/syzkaller).
- Instrumentation points
- ======================
- Sanitizer Coverage offers different levels of instrumentation.
- * ``edge`` (default): edges are instrumented (see below).
- * ``bb``: basic blocks are instrumented.
- * ``func``: only the entry block of every function will be instrumented.
- Use these flags together with ``trace-pc-guard`` or ``trace-pc``,
- like this: ``-fsanitize-coverage=func,trace-pc-guard``.
- When ``edge`` or ``bb`` is used, some of the edges/blocks may still be left
- uninstrumented (pruned) if such instrumentation is considered redundant.
- Use ``no-prune`` (e.g. ``-fsanitize-coverage=bb,no-prune,trace-pc-guard``)
- to disable pruning. This could be useful for better coverage visualization.
- Edge coverage
- -------------
- Consider this code:
- .. code-block:: c++
- void foo(int *a) {
- if (a)
- *a = 0;
- }
- It contains 3 basic blocks, let's name them A, B, C:
- .. code-block:: none
- A
- |\
- | \
- | B
- | /
- |/
- C
- If blocks A, B, and C are all covered we know for certain that the edges A=>B
- and B=>C were executed, but we still don't know if the edge A=>C was executed.
- Such edges of control flow graph are called
- `critical <https://en.wikipedia.org/wiki/Control_flow_graph#Special_edges>`_.
- The edge-level coverage simply splits all critical edges by introducing new
- dummy blocks and then instruments those blocks:
- .. code-block:: none
- A
- |\
- | \
- D B
- | /
- |/
- C
- Tracing data flow
- =================
- Support for data-flow-guided fuzzing.
- With ``-fsanitize-coverage=trace-cmp`` the compiler will insert extra instrumentation
- around comparison instructions and switch statements.
- Similarly, with ``-fsanitize-coverage=trace-div`` the compiler will instrument
- integer division instructions (to capture the right argument of division)
- and with ``-fsanitize-coverage=trace-gep`` --
- the `LLVM GEP instructions <https://llvm.org/docs/GetElementPtr.html>`_
- (to capture array indices).
- Unless ``no-prune`` option is provided, some of the comparison instructions
- will not be instrumented.
- .. code-block:: c++
- // Called before a comparison instruction.
- // Arg1 and Arg2 are arguments of the comparison.
- void __sanitizer_cov_trace_cmp1(uint8_t Arg1, uint8_t Arg2);
- void __sanitizer_cov_trace_cmp2(uint16_t Arg1, uint16_t Arg2);
- void __sanitizer_cov_trace_cmp4(uint32_t Arg1, uint32_t Arg2);
- void __sanitizer_cov_trace_cmp8(uint64_t Arg1, uint64_t Arg2);
- // Called before a comparison instruction if exactly one of the arguments is constant.
- // Arg1 and Arg2 are arguments of the comparison, Arg1 is a compile-time constant.
- // These callbacks are emitted by -fsanitize-coverage=trace-cmp since 2017-08-11
- void __sanitizer_cov_trace_const_cmp1(uint8_t Arg1, uint8_t Arg2);
- void __sanitizer_cov_trace_const_cmp2(uint16_t Arg1, uint16_t Arg2);
- void __sanitizer_cov_trace_const_cmp4(uint32_t Arg1, uint32_t Arg2);
- void __sanitizer_cov_trace_const_cmp8(uint64_t Arg1, uint64_t Arg2);
- // Called before a switch statement.
- // Val is the switch operand.
- // Cases[0] is the number of case constants.
- // Cases[1] is the size of Val in bits.
- // Cases[2:] are the case constants.
- void __sanitizer_cov_trace_switch(uint64_t Val, uint64_t *Cases);
- // Called before a division statement.
- // Val is the second argument of division.
- void __sanitizer_cov_trace_div4(uint32_t Val);
- void __sanitizer_cov_trace_div8(uint64_t Val);
- // Called before a GetElemementPtr (GEP) instruction
- // for every non-constant array index.
- void __sanitizer_cov_trace_gep(uintptr_t Idx);
- Default implementation
- ======================
- The sanitizer run-time (AddressSanitizer, MemorySanitizer, etc) provide a
- default implementations of some of the coverage callbacks.
- You may use this implementation to dump the coverage on disk at the process
- exit.
- Example:
- .. code-block:: console
- % cat -n cov.cc
- 1 #include <stdio.h>
- 2 __attribute__((noinline))
- 3 void foo() { printf("foo\n"); }
- 4
- 5 int main(int argc, char **argv) {
- 6 if (argc == 2)
- 7 foo();
- 8 printf("main\n");
- 9 }
- % clang++ -g cov.cc -fsanitize=address -fsanitize-coverage=trace-pc-guard
- % ASAN_OPTIONS=coverage=1 ./a.out; wc -c *.sancov
- main
- SanitizerCoverage: ./a.out.7312.sancov 2 PCs written
- 24 a.out.7312.sancov
- % ASAN_OPTIONS=coverage=1 ./a.out foo ; wc -c *.sancov
- foo
- main
- SanitizerCoverage: ./a.out.7316.sancov 3 PCs written
- 24 a.out.7312.sancov
- 32 a.out.7316.sancov
- Every time you run an executable instrumented with SanitizerCoverage
- one ``*.sancov`` file is created during the process shutdown.
- If the executable is dynamically linked against instrumented DSOs,
- one ``*.sancov`` file will be also created for every DSO.
- Sancov data format
- ------------------
- The format of ``*.sancov`` files is very simple: the first 8 bytes is the magic,
- one of ``0xC0BFFFFFFFFFFF64`` and ``0xC0BFFFFFFFFFFF32``. The last byte of the
- magic defines the size of the following offsets. The rest of the data is the
- offsets in the corresponding binary/DSO that were executed during the run.
- Sancov Tool
- -----------
- An simple ``sancov`` tool is provided to process coverage files.
- The tool is part of LLVM project and is currently supported only on Linux.
- It can handle symbolization tasks autonomously without any extra support
- from the environment. You need to pass .sancov files (named
- ``<module_name>.<pid>.sancov`` and paths to all corresponding binary elf files.
- Sancov matches these files using module names and binaries file names.
- .. code-block:: console
- USAGE: sancov [options] <action> (<binary file>|<.sancov file>)...
- Action (required)
- -print - Print coverage addresses
- -covered-functions - Print all covered functions.
- -not-covered-functions - Print all not covered functions.
- -symbolize - Symbolizes the report.
- Options
- -blacklist=<string> - Blacklist file (sanitizer blacklist format).
- -demangle - Print demangled function name.
- -strip_path_prefix=<string> - Strip this prefix from file paths in reports
- Coverage Reports
- ----------------
- **Experimental**
- ``.sancov`` files do not contain enough information to generate a source-level
- coverage report. The missing information is contained
- in debug info of the binary. Thus the ``.sancov`` has to be symbolized
- to produce a ``.symcov`` file first:
- .. code-block:: console
- sancov -symbolize my_program.123.sancov my_program > my_program.123.symcov
- The ``.symcov`` file can be browsed overlayed over the source code by
- running ``tools/sancov/coverage-report-server.py`` script that will start
- an HTTP server.
- Output directory
- ----------------
- By default, .sancov files are created in the current working directory.
- This can be changed with ``ASAN_OPTIONS=coverage_dir=/path``:
- .. code-block:: console
- % ASAN_OPTIONS="coverage=1:coverage_dir=/tmp/cov" ./a.out foo
- % ls -l /tmp/cov/*sancov
- -rw-r----- 1 kcc eng 4 Nov 27 12:21 a.out.22673.sancov
- -rw-r----- 1 kcc eng 8 Nov 27 12:21 a.out.22679.sancov
|