123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453 |
- = Tracing =
- == Introduction ==
- This document describes the tracing infrastructure in QEMU and how to use it
- for debugging, profiling, and observing execution.
- == Quickstart ==
- 1. Build with the 'simple' trace backend:
- ./configure --enable-trace-backends=simple
- make
- 2. Create a file with the events you want to trace:
- echo memory_region_ops_read >/tmp/events
- 3. Run the virtual machine to produce a trace file:
- qemu --trace events=/tmp/events ... # your normal QEMU invocation
- 4. Pretty-print the binary trace file:
- ./scripts/simpletrace.py trace-events-all trace-* # Override * with QEMU <pid>
- == Trace events ==
- === Sub-directory setup ===
- Each directory in the source tree can declare a set of static trace events
- in a local "trace-events" file. All directories which contain "trace-events"
- files must be listed in the "trace-events-subdirs" make variable in the top
- level Makefile.objs. During build, the "trace-events" file in each listed
- subdirectory will be processed by the "tracetool" script to generate code for
- the trace events.
- The individual "trace-events" files are merged into a "trace-events-all" file,
- which is also installed into "/usr/share/qemu" with the name "trace-events".
- This merged file is to be used by the "simpletrace.py" script to later analyse
- traces in the simpletrace data format.
- In the sub-directory the following files will be automatically generated
- - trace.c - the trace event state declarations
- - trace.h - the trace event enums and probe functions
- - trace-dtrace.h - DTrace event probe specification
- - trace-dtrace.dtrace - DTrace event probe helper declaration
- - trace-dtrace.o - binary DTrace provider (generated by dtrace)
- - trace-ust.h - UST event probe helper declarations
- Source files in the sub-directory should #include the local 'trace.h' file,
- without any sub-directory path prefix. eg io/channel-buffer.c would do
- #include "trace.h"
- To access the 'io/trace.h' file. While it is possible to include a trace.h
- file from outside a source files' own sub-directory, this is discouraged in
- general. It is strongly preferred that all events be declared directly in
- the sub-directory that uses them. The only exception is where there are some
- shared trace events defined in the top level directory trace-events file.
- The top level directory generates trace files with a filename prefix of
- "trace-root" instead of just "trace". This is to avoid ambiguity between
- a trace.h in the current directory, vs the top level directory.
- === Using trace events ===
- Trace events are invoked directly from source code like this:
- #include "trace.h" /* needed for trace event prototype */
-
- void *qemu_vmalloc(size_t size)
- {
- void *ptr;
- size_t align = QEMU_VMALLOC_ALIGN;
-
- if (size < align) {
- align = getpagesize();
- }
- ptr = qemu_memalign(align, size);
- trace_qemu_vmalloc(size, ptr);
- return ptr;
- }
- === Declaring trace events ===
- The "tracetool" script produces the trace.h header file which is included by
- every source file that uses trace events. Since many source files include
- trace.h, it uses a minimum of types and other header files included to keep the
- namespace clean and compile times and dependencies down.
- Trace events should use types as follows:
- * Use stdint.h types for fixed-size types. Most offsets and guest memory
- addresses are best represented with uint32_t or uint64_t. Use fixed-size
- types over primitive types whose size may change depending on the host
- (32-bit versus 64-bit) so trace events don't truncate values or break
- the build.
- * Use void * for pointers to structs or for arrays. The trace.h header
- cannot include all user-defined struct declarations and it is therefore
- necessary to use void * for pointers to structs.
- * For everything else, use primitive scalar types (char, int, long) with the
- appropriate signedness.
- * Avoid floating point types (float and double) because SystemTap does not
- support them. In most cases it is possible to round to an integer type
- instead. This may require scaling the value first by multiplying it by 1000
- or the like when digits after the decimal point need to be preserved.
- Format strings should reflect the types defined in the trace event. Take
- special care to use PRId64 and PRIu64 for int64_t and uint64_t types,
- respectively. This ensures portability between 32- and 64-bit platforms.
- Format strings must not end with a newline character. It is the responsibility
- of backends to adapt line ending for proper logging.
- Each event declaration will start with the event name, then its arguments,
- finally a format string for pretty-printing. For example:
- qemu_vmalloc(size_t size, void *ptr) "size %zu ptr %p"
- qemu_vfree(void *ptr) "ptr %p"
- === Hints for adding new trace events ===
- 1. Trace state changes in the code. Interesting points in the code usually
- involve a state change like starting, stopping, allocating, freeing. State
- changes are good trace events because they can be used to understand the
- execution of the system.
- 2. Trace guest operations. Guest I/O accesses like reading device registers
- are good trace events because they can be used to understand guest
- interactions.
- 3. Use correlator fields so the context of an individual line of trace output
- can be understood. For example, trace the pointer returned by malloc and
- used as an argument to free. This way mallocs and frees can be matched up.
- Trace events with no context are not very useful.
- 4. Name trace events after their function. If there are multiple trace events
- in one function, append a unique distinguisher at the end of the name.
- == Generic interface and monitor commands ==
- You can programmatically query and control the state of trace events through a
- backend-agnostic interface provided by the header "trace/control.h".
- Note that some of the backends do not provide an implementation for some parts
- of this interface, in which case QEMU will just print a warning (please refer to
- header "trace/control.h" to see which routines are backend-dependent).
- The state of events can also be queried and modified through monitor commands:
- * info trace-events
- View available trace events and their state. State 1 means enabled, state 0
- means disabled.
- * trace-event NAME on|off
- Enable/disable a given trace event or a group of events (using wildcards).
- The "--trace events=<file>" command line argument can be used to enable the
- events listed in <file> from the very beginning of the program. This file must
- contain one event name per line.
- If a line in the "--trace events=<file>" file begins with a '-', the trace event
- will be disabled instead of enabled. This is useful when a wildcard was used
- to enable an entire family of events but one noisy event needs to be disabled.
- Wildcard matching is supported in both the monitor command "trace-event" and the
- events list file. That means you can enable/disable the events having a common
- prefix in a batch. For example, virtio-blk trace events could be enabled using
- the following monitor command:
- trace-event virtio_blk_* on
- == Trace backends ==
- The "tracetool" script automates tedious trace event code generation and also
- keeps the trace event declarations independent of the trace backend. The trace
- events are not tightly coupled to a specific trace backend, such as LTTng or
- SystemTap. Support for trace backends can be added by extending the "tracetool"
- script.
- The trace backends are chosen at configure time:
- ./configure --enable-trace-backends=simple
- For a list of supported trace backends, try ./configure --help or see below.
- If multiple backends are enabled, the trace is sent to them all.
- If no backends are explicitly selected, configure will default to the
- "log" backend.
- The following subsections describe the supported trace backends.
- === Nop ===
- The "nop" backend generates empty trace event functions so that the compiler
- can optimize out trace events completely. This imposes no performance
- penalty.
- Note that regardless of the selected trace backend, events with the "disable"
- property will be generated with the "nop" backend.
- === Log ===
- The "log" backend sends trace events directly to standard error. This
- effectively turns trace events into debug printfs.
- This is the simplest backend and can be used together with existing code that
- uses DPRINTF().
- === Simpletrace ===
- The "simple" backend supports common use cases and comes as part of the QEMU
- source tree. It may not be as powerful as platform-specific or third-party
- trace backends but it is portable. This is the recommended trace backend
- unless you have specific needs for more advanced backends.
- === Ftrace ===
- The "ftrace" backend writes trace data to ftrace marker. This effectively
- sends trace events to ftrace ring buffer, and you can compare qemu trace
- data and kernel(especially kvm.ko when using KVM) trace data.
- if you use KVM, enable kvm events in ftrace:
- # echo 1 > /sys/kernel/debug/tracing/events/kvm/enable
- After running qemu by root user, you can get the trace:
- # cat /sys/kernel/debug/tracing/trace
- Restriction: "ftrace" backend is restricted to Linux only.
- === Syslog ===
- The "syslog" backend sends trace events using the POSIX syslog API. The log
- is opened specifying the LOG_DAEMON facility and LOG_PID option (so events
- are tagged with the pid of the particular QEMU process that generated
- them). All events are logged at LOG_INFO level.
- NOTE: syslog may squash duplicate consecutive trace events and apply rate
- limiting.
- Restriction: "syslog" backend is restricted to POSIX compliant OS.
- ==== Monitor commands ====
- * trace-file on|off|flush|set <path>
- Enable/disable/flush the trace file or set the trace file name.
- ==== Analyzing trace files ====
- The "simple" backend produces binary trace files that can be formatted with the
- simpletrace.py script. The script takes the "trace-events-all" file and the
- binary trace:
- ./scripts/simpletrace.py trace-events-all trace-12345
- You must ensure that the same "trace-events-all" file was used to build QEMU,
- otherwise trace event declarations may have changed and output will not be
- consistent.
- === LTTng Userspace Tracer ===
- The "ust" backend uses the LTTng Userspace Tracer library. There are no
- monitor commands built into QEMU, instead UST utilities should be used to list,
- enable/disable, and dump traces.
- Package lttng-tools is required for userspace tracing. You must ensure that the
- current user belongs to the "tracing" group, or manually launch the
- lttng-sessiond daemon for the current user prior to running any instance of
- QEMU.
- While running an instrumented QEMU, LTTng should be able to list all available
- events:
- lttng list -u
- Create tracing session:
- lttng create mysession
- Enable events:
- lttng enable-event qemu:g_malloc -u
- Where the events can either be a comma-separated list of events, or "-a" to
- enable all tracepoint events. Start and stop tracing as needed:
- lttng start
- lttng stop
- View the trace:
- lttng view
- Destroy tracing session:
- lttng destroy
- Babeltrace can be used at any later time to view the trace:
- babeltrace $HOME/lttng-traces/mysession-<date>-<time>
- === SystemTap ===
- The "dtrace" backend uses DTrace sdt probes but has only been tested with
- SystemTap. When SystemTap support is detected a .stp file with wrapper probes
- is generated to make use in scripts more convenient. This step can also be
- performed manually after a build in order to change the binary name in the .stp
- probes:
- scripts/tracetool.py --backends=dtrace --format=stap \
- --binary path/to/qemu-binary \
- --target-type system \
- --target-name x86_64 \
- --group=all \
- trace-events-all >qemu.stp
- To facilitate simple usage of systemtap where there merely needs to be printf
- logging of certain probes, a helper script "qemu-trace-stap" is provided.
- Consult its manual page for guidance on its usage.
- == Trace event properties ==
- Each event in the "trace-events-all" file can be prefixed with a space-separated
- list of zero or more of the following event properties.
- === "disable" ===
- If a specific trace event is going to be invoked a huge number of times, this
- might have a noticeable performance impact even when the event is
- programmatically disabled.
- In this case you should declare such event with the "disable" property. This
- will effectively disable the event at compile time (by using the "nop" backend),
- thus having no performance impact at all on regular builds (i.e., unless you
- edit the "trace-events-all" file).
- In addition, there might be cases where relatively complex computations must be
- performed to generate values that are only used as arguments for a trace
- function. In these cases you can use the macro 'TRACE_${EVENT_NAME}_ENABLED' to
- guard such computations and avoid its compilation when the event is disabled:
- #include "trace.h" /* needed for trace event prototype */
-
- void *qemu_vmalloc(size_t size)
- {
- void *ptr;
- size_t align = QEMU_VMALLOC_ALIGN;
-
- if (size < align) {
- align = getpagesize();
- }
- ptr = qemu_memalign(align, size);
- if (TRACE_QEMU_VMALLOC_ENABLED) { /* preprocessor macro */
- void *complex;
- /* some complex computations to produce the 'complex' value */
- trace_qemu_vmalloc(size, ptr, complex);
- }
- return ptr;
- }
- You can check both if the event has been disabled and is dynamically enabled at
- the same time using the 'trace_event_get_state_backends' routine (see header
- "trace/control.h" for more information).
- === "tcg" ===
- Guest code generated by TCG can be traced by defining an event with the "tcg"
- event property. Internally, this property generates two events:
- "<eventname>_trans" to trace the event at translation time, and
- "<eventname>_exec" to trace the event at execution time.
- Instead of using these two events, you should instead use the function
- "trace_<eventname>_tcg" during translation (TCG code generation). This function
- will automatically call "trace_<eventname>_trans", and will generate the
- necessary TCG code to call "trace_<eventname>_exec" during guest code execution.
- Events with the "tcg" property can be declared in the "trace-events" file with a
- mix of native and TCG types, and "trace_<eventname>_tcg" will gracefully forward
- them to the "<eventname>_trans" and "<eventname>_exec" events. Since TCG values
- are not known at translation time, these are ignored by the "<eventname>_trans"
- event. Because of this, the entry in the "trace-events" file needs two printing
- formats (separated by a comma):
- tcg foo(uint8_t a1, TCGv_i32 a2) "a1=%d", "a1=%d a2=%d"
- For example:
- #include "trace-tcg.h"
-
- void some_disassembly_func (...)
- {
- uint8_t a1 = ...;
- TCGv_i32 a2 = ...;
- trace_foo_tcg(a1, a2);
- }
- This will immediately call:
- void trace_foo_trans(uint8_t a1);
- and will generate the TCG code to call:
- void trace_foo(uint8_t a1, uint32_t a2);
- === "vcpu" ===
- Identifies events that trace vCPU-specific information. It implicitly adds a
- "CPUState*" argument, and extends the tracing print format to show the vCPU
- information. If used together with the "tcg" property, it adds a second
- "TCGv_env" argument that must point to the per-target global TCG register that
- points to the vCPU when guest code is executed (usually the "cpu_env" variable).
- The "tcg" and "vcpu" properties are currently only honored in the root
- ./trace-events file.
- The following example events:
- foo(uint32_t a) "a=%x"
- vcpu bar(uint32_t a) "a=%x"
- tcg vcpu baz(uint32_t a) "a=%x", "a=%x"
- Can be used as:
- #include "trace-tcg.h"
-
- CPUArchState *env;
- TCGv_ptr cpu_env;
-
- void some_disassembly_func(...)
- {
- /* trace emitted at this point */
- trace_foo(0xd1);
- /* trace emitted at this point */
- trace_bar(env_cpu(env), 0xd2);
- /* trace emitted at this point (env) and when guest code is executed (cpu_env) */
- trace_baz_tcg(env_cpu(env), cpu_env, 0xd3);
- }
- If the translating vCPU has address 0xc1 and code is later executed by vCPU
- 0xc2, this would be an example output:
- // at guest code translation
- foo a=0xd1
- bar cpu=0xc1 a=0xd2
- baz_trans cpu=0xc1 a=0xd3
- // at guest code execution
- baz_exec cpu=0xc2 a=0xd3
|