When debugging the analyzer I normally use all of these options together:
./xgcc -B. \
  -S \
  -fanalyzer \
  OTHER_GCC_ARGS \
  -wrapper gdb,--args \
  -fdump-analyzer-stderr \
  -fanalyzer-fine-grained \
  -fdump-ipa-analyzer=stderr
where:
./xgcc -B.
is the usual way to invoke a self-built GCC from within the BUILDDIR/gcc
subdirectory.
-S
so that the driver (./xgcc) invokes cc1, but doesn’t bother
running the assembler or linker (since the analyzer runs inside cc1).
-fanalyzer
enables the analyzer, obviously.
-wrapper gdb,--args
invokes cc1 under the debugger so that I can debug cc1, set
breakpoints, and step through things.
-fdump-analyzer-stderr
so that the logging interface is enabled and goes to stderr, which often
gives valuable context into what’s happening when stepping through the
analyzer.
-fanalyzer-fine-grained
which splits the effect of every statement into its own
exploded_node, rather than the default (which tries to combine
successive stmts to reduce the size of the exploded_graph). This makes
it easier to see exactly where a particular change happens.
-fdump-ipa-analyzer=stderr
which dumps the GIMPLE IR seen by the analyzer pass to stderr.
Other useful options:
-fdump-analyzer-exploded-graph
which dumps a SRC.eg.dot GraphViz file that I can look at (with
python-xdot).
-fdump-analyzer-exploded-nodes-2
which dumps a SRC.eg.txt file containing the full exploded_graph.
Assuming that you have the python support scripts for gdb installed (which you should, since they make debugging GCC much easier), you can use:
(gdb) break-on-saved-diagnostic
to put a breakpoint at the place where a diagnostic is saved during
exploded_graph exploration, to see where a particular diagnostic
is being saved, and:
(gdb) break-on-diagnostic
to put a breakpoint at the place where diagnostics are actually emitted.
The analyzer recognizes various special functions by name, for use in debugging the analyzer, and for use in DejaGnu tests.
The declarations of these functions can be seen in the testsuite
in analyzer-decls.h. None of these functions have an actual
implementation in code; they exist only as known_function subclasses
(in gcc/analyzer/kf-analyzer.cc).
__analyzer_break
Add:
__analyzer_break ();
to the source being analyzed to trigger a breakpoint in the analyzer when that source is reached. By putting a series of these in the source, it’s much easier to effectively step through the program state as it’s analyzed.
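As a minimal sketch of bracketing a region of interest with breakpoints, the function below places one call before and one after a loop. The declaration follows the style of the testsuite's analyzer-decls.h. The NO_ANALYZER_STUBS macro and the no-op stub are assumptions added here only so that the file also links as an ordinary program; they are not part of GCC. When running under the analyzer, pass -DNO_ANALYZER_STUBS among OTHER_GCC_ARGS so that only the extern declaration remains.

```c
/* Declaration in the style of the testsuite's analyzer-decls.h.  */
extern void __analyzer_break (void);

#ifndef NO_ANALYZER_STUBS
/* Hypothetical no-op fallback (not part of GCC) so that this file
   links when built as an ordinary program.  */
void __analyzer_break (void) { }
#endif

/* Sum the integers 0 .. n-1, with analyzer breakpoints placed
   before and after the loop.  */
int
test_break (int n)
{
  int total = 0;

  __analyzer_break ();  /* analyzer state before the loop */
  for (int i = 0; i < n; i++)
    total += i;
  __analyzer_break ();  /* analyzer state after the loop */

  return total;
}
```

Combined with -wrapper gdb,--args, each call gives a point at which to inspect the analyzer's state from within gdb.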
__analyzer_describe
The analyzer handles:
__analyzer_describe (0, expr);
by emitting a warning describing the 2nd argument (which can be of any type), at a verbosity level given by the 1st argument. This is for use when debugging, and may be of use in DejaGnu tests.
__analyzer_dump
__analyzer_dump ();
will dump copious information about the analyzer’s state each time it reaches the call in its traversal of the source.
__analyzer_dump_capacity
extern void __analyzer_dump_capacity (const void *ptr);
will emit a warning describing the capacity of the base region of the region pointed to by the 1st argument.
__analyzer_dump_escaped
extern void __analyzer_dump_escaped (void);
will emit a warning giving the number of decls that have escaped on this analysis path, followed by a comma-separated list of their names, in alphabetical order.
__analyzer_dump_path
__analyzer_dump_path ();
will emit a placeholder “note” diagnostic with a path to that call site, if the analyzer finds a feasible path to it. This can be useful for writing DejaGnu tests for constraint-tracking and feasibility checking.
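A feasibility test in the style of the DejaGnu testsuite might look like the sketch below: the inner branch requires flag to be both nonzero and zero, so no note is expected there, whereas the outer call is reachable. The dg-message and dg-bogus comments show how such expectations are typically written in tests. The NO_ANALYZER_STUBS macro and the no-op stub are assumptions added only so that the file links as an ordinary program; pass -DNO_ANALYZER_STUBS when running the analyzer.

```c
/* Declaration in the style of the testsuite's analyzer-decls.h.  */
extern void __analyzer_dump_path (void);

#ifndef NO_ANALYZER_STUBS
/* Hypothetical no-op fallback (not part of GCC) so that this file
   links when built as an ordinary program.  */
void __analyzer_dump_path (void) { }
#endif

int
test_path (int flag)
{
  if (flag)
    {
      if (!flag)
        {
          /* Contradictory conditions: no feasible path is expected.  */
          __analyzer_dump_path (); /* { dg-bogus "path" } */
          return -1;
        }
      /* Reachable whenever flag is nonzero.  */
      __analyzer_dump_path (); /* { dg-message "path" } */
      return 1;
    }
  return 0;
}
```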
__analyzer_dump_exploded_nodes
For every callsite to __analyzer_dump_exploded_nodes the analyzer
will emit a warning, after it has finished the analysis, containing
information on all of the exploded nodes at that program point.
__analyzer_dump_exploded_nodes (0);
will output the number of “processed” nodes, and the IDs of both “processed” and “merger” nodes, such as:
warning: 2 processed enodes: [EN: 56, EN: 58] merger(s): [EN: 54-55, EN: 57, EN: 59]
With a non-zero argument
__analyzer_dump_exploded_nodes (1);
it will also dump all of the states within the “processed” nodes.
__analyzer_dump_named_constant
When the analyzer sees a call to __analyzer_dump_named_constant it
will emit a warning describing what is known about the value of a given
named constant, for parts of the analyzer that interact with target
headers.
For example:
__analyzer_dump_named_constant ("O_RDONLY");
might lead to the analyzer emitting the warning:
warning: named constant 'O_RDONLY' has value '1'
__analyzer_dump_region_model
__analyzer_dump_region_model ();
will dump the region_model’s state to stderr.
__analyzer_dump_state
__analyzer_dump_state ("malloc", ptr);
will emit a warning describing the state of the 2nd argument (which can be of any type) with respect to the state machine with a name matching the 1st argument (which must be a string literal). This is for use when debugging, and may be of use in DejaGnu tests.
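As a rough sketch, the function below queries the "malloc" state machine about a pointer before and after a null check. The expectation that the pointer is reported as unchecked and then as non-null is an assumption about the malloc state machine's state names, not something stated above. The NO_ANALYZER_STUBS macro and the no-op stub are likewise assumptions, added only so that the file links as an ordinary program; pass -DNO_ANALYZER_STUBS when running the analyzer.

```c
#include <stdlib.h>

/* Declaration in the style of the testsuite's analyzer-decls.h.  */
extern void __analyzer_dump_state (const char *name, ...);

#ifndef NO_ANALYZER_STUBS
/* Hypothetical no-op fallback (not part of GCC) so that this file
   links when built as an ordinary program.  */
void __analyzer_dump_state (const char *name, ...) { (void) name; }
#endif

/* Returns 1 if the allocation succeeded, 0 otherwise.  */
int
test_state (void)
{
  void *ptr = malloc (16);

  /* Assumed: the result has not yet been checked against NULL.  */
  __analyzer_dump_state ("malloc", ptr);

  if (!ptr)
    return 0;

  /* Assumed: the pointer is now known to be non-null.  */
  __analyzer_dump_state ("malloc", ptr);

  free (ptr);
  return 1;
}
```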
__analyzer_eval
__analyzer_eval (expr);
will emit a warning with text "TRUE", "FALSE" or "UNKNOWN" based on the truthfulness of the argument. This is useful for writing DejaGnu tests.
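A small test in the DejaGnu style could look like the following sketch, which exercises all three outcomes: the value of a parameter is unknown on entry, and becomes known inside a branch that tests it. The NO_ANALYZER_STUBS macro and the no-op stub are assumptions added only so that the file links as an ordinary program; pass -DNO_ANALYZER_STUBS when running the analyzer.

```c
/* Declaration in the style of the testsuite's analyzer-decls.h.  */
extern void __analyzer_eval (int);

#ifndef NO_ANALYZER_STUBS
/* Hypothetical no-op fallback (not part of GCC) so that this file
   links when built as an ordinary program.  */
void __analyzer_eval (int cond) { (void) cond; }
#endif

int
test_eval (int x)
{
  /* Nothing is known about the parameter on entry.  */
  __analyzer_eval (x == 42); /* { dg-warning "UNKNOWN" } */

  if (x == 42)
    {
      /* Inside the branch the constraint x == 42 holds.  */
      __analyzer_eval (x == 42); /* { dg-warning "TRUE" } */
      __analyzer_eval (x != 42); /* { dg-warning "FALSE" } */
      return 1;
    }
  return 0;
}
```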
__analyzer_get_unknown_ptr
__analyzer_get_unknown_ptr ();
will obtain an unknown void *.
__analyzer_get_strlen
__analyzer_get_strlen (buf);
will emit a warning if its argument doesn’t point to a null-terminated string. TODO: eventually get the strlen of the buffer (without the optimizer touching it).
To compare two different exploded graphs, try
-fdump-analyzer-exploded-nodes-2 -fdump-noaddr -fanalyzer-fine-grained.
This will dump a SRC.eg.txt file containing the full exploded_graph.
I use diff -u50 -p to compare two such files (e.g. before and after a
patch) to find the first place where the two graphs diverge. The option
-fdump-noaddr will suppress printing pointers within the dumps (which
would otherwise hide the real differences with irrelevant churn).
The option -fdump-analyzer-json will dump both the supergraph and the exploded graph in compressed JSON form.
One approach when tracking down where a particular bogus state is
introduced into the exploded_graph is to add custom code to
program_state::validate.
The debug function region::is_named_decl_p can be used when debugging,
such as for assertions and conditional breakpoints. For example, when
tracking down a bug in handling a decl called yy_buffer_stack, I
temporarily added:
gcc_assert (!m_base_region->is_named_decl_p ("yy_buffer_stack"));
to binding_cluster::mark_as_escaped to trap a point where
yy_buffer_stack was mistakenly being treated as having escaped.