Debugging the Analyzer (GNU Compiler Collection (GCC) Internals)


27.2 Debugging the Analyzer

When debugging the analyzer I normally use all of these options together:

./xgcc -B. \
  -S \
  -fanalyzer \
  OTHER_GCC_ARGS \
  -wrapper gdb,--args \
  -fdump-analyzer-stderr \
  -fanalyzer-fine-grained \
  -fdump-ipa-analyzer=stderr

where:

./xgcc -B. is the usual way to invoke a self-built GCC from within the BUILDDIR/gcc subdirectory.
-S so that the driver (./xgcc) invokes cc1, but doesn’t bother running the assembler or linker (since the analyzer runs inside cc1).
-fanalyzer enables the analyzer, obviously.
-wrapper gdb,--args invokes cc1 under the debugger so that I can debug cc1 and set breakpoints and step through things.
-fdump-analyzer-stderr so that the logging interface is enabled and goes to stderr, which often gives valuable context about what’s happening when stepping through the analyzer.
-fanalyzer-fine-grained which splits the effect of every statement into its own exploded_node, rather than the default (which tries to combine successive stmts to reduce the size of the exploded_graph). This makes it easier to see exactly where a particular change happens.
-fdump-ipa-analyzer=stderr which dumps the GIMPLE IR seen by the analyzer pass to stderr.

Other useful options:

-fdump-analyzer-exploded-graph which dumps a SRC.eg.dot GraphViz file that I can look at (with python-xdot).
-fdump-analyzer-exploded-nodes-2 which dumps a SRC.eg.txt file containing the full exploded_graph.

Assuming that you have the python support scripts for gdb installed (which you should, since they make debugging GCC much easier), you can use:

(gdb) break-on-saved-diagnostic

to put a breakpoint at the place where a diagnostic is saved during exploded_graph exploration, which shows where a particular diagnostic originates, and:

(gdb) break-on-diagnostic

to put a breakpoint at the place where diagnostics are actually emitted.

Special Functions for Debugging the Analyzer
Other Debugging Techniques

27.2.1 Special Functions for Debugging the Analyzer

The analyzer recognizes various special functions by name, for use in debugging the analyzer, and for use in DejaGnu tests.

The declarations of these functions can be seen in the testsuite in analyzer-decls.h. None of these functions has a real implementation; they exist only as known_function subclasses (in gcc/analyzer/kf-analyzer.cc).

__analyzer_break

Add:

  __analyzer_break ();

to the source being analyzed to trigger a breakpoint in the analyzer when that source is reached. By putting a series of these in the source, it’s much easier to effectively step through the program state as it’s analyzed.
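As a rough sketch of that workflow, the following C file places a marker before and after the step of interest. The declaration is what I believe analyzer-decls.h provides. The UNDER_ANALYZER macro and the function sum_to are hypothetical, introduced only so that the file also builds and runs as an ordinary program when the analyzer is not in use.

```c
/* Hypothetical guard macro (not part of GCC): compile with
   -fanalyzer -DUNDER_ANALYZER so that the analyzer sees only a
   declaration.  Without it, a no-op stub lets the file link and run.  */
#ifdef UNDER_ANALYZER
extern void __analyzer_break (void);
#else
static void __analyzer_break (void) { }
#endif

/* Hypothetical function under investigation: sums the integers 1..n.  */
int sum_to (int n)
{
  int total = 0;
  __analyzer_break ();   /* marker before the loop is analyzed */
  for (int i = 1; i <= n; i++)
    total += i;
  __analyzer_break ();   /* marker once the loop exit is reached */
  return total;
}
```

When cc1 runs under gdb, each marker should stop the debugger at the point where the analyzer processes that call, so the program state can be inspected between markers.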

__analyzer_describe

The analyzer handles:

__analyzer_describe (0, expr);

by emitting a warning describing the 2nd argument (which can be of any type), at a verbosity level given by the 1st argument. This is for use when debugging, and may be of use in DejaGnu tests.

__analyzer_dump

__analyzer_dump ();

will dump copious information about the analyzer’s state each time it reaches the call in its traversal of the source.

__analyzer_dump_capacity

extern void __analyzer_dump_capacity (const void *ptr);

will emit a warning describing the capacity of the base region of the region pointed to by the 1st argument.

__analyzer_dump_escaped

extern void __analyzer_dump_escaped (void);

will emit a warning giving the number of decls that have escaped on this analysis path, followed by a comma-separated list of their names, in alphabetical order.

__analyzer_dump_path

__analyzer_dump_path ();

will emit a placeholder “note” diagnostic with a path to that call site, if the analyzer finds a feasible path to it. This can be useful for writing DejaGnu tests for constraint-tracking and feasibility checking.
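A minimal sketch of a feasibility check along these lines follows. The function classify and the UNDER_ANALYZER guard macro are hypothetical. The comments describe what I would expect the analyzer to report, and should be checked against your own build.

```c
/* Hypothetical guard macro (not part of GCC): compile with
   -fanalyzer -DUNDER_ANALYZER so that the analyzer sees only a
   declaration.  Without it, a no-op stub lets the file link and run.  */
#ifdef UNDER_ANALYZER
extern void __analyzer_dump_path (void);
#else
static void __analyzer_dump_path (void) { }
#endif

/* Hypothetical function: returns 1 if x is 1, otherwise 0.  */
int classify (int x)
{
  if (x == 1)
    {
      if (x == 2)
        {
          /* x cannot be both 1 and 2, so the analyzer should find no
             feasible path to this call and emit no note here.  */
          __analyzer_dump_path ();
          return -1;
        }
      /* Feasible whenever x == 1, so a note is expected here.  */
      __analyzer_dump_path ();
      return 1;
    }
  return 0;
}
```

In a DejaGnu test, the second call site would carry the directive that expects the note, while the first would carry none.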

__analyzer_dump_exploded_nodes

For every callsite to __analyzer_dump_exploded_nodes the analyzer will emit a warning, after it has finished the analysis, containing information on all of the exploded nodes at that program point.

  __analyzer_dump_exploded_nodes (0);

will output the number of “processed” nodes, and the IDs of both “processed” and “merger” nodes, such as:

warning: 2 processed enodes: [EN: 56, EN: 58] merger(s): [EN: 54-55, EN: 57, EN: 59]

With a non-zero argument

  __analyzer_dump_exploded_nodes (1);

it will also dump all of the states within the “processed” nodes.

__analyzer_dump_named_constant

When the analyzer sees a call to __analyzer_dump_named_constant it will emit a warning describing what is known about the value of a given named constant, for parts of the analyzer that interact with target headers.

For example:

__analyzer_dump_named_constant ("O_RDONLY");

might lead to the analyzer emitting the warning:

warning: named constant 'O_RDONLY' has value '1'

__analyzer_dump_region_model

   __analyzer_dump_region_model ();

will dump the region_model’s state to stderr.

__analyzer_dump_state

__analyzer_dump_state ("malloc", ptr);

will emit a warning describing the state of the 2nd argument (which can be of any type) with respect to the state machine with a name matching the 1st argument (which must be a string literal). This is for use when debugging, and may be of use in DejaGnu tests.
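For illustration, here is a small sketch that queries the "malloc" state machine at two points in a pointer's lifetime. The variadic declaration is my understanding of analyzer-decls.h, and the state names in the comments ('unchecked', 'nonnull') are what I would expect from the malloc state machine. Treat both as assumptions to verify. The use_buffer function and the UNDER_ANALYZER guard macro are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical guard macro (not part of GCC): compile with
   -fanalyzer -DUNDER_ANALYZER so that the analyzer sees only a
   declaration.  Without it, a no-op stub lets the file link and run.  */
#ifdef UNDER_ANALYZER
extern void __analyzer_dump_state (const char *name, ...);
#else
static void __analyzer_dump_state (const char *name, ...) { (void) name; }
#endif

/* Hypothetical function: returns 1 on success, -1 if malloc fails.  */
int use_buffer (void)
{
  char *p = malloc (16);
  /* The result has not been tested yet: probably 'unchecked'.  */
  __analyzer_dump_state ("malloc", p);
  if (!p)
    return -1;
  /* On this path the NULL check has passed: probably 'nonnull'.  */
  __analyzer_dump_state ("malloc", p);
  p[0] = 'x';
  int ok = (p[0] == 'x');
  free (p);
  return ok;
}
```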

__analyzer_eval

__analyzer_eval (expr);

will emit a warning with text "TRUE", "FALSE" or "UNKNOWN" based on the truthfulness of the argument. This is useful for writing DejaGnu tests.
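As a sketch of how such a test might look, the example below checks what the analyzer knows about x before and after a comparison. The dg-warning comments follow the usual DejaGnu directive style. The is_answer function and the UNDER_ANALYZER guard macro are hypothetical, the latter added only so the file can also be built and run without the analyzer.

```c
/* Hypothetical guard macro (not part of GCC): compile with
   -fanalyzer -DUNDER_ANALYZER so that the analyzer sees only a
   declaration.  Without it, a no-op stub lets the file link and run.  */
#ifdef UNDER_ANALYZER
extern void __analyzer_eval (int);
#else
static void __analyzer_eval (int cond) { (void) cond; }
#endif

/* Hypothetical function: returns 1 if x is 42, otherwise 0.  */
int is_answer (int x)
{
  /* Nothing is known about the parameter yet.  */
  __analyzer_eval (x == 42); /* { dg-warning "UNKNOWN" } */
  if (x == 42)
    {
      /* On the true edge the constraint x == 42 has been recorded.  */
      __analyzer_eval (x == 42); /* { dg-warning "TRUE" } */
      __analyzer_eval (x != 42); /* { dg-warning "FALSE" } */
      return 1;
    }
  /* On the false edge the opposite constraint holds.  */
  __analyzer_eval (x != 42); /* { dg-warning "TRUE" } */
  return 0;
}
```

Placing each call inside a single branch keeps the expected result unambiguous; a call placed after the branches rejoin may be reached along more than one path.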

__analyzer_get_unknown_ptr

__analyzer_get_unknown_ptr ();

will obtain an unknown void *.

__analyzer_get_strlen

__analyzer_get_strlen (buf);

will emit a warning if buf doesn’t point to a null-terminated string. TODO: eventually get the strlen of the buffer (without the optimizer touching it).

27.2.2 Other Debugging Techniques

To compare two different exploded graphs, try -fdump-analyzer-exploded-nodes-2 -fdump-noaddr -fanalyzer-fine-grained. This will dump a SRC.eg.txt file containing the full exploded_graph. I use diff -u50 -p to compare two such files (e.g. before and after a patch) to find the first place where the two graphs diverge. The option -fdump-noaddr will suppress printing pointers within the dumps (which would otherwise hide the real differences with irrelevant churn).

The option -fdump-analyzer-json will dump both the supergraph and the exploded graph in compressed JSON form.

One approach when tracking down where a particular bogus state is introduced into the exploded_graph is to add custom code to program_state::validate.

The debug function region::is_named_decl_p can be used when debugging, such as for assertions and conditional breakpoints. For example, when tracking down a bug in handling a decl called yy_buffer_stack, I temporarily added a:

  gcc_assert (!m_base_region->is_named_decl_p ("yy_buffer_stack"));

to binding_cluster::mark_as_escaped to trap a point where yy_buffer_stack was mistakenly being treated as having escaped.