Prepare the Sources

The purpose of this chapter is to explain how to prepare the source code of an application before starting to analyze it. The main steps to perform are:

  • Finding out the preprocessing options.

    This step can either be manual (Manual Preparation) or automatic (Automatic Preparation).

    Manual preparation is the easiest way to start if you already know the commands needed to compile the source files. Otherwise, start with the automatic preparation.

  • Dealing with the external libraries.

Manual Preparation

Preprocessing

Usual pre-processing options

In the simplest cases, all the source files need the same preprocessing command. The default preprocessing command of tis-analyzer is:

clang -C -E -nostdinc -isystem TIS_KERNEL_SHARE/libc

Some more options can be added to this command with the -cpp-extra-args option. The whole command can also be specified directly with the -cpp-command option, for instance in order to use another preprocessor.
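
For instance, assuming a project whose files need an extra macro definition, or one that must be preprocessed with gcc instead of the default command, the two options could be used as follows (the macro name, include directory, and file names are purely illustrative):

$ tis-analyzer -cpp-extra-args "-DDEBUG=1" f1.c f2.c
$ tis-analyzer -cpp-command "gcc -C -E -I incl_dir" f1.c f2.c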

The -I and -isystem (to add include paths), -D (to add macro definitions), and -U (to remove macro definitions) options are provided as shortcuts to the -cpp-extra-args option.

For example, the following command can be used to run the analyzer on the f1.c, f2.c, and f3.c source files, taking the included files from the incl_dir directory:

$ tis-analyzer -I incl_dir f1.c f2.c f3.c

A specific preprocessing command can be given to selected files with the option -cpp-command-file "f1.c:clang -C -E,f2.c:gcc -C -E" (or, equivalently, -cpp-command-file "f1.c:clang -C -E" -cpp-command-file "f2.c:gcc -C -E"). In the same way, extra options can be added to the preprocessing command of selected files with the option -cpp-extra-args-file "f1.c:-Idir/".

Any file not listed in -cpp-command-file (resp. -cpp-extra-args-file) uses the global command given by the -cpp-command option (resp. the additional options given by the -cpp-extra-args option).
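
As an illustration, reusing the file names from the previous example, the following sketch gives f2.c its own gcc-based command (with the include directory passed explicitly), while f1.c and f3.c fall back to the default command extended with the global include path:

$ tis-analyzer -I incl_dir -cpp-command-file "f2.c:gcc -C -E -I incl_dir" f1.c f2.c f3.c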

If most of the source files need to have a specific preprocessing command, it is recommended to use the Automatic Preparation.

The exact pre-processing command in use can be shown by adding the command line option -kernel-msg-key pp when running the analyzer.
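
For example, to display the preprocessing command actually used for f1.c in the previous example, one may run something like:

$ tis-analyzer -kernel-msg-key pp -I incl_dir f1.c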

Advanced pre-processing options

In some applications, the source code is split into modules that require different preprocessing commands.

Warning

First of all, an important recommendation is to tackle the software in chunks that are as small as possible. This makes most of the preprocessing problems go away.

If a particular source file needs a different preprocessing command, it is better to preprocess it first. The resulting file has to be named with a .i or .ci extension so that the analyzer knows it does not need to be preprocessed. The difference between the two extensions is that .i files are not preprocessed at all by the tool, whereas macro definitions are expanded in the annotations of .ci files, which is most of the time the intended behavior. So, except in some special cases, the .ci extension should be preferred.

Source files and preprocessed files can be mixed in the command line. For instance, if the f3.c file needs some special options, f3.ci can be generated beforehand, and then used in the command line:

$ tis-analyzer -I incl_dir f1.c f2.c f3.ci

This will give the same result as the previous command, provided that f3.c has already been preprocessed into f3.ci.

Here is a synthetic example with two files, h1.c and h2.c, that use the same macro M, which needs a different definition in each file.

File h1.c:
 int x = M;
 extern int y;

 int main(void) {
   return x + y;
 }
File h2.c:
 int y = M;

If M is supposed to be 1 in h1.c and 2 in h2.c, the recommended command lines for this example are:

$ clang -C -E -nostdinc -DM=1 -o h1.tis.ci h1.c
$ clang -C -E -nostdinc -DM=2 -o h2.tis.ci h2.c

Then, the generated files can be provided to the analyzer:

$ tis-analyzer -val h1.tis.ci h2.tis.ci

The obtained result shows that M has been correctly expanded:

...
[value] Values at end of function main:
    __retres ∈ {3}
...

In more complex cases, it is better to use the Automatic Preparation.

About Libraries

Most applications use some libraries, at least the standard libc. The analyzer needs to have information about the functions that are used by the application, at least the ones that are called in the part of it which is being studied.

For the libc library, some header files come with the tool and provide specifications for many of its functions. These header files are included by default when the analyzer preprocesses the source files. However, if the preprocessing is done beforehand, the following option has to be added to the preprocessing command so that these instrumented header files are found:

-I$(tis-analyzer -print-share-path)/libc
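
For instance, a file that uses libc functions could be preprocessed beforehand with a command along the lines of the default one (the file names are illustrative):

$ clang -C -E -nostdinc -I$(tis-analyzer -print-share-path)/libc -o f1.tis.ci f1.c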

The tool also provides implementations of some libc functions that are automatically loaded. They are either C source code or internal built-in functions. The -no-tis-libc option may be used to completely ignore the tool’s library functions and header files, which can be useful, for instance, when analyzing code that provides its own libc functions.
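
For example, if the application ships with its own libc implementation (the directory layout below is purely illustrative), the analysis could be launched without any of the tool’s library files:

$ tis-analyzer -no-tis-libc -I my_libc/include my_libc/src/*.c app.c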

Another intermediate solution is to use the --custom-libc <file> option. In that case, the given source file is analyzed before the tool runtime files, which gives the opportunity to overload some of the provided C implementations. The built-in functions cannot be individually overloaded at the moment.
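
For instance, assuming the overriding C implementations are gathered in a single file named my_libc_overrides.c (a hypothetical name), the analysis could be run with:

$ tis-analyzer --custom-libc my_libc_overrides.c app.c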

To overload some header files in case something is missing, the --custom-isystem <path> option can be used. The given include path is then searched before the tool ones. In that case, a custom header xxx.h may include the corresponding tool header with:

File <path>/xxx.h:
 #include <tis-kernel/libc/xxx.h>

 // some more declarations and/or specification for <xxx.h>
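
The analysis can then be launched with the custom include directory, for instance (the path and file name are illustrative):

$ tis-analyzer --custom-isystem ./custom_include app.c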

If other external functions are used, some properties have to be provided for each of them: at a minimum, which pieces of data they may modify. See Check the External Functions to know which functions have to be specified and Write a Specification to learn how to do it.
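
As a first glimpse of what such a specification looks like (the function and variable names here are hypothetical), an ACSL contract placed before the declaration can state which data the external function may modify and what its result depends on:

 extern int device_status;

 // poll_device may update the global device_status;
 // its return value depends on that global.
 /*@ assigns device_status \from \nothing;
     assigns \result \from device_status; */
 int poll_device(void);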

First Run

At this point, the source files and the preprocessing commands should have been retrieved. It is time to try the tool for the first time, for instance by running:

tis-analyzer -metrics <..source and preprocessed files..> <..preprocessing options>

The preprocessing options are only used when source files are provided. In complex cases, it can be easier to analyze only the already preprocessed files.
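
For instance, with the files used throughout this chapter, such a first run could look like:

tis-analyzer -metrics -I incl_dir f1.c f2.c f3.ci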

Automatic Preparation

This section describes how to automatically produce a compile_commands.json file that contains instructions on how to replay the compilation process independently of the build system.

Description of the format

A compilation database is a JSON file that consists of an array of “command objects”, where each command object specifies one way a translation unit is compiled in the project.

Each command object contains the translation unit’s main file, the working directory where the compiler ran, and the actual compile command.
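
For instance, an entry describing the compilation of the f1.c file used in the earlier examples might look like the following (the directory and compiler flags are illustrative):

[
    {
        "directory": "/home/user/project",
        "command": "gcc -c -I incl_dir -o f1.o f1.c",
        "file": "f1.c"
    }
]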

See the online documentation of the compilation database format for more information.

How to produce a compile_commands.json

  • CMake (since 2.8.5) supports generation of compilation databases for Unix Makefile builds with the option CMAKE_EXPORT_COMPILE_COMMANDS.

    Usage:

    cmake <options> -DCMAKE_EXPORT_COMPILE_COMMANDS=ON <path-to-source>
    
  • For projects on Linux, an alternative is to intercept the compiler calls with a more generic tool called bear.

    Usage:

    bear <compilation_command>
    

Tip

It is recommended to use bear. It can be installed with the package manager, typically:

sudo apt install bear
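
Once bear is installed, intercepting a Makefile-based build typically looks like one of the following, depending on the installed bear version:

bear -- make
bear make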

Using the compile_commands.json

In order to use the produced compilation database, run TrustInSoft Analyzer with the following command:

tis-analyzer -compilation-database path/to/compile_commands.json ...

If a directory is given to the -compilation-database option, the analyzer scans and uses every compile_commands.json file located in that directory and its sub-directories:

tis-analyzer -compilation-database path/to/project ...

It is also possible to use compilation databases in a tis.config file for the analysis.

A possible generic template for the tis.config file is given below (see Configuration files for more information about tis.config files).

{
    "compilation_database":
    [
        "path/to/compile_commands.json"
    ],
    "files":
    [
        "path/to/file_1",
        "path/to/file_2",
        "path/to/file_N"
    ],
    "machdep": "gcc_x86_64",
    "main": "main",
    "val": true,
    "slevel-function":
    {
        "function_name": 10
    }
}

To use the tis.config file, run TrustInSoft Analyzer with the following command:

tis-analyzer -tis-config-load tis.config

Note

The tis.config file uses strict JSON syntax. A typical mistake is to leave a trailing comma after the last element of an array or object, e.g. after the line "path/to/file_N", which leads to an error.

Check Preparation

At this point, whatever method was chosen for the preparation step, you should, for instance, be able to execute:

tis-analyzer -metrics <... arguments...>

with the appropriate arguments; the analyzer should then run without errors. Using the tis-analyzer-gui command with the same arguments starts the GUI, which lets you browse through the source code, but not yet see any analysis results, since nothing has been computed at this point.

It is often useful to save the results of an analysis with:

tis-analyzer ... -save project.state > project.log

This command puts all the messages in the file project.log and saves the state of the project itself to the file project.state, so that it can be loaded later on. For instance, we can load it now in the GUI by executing:

tis-analyzer-gui -load project.state

If the application includes some special features (assembler code, etc.), or needs to be studied for a specific hardware target or with specific compiler options, please refer to Dealing with Special Features.