Populating module documentation md files
by distributing M. Brain's brief module descriptions from
the old wiki
peterschrammel committed Mar 22, 2018
1 parent 2154f54 commit 472f0ac
Showing 24 changed files with 423 additions and 544 deletions.
548 changes: 51 additions & 497 deletions doc/architectural/cprover-architecture-overview.md

Large diffs are not rendered by default.

4 changes: 4 additions & 0 deletions src/analyses/README.md
@@ -1,6 +1,10 @@
\ingroup module_hidden
\defgroup analyses analyses

# Folder analyses

This contains the abstract interpretation framework `ai.h` and several
static analyses that instantiate it.

FIXME: put here a good introduction describing what is contained
in this folder.
63 changes: 41 additions & 22 deletions src/ansi-c/README.md
@@ -2,30 +2,26 @@
\defgroup ansi-c ansi-c
# Folder ansi-c

\author Kareem Khazem
\author Kareem Khazem, Martin Brain

CodeWarrior C Compilers Reference 3.2:

http://cache.freescale.com/files/soft_dev_tools/doc/ref_manual/CCOMPILERRM.pdf
\section overview Overview

http://cache.freescale.com/files/soft_dev_tools/doc/ref_manual/ASMX86RM.pdf
Contains the front-end for ANSI C, plus a variety of common extensions.
This parses the file, performs some basic sanity checks (this is one
area in which the UI could be improved; patches most welcome) and then
produces a goto-program (see below). The parser is a traditional Flex /
Bison system.

ARM 4.1 Compiler Reference:
`internal_addition.c` contains the implementation of various ‘magic’
functions that allow control of the analysis from the source
code level. These include assertions, assumptions, atomic blocks, memory
fences and rounding modes.

http://infocenter.arm.com/help/topic/com.arm.doc.dui0491c/DUI0491C_arm_compiler_reference.pdf


Parsing performance considerations:

* Measured on trunk/regression/ansi-c/windows_h_VS_2012/main.i

* 13%: Copying into i_preprocessed

* 5%: ansi_c_parser.read()

* 53%: yyansi_clex()

* 29%: parser (without typechecking)
The `library/` subdirectory contains versions of some of the C standard
header files that make use of the CPROVER built-in functions. This
allows CPROVER programs to be ‘aware’ of the functionality and model it
correctly. Examples include `stdio.c`, `string.c`, `setjmp.c` and
various threading interfaces.

\section preprocessing Preprocessing & Parsing

@@ -48,8 +44,6 @@ digraph G {
\enddot



---
\section type-checking Type-checking

In the \ref ansi-c and \ref java_bytecode directories.
@@ -136,3 +130,28 @@ called symbols. Thus, for example:
parameter and return types of the function. The value of the symbol is
the function's body (a \ref codet), and the symbol is stored in the
symbol table with `foo` as the key.


\section performance Parsing performance considerations

* Measured on trunk/regression/ansi-c/windows_h_VS_2012/main.i

* 13%: Copying into i_preprocessed

* 5%: ansi_c_parser.read()

* 53%: yyansi_clex()

* 29%: parser (without typechecking)

\section references Compiler References

CodeWarrior C Compilers Reference 3.2:

http://cache.freescale.com/files/soft_dev_tools/doc/ref_manual/CCOMPILERRM.pdf

http://cache.freescale.com/files/soft_dev_tools/doc/ref_manual/ASMX86RM.pdf

ARM 4.1 Compiler Reference:

http://infocenter.arm.com/help/topic/com.arm.doc.dui0491c/DUI0491C_arm_compiler_reference.pdf
3 changes: 3 additions & 0 deletions src/big-int/README.md
@@ -1,7 +1,10 @@
\ingroup module_hidden
\defgroup big-int big-int

# Folder big-int

\author Martin Brain

CPROVER is distributed with its own multi-precision arithmetic library;
mainly for historical and portability reasons. The library is externally
developed and thus `big-int` contains the source as it is distributed.
7 changes: 6 additions & 1 deletion src/cbmc/README.md
@@ -2,4 +2,9 @@
\defgroup cbmc cbmc
# Folder CBMC

The CBMC handles the code related to interacting with CBMC.
This contains the first full application. CBMC is a bounded model
checker that uses the front ends (`ansi-c`, `cpp`, goto-program or
others) to create a goto-program, `goto-symex` to unwind the loops the
given number of times and to produce an equation system and finally
`solvers` to find a counter-example (technically, `goto-symex` is then
used to construct the counter-example trace).
14 changes: 13 additions & 1 deletion src/cpp/README.md
@@ -1,5 +1,17 @@
\ingroup module_hidden
\defgroup cpp cpp

# Folder cpp

The C++ Language front-end is for processing C++.
\author Martin Brain

This directory contains the C++ front-end. It supports the subset of C++
commonly found in embedded and system applications. Consequently, it
doesn’t have full support for templates and many of the more advanced
and obscure C++ features. The subset of the language that can be handled
is being extended over time so bug reports of programs that cannot be
parsed are useful.

The functionality is very similar to the ANSI C front end; parsing the
code and converting to goto-programs. It makes use of code from
`langapi` and `ansi-c`.
6 changes: 4 additions & 2 deletions src/goto-analyzer/README.md
@@ -1,6 +1,8 @@
\ingroup module_hidden
\defgroup goto-analyzer goto-analyzer

# Folder goto-analyzer

`goto-analyzer/` is a module stores information related to interacting with
goto-analyzer. These files are medium risk to change and change frequently.
`goto-analyzer/` is a tool performing static analyses on goto
programs. It provides the front end for many of the static analyses
in the \ref analyses directory.
4 changes: 4 additions & 0 deletions src/goto-cc/README.md
@@ -1,7 +1,10 @@
\ingroup module_hidden
\defgroup goto-cc goto-cc

# Folder goto-cc

\author Martin Brain

`goto-cc` is a compiler replacement that just performs the first step of
the process; converting C or C++ programs to goto-binaries. It is
intended to be dropped in to an existing build procedure in place of the
@@ -11,3 +14,4 @@ the `goto-cc/` binary. If it is called `goto-cc` then it emulates GCC
flags, `goto-armcc` emulates the ARM compiler, `goto-cl` emulates VCC
and `goto-cw` emulates the Code Warrior compiler. The output of this
tool can then be used with `cbmc` or `goto-instrument`.

6 changes: 4 additions & 2 deletions src/goto-diff/README.md
@@ -1,7 +1,9 @@
\ingroup module_hidden
\defgroup goto-diff goto-diff

# Folder goto-diff

`goto-diff/` is a tool that offers functionality similar to the `diff`
tool, but for GOTO programs.


`goto-diff/` is a module has files which change frequently and are medium
risk.
3 changes: 3 additions & 0 deletions src/goto-instrument/README.md
@@ -1,7 +1,10 @@
\ingroup module_hidden
\defgroup goto-instrument goto-instrument

# Folder goto-instrument

\author Martin Brain

The `goto-instrument/` directory contains a number of tools, one per
file, that are built into the `goto-instrument` program. All of them
take in a goto-program (produced by `goto-cc`) and either modify it or
101 changes: 100 additions & 1 deletion src/goto-programs/README.md
@@ -1,9 +1,108 @@
\ingroup module_hidden
\defgroup goto-programs goto-programs

# Folder goto-programs

\author Kareem Khazem, Martin Brain

\section overview Overview
Goto programs are the intermediate representation of the CPROVER tool
chain. They are language independent and similar to many of the compiler
intermediate languages. Section \ref goto-programs "goto-programs" describes the
`goto_programt` and `goto_functionst` data structures in detail. However,
it is useful to understand some of the basic concepts. Each function is a
list of instructions, each of which has a type (one of 18 kinds of
instruction), a code expression, a guard expression and potentially some
targets for the next instruction. They are not natively in static
single-assignment (SSA) form. Transitions are nondeterministic (although in
practice the guards on the transitions normally form a disjoint
cover of all possibilities). Local variables have non-deterministic
values if they are not initialised. Variables and data within the
program are commonly one of three types (parameterised by width):
`unsignedbv_typet`, `signedbv_typet` and `floatbv_typet`; see
`util/std_types.h` for more information. Goto programs can be serialised
in a binary (wrapped in ELF headers) format or in XML (see the various
`_serialization` files).

The `cbmc` option `--show-goto-programs` is often a good starting point
as it outputs goto-programs in a human readable form. However there are
a few things to be aware of. Functions have an internal name (for
example `c::f00`) and a ‘pretty name’ (for example `f00`); which one is
used depends on whether it is internal or being presented to the user.
The `main` method is the ‘logical’ main which is not necessarily the
main method from the code. In the output `NONDET` is used to represent a
nondeterministic assignment to a variable. Likewise `IF` is a beautified
`GOTO` instruction where the guard expression is used as the condition.
`RETURN` instructions may be dropped if they precede an `END_FUNCTION`
instruction. The comment lines are generated from the `locationt` field
of the `instructiont` structure.

`goto-programs/` is one of the few places in the CPROVER codebase where
templates are used. The intention is to allow the general architecture
of program and functions to be used for other formalisms. At the moment
most of the templates have a single instantiation; for example
`goto_functionst` and `goto_function_templatet` and `goto_programt` and
`goto_program_templatet`.

\section data_structures Data Structures

FIXME: This text is partially outdated.

The common starting point for working with goto-programs is the
`read_goto_binary` function which populates an object of
`goto_functionst` type. This is defined in `goto_functions.h` and is an
instantiation of the template `goto_functions_templatet` which is
contained in `goto_functions_template.h`. They are wrappers around a map
from strings to `goto_programt`s and iteration macros are provided.
Note that `goto_function_templatet` (no `s`) is defined in the same
header as `goto_functions_templatet` and gives the C type for the
function and a Boolean which indicates whether the body is available
(before linking this might not always be true). Also note the slightly
counter-intuitive naming; `goto_functionst` instances are the top level
structure representing the program and contain `goto_programt` instances
which represent the individual functions. At the time of writing
`goto_functionst` is the only instantiation of the template
`goto_functions_templatet` but others could be produced if different
data structures / kinds of models were needed for functions.
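
The shape described above (a map from function names to per-function
records, each carrying a type, a body and a flag saying whether the body
is available) can be sketched as follows. This is a simplified
illustration only: `function_t`, `functions_t`, `set_body` and
`missing_bodies` are hypothetical names, strings stand in for the real
type and instruction representations, and none of this is the actual
CPROVER API.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one entry of the function map.
struct function_t
{
  std::string type;              // stands in for the C type of the function
  bool body_available = false;   // may be false before linking
  std::vector<std::string> body; // stands in for the list of instructions
};

// Hypothetical stand-in for the top-level structure: name -> function.
struct functions_t
{
  std::map<std::string, function_t> function_map;
};

// Attach a body to a (possibly new) function entry.
void set_body(
  functions_t &fs,
  const std::string &name,
  std::vector<std::string> body)
{
  function_t &f = fs.function_map[name];
  assert(!f.body_available); // this sketch never overwrites a body
  f.body = std::move(body);
  f.body_available = true;
}

// Names of functions that are known but have no body (yet).
std::vector<std::string> missing_bodies(const functions_t &fs)
{
  std::vector<std::string> names;
  for(const auto &entry : fs.function_map)
    if(!entry.second.body_available)
      names.push_back(entry.first);
  return names;
}
```

A function that is only declared shows up in the map with
`body_available == false`, which is the situation the text describes as
possible before linking.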

`goto_programt` is also an instantiation of a template. In a similar
fashion it is `goto_program_templatet` and allows the types of the guard
and expression used in instructions to be parameterised. Again, this is
currently the only use of the template. As such there are only really
helper functions in `goto_program.h` and thus `goto_program_template.h`
is probably the key file that describes the representation of (C)
functions in the goto-program format. It is reasonably stable and
reasonably documented and thus is a good place to start looking at the
code.

An instance of `goto_program_templatet` is effectively a list of
instructions (an inner template called `instructiont`). It is important
to use the copy and insertion functions that are provided as iterators
are used to link instructions to their predecessors and targets and
careless manipulation of the list could break these. Likewise there are
helper macros for iterating over the instructions in an instance of
`goto_program_templatet` and the use of these is good style and strongly
encouraged.

Individual instructions are instances of type `instructiont`. They
represent one step in the function. Each has a type, an instance of
`goto_program_instruction_typet` which denotes what kind of instruction
it is. They can be computational (such as `ASSIGN` or `FUNCTION_CALL`),
logical (such as `ASSUME` and `ASSERT`) or informational (such as
`LOCATION` and `DEAD`). At the time of writing there are 18 possible
values for `goto_program_instruction_typet` / kinds of instruction.
Instructions also have a guard field (the condition under which it is
executed) and a code field (what the instruction does). These may be
empty depending on the kind of instruction. In the default
instantiations these are of type `exprt` and `codet` respectively and
thus covered by the previous discussion of `irept` and its descendants.
The next instructions (remembering that transitions are guarded and
non-deterministic) are given by the list `targets` (with the
corresponding list of labels `labels`) and the corresponding set of
previous instructions is given by `incoming_edges`. Finally, `instructiont`
instances have informational `function` and `location` fields that indicate
where they are in the code.
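
To make the instruction layout more concrete, here is a minimal sketch
of a list of instructions with a kind, a guard, a code field and
explicit targets. It is an illustration under stated assumptions, not
the real classes: `kind_t`, `instr_t`, `program_t`, `successors` and
`build_example` are hypothetical names, strings stand in for `exprt` and
`codet`, only a handful of instruction kinds are modelled, and plain
pointers into the list stand in for the iterators mentioned above.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>
#include <utility>
#include <vector>

// A few of the instruction kinds named in the text (not all of them).
enum class kind_t { ASSIGN, ASSUME, ASSERT, GOTO, END_FUNCTION };

struct instr_t
{
  kind_t kind;
  std::string guard;                    // stand-in for exprt; may be empty
  std::string code;                     // stand-in for codet; may be empty
  std::vector<const instr_t *> targets; // explicit successors (GOTO only)
  std::string location;                 // informational source location
};

struct program_t
{
  // std::list keeps element addresses stable, so targets stay valid
  // while further instructions are appended.
  std::list<instr_t> instructions;

  instr_t &add(kind_t kind, std::string guard, std::string code)
  {
    instructions.push_back(
      instr_t{kind, std::move(guard), std::move(code), {}, ""});
    return instructions.back();
  }
};

// Instructions that control may reach next: the explicit targets plus,
// unless this is an unconditional GOTO, the fall-through successor.
std::vector<const instr_t *> successors(const program_t &p, const instr_t &i)
{
  std::vector<const instr_t *> result(i.targets.begin(), i.targets.end());
  const bool unconditional = i.kind == kind_t::GOTO && i.guard == "true";
  if(!unconditional)
  {
    for(auto it = p.instructions.begin(); it != p.instructions.end(); ++it)
      if(&*it == &i && std::next(it) != p.instructions.end())
        result.push_back(&*std::next(it));
  }
  return result;
}

// x = NONDET; IF x > 0 GOTO end; ASSERT x <= 0; end: END_FUNCTION
void build_example(program_t &p)
{
  assert(p.instructions.empty());
  p.add(kind_t::ASSIGN, "", "x = NONDET");
  instr_t &branch = p.add(kind_t::GOTO, "x > 0", "");
  p.add(kind_t::ASSERT, "x <= 0", "");
  instr_t &end = p.add(kind_t::END_FUNCTION, "", "");
  branch.targets.push_back(&end);
}
```

In this example the conditional `GOTO` has two successors, its target
and the fall-through `ASSERT`, which is the sense in which the next
instruction is chosen by the guard.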

\author Kareem Khazem

\section goto-conversion Goto Conversion

20 changes: 18 additions & 2 deletions src/goto-symex/README.md
@@ -1,9 +1,25 @@
\ingroup module_hidden
\defgroup goto-symex goto-symex
# Folder goto-symex

# Folder goto-symex

\author Kareem Khazem
\author Kareem Khazem, Martin Brain

This directory contains a symbolic evaluation system for goto-programs.
This takes a goto-program and translates it to an equation system by
traversing the program, branching and merging and unwinding loops as
needed. Each reverse goto has a separate counter (the actual counting is
handled by `cbmc`, see the `--unwind` and `--unwind-set` options). When a
counter limit is reached, an assertion can be added to explicitly show
when analysis is incomplete. The symbolic execution includes constant
folding so loops that have a constant number of iterations will be
handled completely (assuming the unwinding limit is sufficient).
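
As a rough illustration of bounded unwinding, the sketch below rewrites
a loop `while(cond) body` into nested `if` statements and ends with a
check on the loop condition once the limit is reached. It works on
plain strings and is not how `goto-symex` is implemented; the function
name `unwind_loop` is hypothetical, and emitting an `assume` when no
unwinding assertion is requested is an assumption of this sketch.

```cpp
#include <cassert>
#include <string>

// Unwind `while(cond) body` at most `limit` times (textual sketch).
// With unwinding_assertion set, the innermost statement asserts that
// the loop condition no longer holds, so an insufficient limit shows
// up as a failed assertion instead of silently truncated paths.
std::string unwind_loop(
  const std::string &cond,
  const std::string &body,
  unsigned limit,
  bool unwinding_assertion)
{
  assert(!cond.empty());
  const std::string negated = "!(" + cond + ")";
  std::string result = unwinding_assertion ? "assert(" + negated + ");"
                                           : "assume(" + negated + ");";
  // Wrap one more iteration around the result for each unwinding step.
  for(unsigned i = 0; i < limit; ++i)
    result = "if(" + cond + ") { " + body + " " + result + " }";
  return result;
}
```

For a limit of 2, the loop `while(i < n) i++;` becomes two nested
`if(i < n)` blocks followed by `assert(!(i < n));` in the innermost one.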

The output of the symbolic execution is a system of equations; an object
containing a list of `symex_target_elements`, each of which is an
equality between `expr` expressions. See `symex_target_equation.h`.
The output is in static single assignment (SSA) form, which is *not*
the case for goto-programs.
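
The difference between the two forms can be shown with a small renaming
pass over straight-line assignments: every assignment creates a fresh
version of its left-hand side, and every use refers to the latest
version. This is a sketch only; `assignment_t`, `to_ssa` and the
`name#version` naming scheme are assumptions made for the example and
do not reflect the identifiers that `goto-symex` actually generates.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <string>
#include <vector>

// One assignment `lhs = rhs`, with the right-hand side pre-tokenised.
struct assignment_t
{
  std::string lhs;
  std::vector<std::string> rhs; // e.g. {"x", "+", "1"}
};

// Rename a straight-line sequence of assignments into SSA equalities.
// Tokens starting with a letter are treated as variables.
std::vector<std::string> to_ssa(const std::vector<assignment_t> &program)
{
  std::map<std::string, unsigned> version; // latest version per variable
  std::vector<std::string> equations;
  for(const auto &a : program)
  {
    std::string rhs;
    for(const auto &tok : a.rhs)
    {
      assert(!tok.empty());
      if(!rhs.empty())
        rhs += ' ';
      const bool is_var = std::isalpha(static_cast<unsigned char>(tok[0])) != 0;
      // Uses read the current version (0 if never assigned).
      rhs += is_var ? tok + "#" + std::to_string(version[tok]) : tok;
    }
    // The assignment itself defines a new version of the left-hand side.
    const unsigned v = ++version[a.lhs];
    equations.push_back(a.lhs + "#" + std::to_string(v) + " == " + rhs);
  }
  return equations;
}
```

Applied to `x = x + 1; y = x * 2; x = y`, each variable version is
assigned exactly once, so the result can be read as a system of
equalities.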

\section symbolic-execution Symbolic Execution

3 changes: 2 additions & 1 deletion src/java_bytecode/README.md
@@ -1,6 +1,7 @@
\ingroup module_hidden
\defgroup java_bytecode java_bytecode

# Folder java_bytecode


This module provide a front end for Java.
This module provides a bytecode-based front end for Java.
4 changes: 2 additions & 2 deletions src/jsil/README.md
@@ -1,6 +1,6 @@
\ingroup module_hidden
\defgroup jsil jsil
# Folder jsil

# Folder jsil

`jsil/` is a module that focuses on type checking.
`jsil/` contains a JavaScript front end.
2 changes: 1 addition & 1 deletion src/json/README.md
@@ -2,4 +2,4 @@
\defgroup json json
# Folder json

`json/` is a utility that processes json.
`json/` contains a JSON parser.
8 changes: 7 additions & 1 deletion src/langapi/README.md
@@ -1,6 +1,12 @@
\ingroup module_hidden
\defgroup langapi langapi

# Folder langapi

\author Martin Brain

`langapi/` is a language front end.
`langapi/` contains the basic interfaces and support classes for programming
language front ends. Developers only really need look at this if they
are adding support for a new language. Its main users are the
language front-ends such as `ansi-c/` and
`cpp/`.
8 changes: 7 additions & 1 deletion src/linking/README.md
@@ -1,5 +1,11 @@
\ingroup module_hidden
\defgroup linking linking

# Folder linking

linking docs: todo
\author Martin Brain

This allows multiple ‘object
files’ (goto-programs) to be linked into one ‘executable’ (another
goto-program), thus allowing existing build systems to be used to build
complete goto-program binaries.
4 changes: 2 additions & 2 deletions src/memory-models/README.md
@@ -1,6 +1,6 @@
\ingroup module_hidden
\defgroup memory-models memory-models
# Folder memory-models

# Folder memory-models

`memory-models` is a tool that works with memory.
`memory-models` contains tools related to weak memory models.
4 changes: 2 additions & 2 deletions src/miniz/README.md
@@ -1,6 +1,6 @@
\ingroup module_hidden
\defgroup miniz miniz
Folder miniz

# Folder miniz

`miniz/` is a utility for minimizing things.
`miniz/` contains a minimal ZIP compression library.