C-2PO: Thesis About a Weakly-Relational Pointer Analysis #1485

reb-ddm · 2024-05-24T10:02:10Z

This PR introduces a weakly-relational analysis between pointers (ref: 2-pointer Logic by Seidl et al.).
The analysis can infer must-equalities and must-disequalities between terms which are built from pointer variables, with the addition of constants and dereferencing.
Moreover, it computes pairs of variables that are in distinct memory blocks, for example if they were initialized with two different calls to malloc.
An example of a property that the analysis can infer is *(x + 1) = 3 + y and a != 5 + *(y + 1).

Abstract states are represented by a congruence closure of terms over the uninterpreted function *.
Additionally, a list of disequalities and a list of block disequalities between terms is stored.

The disequalities are important during the assignment operation: when we assign a value to an address on the heap, we need to forget the information we had about all terms that may alias with this address.
In addition to the (block) disequalities inferred by this analysis, we exploit the May-Point-To analysis in order to derive additional disequalities between terms.
The use of the May-Point-To analysis can be disabled with the option ana.c2po.askbase.

Two versions of the join and two versions of the equal function are implemented.
They can be chosen via the options ana.c2po.precise_join and ana.c2po.normal_form, respectively.
Both options are disabled by default.
If precise_join is enabled, the join is calculated using the quantitative finite automaton. Otherwise, only the partition is considered for the join, which is a bit less precise.
If normal_form is enabled, the equality of two congruence closures is decided by calculating a normal form instead of comparing the equivalence classes. The normal form is computed lazily and doesn't need to be recomputed when equal is called again on the same domain element.
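
To illustrate the partition-based join, here is a minimal sketch under simplifying assumptions: terms are plain strings, each partition is given as a function from a term to its class representative, and the offsets that the real domain tracks are ignored. The names PairMap and join_partitions are hypothetical and do not appear in the PR.

```ocaml
(* Hypothetical, simplified model: a partition is given by a function that maps
   each term (here a string) to the representative of its class. Offsets, which
   the real domain also tracks, are ignored in this sketch. *)
module PairMap = Map.Make (struct
    type t = string * string
    let compare = compare
  end)

(* join_partitions terms rep1 rep2 groups the given terms such that two terms
   end up in the same class iff they are in the same class in both inputs. *)
let join_partitions (terms : string list)
    (rep1 : string -> string) (rep2 : string -> string) : string list list =
  let add_term acc t =
    (* Terms with the same pair of representatives are equal in both inputs. *)
    let key = (rep1 t, rep2 t) in
    let cls = Option.value (PairMap.find_opt key acc) ~default:[] in
    PairMap.add key (t :: cls) acc
  in
  List.fold_left add_term PairMap.empty terms
  |> PairMap.bindings
  |> List.map (fun (_, cls) -> List.sort compare cls)
```

With the terms a, b, c, d, a first partition {a, b, c}, {d} and a second partition {a, b}, {c, d}, this sketch keeps only the equality between a and b, since that is the only equality that holds in both inputs.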

The implementation of enter duplicates the parameter variables of each function in order to infer in combine which terms are related to the initial parameters.
In order to use the May-Point-To analysis also on these duplicated variables, an additional analysis startState is added.
It remembers the May-Point-To information of the duplicated variables at the beginning of the function.

Here is a more detailed description of the files that are added:

The file unionFind.ml contains the code for a quantitative union find and the quantitative finite automata.
They will be necessary in order to construct the congruence closure of terms.
The contained modules are:

T: here, the terms (e.g., *(x + 64)) and propositions (e.g., *(x + 64) = 192 + y) are defined. There are also functions to convert a CIL expression to a term and a term to a CIL expression.
The offsets in the terms are expressed in bits, therefore they are equal to the offset of the CIL expression multiplied by the size in bits of the type of the element that the variable points to.
Each term stores the information about the CIL expression that was used to create the term.
This way, it is easier to reconvert a term to a CIL expression and to get information about the type of a term.
Only the terms that are pointers or arrays or structs or 64-bit integers are considered by the analysis.
Arrays and structs are interpreted as pointers, e.g. a[3] is interpreted as the term *(a + 3).
UnionFind: the union-find data structure is defined with terms as elements.
LookupMap: this map represents the transitions of the quantitative finite automata.
Each term t is mapped to the terms *(z + t) or equal terms.
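
The idea of a quantitative union-find can be sketched as follows. This is only an illustration under stated assumptions: terms are plain strings, offsets are plain ints, the structure is immutable and does no path compression, and the names uf, find, union and offset_between are hypothetical rather than the interface of unionFind.ml.

```ocaml
(* Hypothetical, simplified model: terms are strings and offsets are ints.
   This is not the interface of unionFind.ml from the PR. *)
module SMap = Map.Make (String)

(* Parent map: t |-> (p, o) encodes the equality t = o + p.
   A term without an entry is its own representative with offset 0. *)
type uf = (string * int) SMap.t

let empty : uf = SMap.empty

(* find uf t = (r, o) such that t = o + r, where r is the representative. *)
let rec find (uf : uf) (t : string) : string * int =
  match SMap.find_opt t uf with
  | None -> (t, 0)
  | Some (p, o) ->
    let (r, o') = find uf p in
    (r, o + o')

(* union uf t1 t2 z records the equality t1 = z + t2.
   Returns None if this contradicts an equality that is already known. *)
let union (uf : uf) (t1 : string) (t2 : string) (z : int) : uf option =
  let (r1, o1) = find uf t1 in   (* t1 = o1 + r1 *)
  let (r2, o2) = find uf t2 in   (* t2 = o2 + r2 *)
  if r1 = r2 then
    if o1 = z + o2 then Some uf else None
  else
    (* o1 + r1 = z + o2 + r2, hence r1 = (z + o2 - o1) + r2 *)
    Some (SMap.add r1 (r2, z + o2 - o1) uf)

(* offset_between uf t1 t2 = Some d means that t1 = d + t2 must hold. *)
let offset_between (uf : uf) (t1 : string) (t2 : string) : int option =
  let (r1, o1) = find uf t1 in
  let (r2, o2) = find uf t2 in
  if r1 = r2 then Some (o1 - o2) else None
```

For instance, after recording x = 64 + y and y = 32 + z with union, offset_between uf "x" "z" yields Some 96, while a later attempt to record x = 8 + z is rejected with None.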

The file congruenceClosure.ml contains the data structures for the C-2PO Domain, i.e., the congruence closure, the disequalities and the block disequalities:

BlDis represents the block disequalities as a map that maps each term to a set of terms that are definitely in a different block.
Disequalities represents the disequalities as a map from a term to a map from an integer to another term.
The module contains functions to compute the closure of the disequalities that are implied by equalities, disequalities or block disequalities.
SSet: a set of terms that are currently considered by the analysis.
MRMap: maps each equivalence class to a minimal representative. This is necessary for computing the normal form.
There are functions to calculate the closure of the propositions, to insert terms, add propositions, and remove terms, as well as two methods to compute join and two methods to compute equal.
MayBeEqual: contains code to check if two terms may point to the same address or if they may overlap. It uses information from the disequalities and from MayPointTo for this purpose.
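
The interplay between block disequalities and the may-alias check during an assignment can be sketched roughly as follows. This is a simplified illustration, not the PR's code: terms are plain strings, only block disequalities are modelled (no offset-based disequalities and no MayPointTo query), and the names TermMap, TermSet, add_block_diseq, may_alias and forget_after_write are hypothetical.

```ocaml
module TermMap = Map.Make (String)
module TermSet = Set.Make (String)

(* Hypothetical model of block disequalities: each term (a string here) maps to
   the set of terms that are definitely in a different memory block. *)
type bldis = TermSet.t TermMap.t

let lookup (bl : bldis) (t : string) : TermSet.t =
  Option.value (TermMap.find_opt t bl) ~default:TermSet.empty

(* Record that t1 and t2 lie in different blocks; stored symmetrically. *)
let add_block_diseq (bl : bldis) (t1 : string) (t2 : string) : bldis =
  bl
  |> TermMap.add t1 (TermSet.add t2 (lookup bl t1))
  |> TermMap.add t2 (TermSet.add t1 (lookup bl t2))

(* Without a block disequality we must assume that the two terms may alias. *)
let may_alias (bl : bldis) (t1 : string) (t2 : string) : bool =
  not (TermSet.mem t2 (lookup bl t1))

(* After a write to the address [written], keep only the terms that certainly
   do not alias it; information about all other terms has to be forgotten. *)
let forget_after_write (bl : bldis) (written : string)
    (terms : string list) : string list =
  List.filter (fun t -> not (may_alias bl written t)) terms
```

If p and q are known to be in different blocks, a write through p keeps the information about q in this sketch, whereas an unrelated term r is dropped because nothing rules out aliasing.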

The file c2poDomain.ml defines the domain operations.

c2poAnalysis.ml contains the transfer functions for the analysis.

duplicateVars.ml contains functions for duplicating variables, which is used in enter in the C-2PO analysis and in the StartState analysis.

startStateAnalysis.ml remembers the value of each parameter at the beginning of a function. It answers the query May-Point-To for the duplicated variables and returns the initial value of the original variable.

singleThreadedLifter.ml transforms any analysis into a single-threaded analysis by returning top any time the code might be multi-threaded. This can be reused by other analyses in the future.
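
The lifting idea can be sketched as a functor. This is a toy model under loud assumptions: the signature Analysis, the functor SingleThreaded and the flag is_multithreaded are all hypothetical, and Goblint's real analysis signature is much larger and obtains the threading information differently.

```ocaml
(* Hypothetical minimal analysis signature; Goblint's real signature is much
   larger and uses different names. *)
module type Analysis = sig
  type state
  val top : state
  val transfer : state -> string -> state  (* apply one statement *)
end

module SingleThreaded (A : Analysis) = struct
  type state = A.state
  let top = A.top

  (* In this sketch the caller passes is_multithreaded explicitly; a real
     lifter would have to obtain this information from the framework. *)
  let transfer ~(is_multithreaded : bool) (s : state) (stmt : string) : state =
    if is_multithreaded then A.top else A.transfer s stmt
end

(* Toy analysis that counts executed statements; None plays the role of top. *)
module Count = struct
  type state = int option
  let top = None
  let transfer s _ = Option.map (fun n -> n + 1) s
end

module Lifted = SingleThreaded (Count)
```

Here Lifted.transfer behaves like Count.transfer as long as is_multithreaded is false and returns top (None) otherwise, so the wrapped analysis never has to reason about concurrent writes itself.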

michael-schwarz · 2024-05-24T10:33:00Z

As almost all files are new here, it should be easy to merge master and add startcontext (cf. #1427), so we can see if the CI passes then.

…fo vars. The C2PO analysis maps some varinfos to varinfos via the RichVarinfo module. The code used to read some attributes of these variables from the resulting varinfos. Thus, it required a change where all the attributes of the resulting rich varinfos can be specified. With this change, this is no longer necessary.

jerhard

I adapted the PR such that the code is a bit easier to read.

In this comment, we discussed changing the handling of function calls. I did not make these changes now, since we currently have no major further plans for this domain.
A more precise handling of function calls may still be interesting in the future.

jerhard · 2025-02-18T07:45:38Z

I don't quite see what the problem with the failing regression test is? It seems like the ordering of race warnings somehow has changed in one test case?

michael-schwarz · 2025-02-18T07:59:16Z

It may be due to me enabling warn.deterministic on master for some tests - I merged that commit in now, so let's see!

michael-schwarz · 2025-02-18T08:35:39Z

That seems to indeed have been it!

github-advanced-security bot found potential problems May 24, 2024

sim642 added the feature, student-job, and relational labels May 27, 2024

reb-ddm added 24 commits June 13, 2024 21:45

fixed division by zero error and some wrong comparisons

e8d9224

add small test for disequalities

b7d8931

properly update the disequalities after a union operation. Use the min_repr instead of the representatives in order to represent the disequalities in the normal form.

9dca5e1

solved last remaining SizeOfErrors

a0fd9a2

use correct compare function

061f2b6

add regression test for the compare function bug

fd95f86

fixed inconsistency in disequalities

8fed940

removed warning for Field on a non-compound

83bb4cf

fix issue where startstate answers queries about variables that were not created by wrpointer

0c41a16

I'm not sure what to write for the thread functions

b4b6712

catch sizeOfError

101e247

made Lookup Map with just one successor, as it was in the beginning

0182296

new method of restricting the automaton

06f31d8

ignore floats and catch an exception

8dbbe48

fix bugs with offsets

afaf409

add conf file for base analysis with which we can compare wrpointer

104360c

add tracing for get_normal_form

50f9671

Merge branch 'thesis-weakly-relational-pointer' into thesis-wrpointer-restriction

b98d18f

made sure to always just add pointers to the data structure

e758b5e

fixed bug in find of union-find

9e8c10b

properly update offset of long integers

12f64f2

add regression test for widen

d72d622

revert wrong fix

2b4fdd1

adapt test case

2135fea

github-advanced-security bot found potential problems Jun 25, 2024

jerhard added 23 commits February 14, 2025 10:18

C2PO: Improve readability of some functions in c2poanalysis.

dd4556d

C2PO: Improve readability of c2poanalysis, rename t to cc.

0fb3b80

Put M.trace in one line.

e2e2e3d

C2PO tracing: Use lower casing instead of UPPERCASING for tracing.

05e64cc

C2PO DuplicateVars: Remark todos for incremental, improve readability.

5df3b5e

C2PO DuplicateVars: Add newlines for better readability.

1c36383

C2PO: Improve readability of some functions in UnionFind.

d869801

C2PO: Fix non-termination of type comparison by using compare instead of (=).

e093e5b

C2PO: Improve readability of unionFind.

0cdd288

Fix get_representatives.

3256cf8

Rename variables in get_representatives.

6dd7315

C2PO: Define types for arguments of prop constructors.

59d9fb5

Remove outdated comment.

9fe6048

C2PO: Adapt comment about duplication, since duplicated code is modified.

bbfffe1

C2PO: Fix comment of UnionFind.find

1c667a7

C2PO: Fix comment for find_no_pc.

db77cfb

C2PO: Make evaluation of last line of remove_terms_from_bldis more clear. The code previously relied on the right-to-left evaluation order of OCaml.

0f9c8ac

Merge master into c2po.

11e3667

Due to PR goblint#1679, where offsets are now computed in bytes in the Offset Domain while C2PO calculates with bits, a helper function is introduced to perform the conversion from bytes to bits.

C2PO: Fix naming of DuplicVar.

1377a28

Merge branch 'master' into c2po

a434566

Remove not useful test case.

cd52bde

C2PO: Fix test case to incur no warnings, besides for __goblint_check.

5f9b2d7

jerhard reviewed Feb 17, 2025

jerhard approved these changes Feb 17, 2025

Merge branch 'master' into thesis-weakly-relational-pointer

3ad5c05

sim642 self-requested a review February 18, 2025 09:19
