Taint analysis to improve the precision loss of partial contexts #952

FelixKrayer · 2022-12-20T17:29:23Z

This PR adds a new analysis, taintPartialContexts, that tracks the lvalues modified by each function. The base, relationAnalysis.apron, condVars and varEq analyses use this information to keep the caller's values for lvalues the callee did not write. (Issue #553)

The taintPartialContexts analysis tracks a set of modified lvalues for each function, starting with the empty set when entering a function. This set can be accessed via a new query, MayBeTainted, which is used in the combine of the four analyses mentioned above to improve precision when partial contexts are used.
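To illustrate the idea of a per-function taint set, here is a minimal, self-contained sketch. It does not use Goblint's actual types: `stmt`, `transfer`, `analyze_function` and `may_be_tainted` are hypothetical names, and lvalues are modelled as plain strings.

```ocaml
(* Hypothetical toy model; none of these names are Goblint's. *)
module LvalSet = Set.Make (String)

type stmt =
  | Assign of string * int  (* lvalue := constant *)
  | Skip                    (* a statement that writes nothing *)

(* Transfer function: an assignment adds the written lvalue to the set. *)
let transfer tainted = function
  | Assign (lv, _) -> LvalSet.add lv tainted
  | Skip -> tainted

(* The set starts empty at function entry and grows along the body. *)
let analyze_function body = List.fold_left transfer LvalSet.empty body

(* Stand-in for answering a MayBeTainted-style query at the return point. *)
let may_be_tainted body = LvalSet.elements (analyze_function body)
```

Anything not in the returned list was certainly not written by the function in this toy model, which is the property the combine functions rely on.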

base: the cpa from the caller is updated:
- everything that no longer exists in the callee is removed
- everything from the callee that was not in the caller before is added (new information from the callee is always better)
- folding over the tainted set, the caller is updated lval-wise for every tainted lval

relationAnalysis.apron: only the globals and reachable locals that are in the tainted set are removed from the caller state. The result is unified with the callee state, which only makes the caller state more precise.

condVars: only untainted information is kept. The callee state is not merged.

varEq: only untainted information is kept from the caller. The callee state is merged.
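The three steps described for base can be sketched as follows. This is a simplified model under stated assumptions, not the code from this PR: the state is a map from variable names to integers, updates are per variable rather than per lvalue with offsets, and `combine_cpa` and `of_list` are hypothetical names.

```ocaml
(* Hypothetical simplification: variables map to ints, no offsets. *)
module VarMap = Map.Make (String)

let combine_cpa ~(caller : int VarMap.t) ~(callee : int VarMap.t)
    ~(tainted : string list) : int VarMap.t =
  (* 1. Drop caller entries that no longer exist in the callee state. *)
  let kept = VarMap.filter (fun v _ -> VarMap.mem v callee) caller in
  (* 2. Add callee entries that the caller did not have before. *)
  let added =
    VarMap.fold
      (fun v value acc ->
        if VarMap.mem v acc then acc else VarMap.add v value acc)
      callee kept
  in
  (* 3. For every tainted variable, take the callee's value. *)
  List.fold_left
    (fun acc v ->
      match VarMap.find_opt v callee with
      | Some value -> VarMap.add v value acc
      | None -> acc)
    added tainted

(* Small helper to build a map from an association list. *)
let of_list bindings =
  List.fold_left (fun m (v, value) -> VarMap.add v value m) VarMap.empty bindings
```

With a caller state `g=1, h=2, old=9`, a callee state `g=5, h=7, fresh=3` and a tainted set containing only `g`, the sketch keeps the caller's `h`, takes the callee's `g`, adds `fresh` and drops `old`.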

A necessary and noteworthy change is the addition of the (f_ask : Queries.ask) argument to the signature of the combine function, which allows querying the return state of the called function.
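The effect of passing an ask function for the callee into combine can be sketched like this. The types below are hypothetical stand-ins and far simpler than Goblint's `Queries.ask`; only the idea of an extra `f_ask` argument is taken from the description above.

```ocaml
(* Hypothetical stand-ins for the query type and Queries.ask. *)
type query = MayBeTainted
type ask = { f : query -> string list }

(* Toy state: an association list from variable names to values. *)
type state = (string * int) list

(* combine receives f_ask, which answers queries about the callee's
   return state, in addition to the caller and callee states. *)
let combine (caller : state) (callee : state) (f_ask : ask) : state =
  let tainted = f_ask.f MayBeTainted in
  List.map
    (fun (v, old_value) ->
      if List.mem v tainted then
        (* Possibly written by the callee: use its value if it has one. *)
        match List.assoc_opt v callee with
        | Some new_value -> (v, new_value)
        | None -> (v, old_value)
      else (v, old_value))
    caller
```

Without such an argument, combine could only query the caller's state, so the callee's taint set would not be reachable from there.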

Regression tests for each analysis have been added as well as some for specific edge cases.

michael-schwarz · 2022-12-21T07:23:55Z

Thank you for the PR! I'll most likely only get around to reviewing it in the first week of January.

michael-schwarz

After this small typo is fixed, it looks good to me!

src/analyses/abortUnless.ml

sim642

Conceptually the taintPartialContexts analysis is quite similar to our access analysis: recording accessed lvals. It probably would be possible to merge the two, but that's not necessary now.

src/framework/constraints.ml

src/analyses/base.ml

sim642 · 2023-01-09T09:07:02Z

src/analyses/base.ml

+        | Some (`Array _) when (get_string "ana.base.arrays.domain") = "partitioned" -> begin
+          (* partitioned arrays cannot be copied by individual lvalues, so if tainted just copy the whole callee value for the array variable *)
+          let new_arry_opt = CPA.find_opt v fun_st.cpa in
+          match new_arry_opt with
+          | None -> st
+          | Some new_arry -> {st with cpa = CPA.add v new_arry st.cpa}
+          end


I'm just wondering, is there anything that could be done to avoid having to special case this outside of the array domain?

FelixKrayer · 2023-01-18

It would be possible to taint the whole array when an lval inside a partitioned array is written.
However, special-casing this inside the taintPartialContexts analysis does not seem like a better approach, as there might be other analyses that can benefit from the finer taint information for partitioned arrays.

Other than that, I was not able to come up with a better idea to address this issue.

src/analyses/taintPartialContexts.ml

michael-schwarz · 2023-01-16T10:04:02Z

src/analyses/base.ml

+        if M.tracing then M.trace "taintPC" "updating %a; type: %a\n" Lval.CilLval.pretty (v, o) d_type lval_type;
+        match CPA.find_opt v (fun_st.cpa) with
+        | None -> st
+        | Some (`Array _) when (get_string "ana.base.arrays.domain") = "partitioned" -> begin


In the latest version, some arrays may be partitioned while others are not, so one would need to check for this specific array here.

FelixKrayer · 2023-01-18

I have fixed this by adding domain_of_t to the signature of ArrayDomain.S and using it to check whether the array is partitioned:
| Some (`Array a) when (CArrays.domain_of_t a) = PartitionedDomain ->

A different approach I thought of was to use ArrayDomain.get_domain:
(ArrayDomain.get_domain ~varAttr:v.vattr ~typAttr:(typeAttrs lval_type)) = PartitionedDomain

I chose the first idea as it seemed more straightforward to "ask the array what it is" instead of inspecting the attributes, but I will of course change the approach if you prefer not to add a function to the ArrayDomain.S signature.
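The "ask the array what it is" approach can be sketched in isolation as follows. This is a toy model, not Goblint's ArrayDomain: the `OtherDomain` constructor, the `ToyArrays` module and `must_copy_whole` are hypothetical, and only `domain_of_t` and `PartitionedDomain` are names taken from the comment above.

```ocaml
(* OtherDomain is a placeholder for any non-partitioned domain. *)
type domain = PartitionedDomain | OtherDomain

(* Toy counterpart of a signature that exposes domain_of_t. *)
module type S = sig
  type t
  val domain_of_t : t -> domain
end

(* Hypothetical array domain in which each value knows its own kind. *)
module ToyArrays = struct
  type t = Partitioned of int list | Plain of int

  let domain_of_t (a : t) : domain =
    match a with
    | Partitioned _ -> PartitionedDomain
    | Plain _ -> OtherDomain
end

(* Checks that ToyArrays satisfies the signature. *)
module Check : S = ToyArrays

type value = [ `Array of ToyArrays.t | `Int of int ]

(* Decide per array value, not per global option, whether the whole
   callee value has to be copied. *)
let must_copy_whole (v : value) : bool =
  match v with
  | `Array a when ToyArrays.domain_of_t a = PartitionedDomain -> true
  | _ -> false
```

The check depends only on the value at hand, so it keeps working when some arrays in a program are partitioned and others are not.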

sim642

Has there been an SV-COMP run with this analysis enabled? Given that a regression test from sv-benchmarks has been adapted, I would guess so?

michael-schwarz · 2023-02-03T13:59:27Z

Yes, here they are: results-felix-all.zip.

The first run is: base ctx_insens, no taint
The second one is: base ctx_insens, taint
The third one is: base ctx_sens, no taint

michael-schwarz · 2023-02-15T10:10:51Z

@sim642 Would you mind reviewing this at your earliest convenience? I think I will be able to build on it for detecting whether non-volatile locals have been modified since a longjmp/setjmp, for #970.

sim642

The test group should be changed from 64-taint to 65-taint, because there's 64-noreturn now. Other than that this should be good.

FelixKrayer · 2023-02-15T13:09:53Z

I rebased onto master and renamed the test directory.
Additionally, I made a small change in the base combine so that new mappings that are also tainted are not added to the callee state and then updated again in combine_st. This does not change the result but removes unnecessary computation.

sim642 · 2023-02-17T12:02:23Z

For some reason, marshaling for 65-taint/04-multithread fails on macOS: https://github.com/goblint/analyzer/actions/runs/4203078719.

FelixKrayer added student-job precision labels Dec 20, 2022

michael-schwarz linked an issue Dec 21, 2022 that may be closed by this pull request

Track written lvalues per function and use this information in Base.combine #553

Closed

FelixKrayer force-pushed the taint branch from 6b17fb9 to 700facf Compare December 21, 2022 09:28

FelixKrayer force-pushed the taint branch from 7db2adb to dd759f9 Compare January 2, 2023 18:59

michael-schwarz requested review from sim642 and michael-schwarz January 4, 2023 12:33

michael-schwarz approved these changes Jan 5, 2023

src/analyses/abortUnless.ml (outdated, resolved)

sim642 reviewed Jan 9, 2023

FelixKrayer force-pushed the taint branch from dd759f9 to 364c3a4 Compare January 10, 2023 08:24

michael-schwarz reviewed Jan 16, 2023

FelixKrayer force-pushed the taint branch 2 times, most recently from 1dcc477 to 86ac4fd Compare January 18, 2023 10:42

FelixKrayer requested a review from michael-schwarz January 18, 2023 10:42

FelixKrayer force-pushed the taint branch from 86ac4fd to 9de1305 Compare January 19, 2023 14:45

sim642 self-requested a review January 19, 2023 14:45

sim642 reviewed Feb 2, 2023

sim642 self-requested a review February 9, 2023 10:26

michael-schwarz approved these changes Feb 15, 2023

sim642 approved these changes Feb 15, 2023

FelixKrayer force-pushed the taint branch from 1f432ec to 164c9b0 Compare February 15, 2023 13:07

FelixKrayer added 5 commits February 15, 2023 14:19

Basic implementation of taint analyis

3c1b30c

included in base and apron analysis

change TaintPC to work with {Lval.CilLval} instead of {Basetype.Variable}

73b57f2

handle partDeps

cc010c3

use taint information in condVars and varEq analysis (+tests added)

ef09917

tests, fixes and commenting

f59795b

FelixKrayer added 7 commits February 15, 2023 14:19

change test folder to avoid duplicate ID

f2527e9

rebase onto master

d7e837d

implement review suggestions

b13e0ea

partitioned Array fix

9337989

there is no Context for taintPartialContexts analysis

35816ee

+ change analysis to use identity spec

fix void bug and add test

09a4052

rebase onto master, rename test folder, small change

1b2dc24

- in base, the tainted set is filtered, so that completely new values from callee are not in the tainted set and copied again

FelixKrayer force-pushed the taint branch from 164c9b0 to 1b2dc24 Compare February 15, 2023 13:24

add fask to threadJoins analysis

b663598

michael-schwarz requested review from sim642 and michael-schwarz February 15, 2023 14:05

sim642 approved these changes Feb 15, 2023

michael-schwarz approved these changes Feb 16, 2023

michael-schwarz merged commit 3fc209f into goblint:master Feb 16, 2023

michael-schwarz added a commit that referenced this pull request Feb 16, 2023

Fix indentation (#952)

fe66724

sim642 added a commit that referenced this pull request Feb 16, 2023

Fix more indentation (PR #952)

0bef63c

FelixKrayer deleted the taint branch March 1, 2023 14:31

sim642 mentioned this pull request Mar 20, 2023

Analysis of longjmp/setjmp #970

Merged

10 tasks

sim642 added this to the v2.2.0 milestone Apr 5, 2023

sim642 mentioned this pull request Sep 13, 2023

[new release] goblint (2.2.0) ocaml/opam-repository#24420

Closed

michael-schwarz mentioned this pull request Jan 30, 2024

master thesis: Taming Recursion with Three Context-Sensitive Analyses (Callstring, LoopfreeCallstring, Context Gas) #1340

Merged
