-
Notifications
You must be signed in to change notification settings - Fork 273
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Abstract interpretation framework: allow selective domain retention #2572
Abstract interpretation framework: allow selective domain retention #2572
Conversation
@tautschnig @peterschrammel @thk123 this definitely needs a unit test, since it is not yet exercised in cbmc, but the implementation is complete and ready for review. |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR failed Diffblue compatibility checks (cbmc commit: 28c372d).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/78599041
Status will be re-evaluated on next push.
Please contact @peterschrammel, @thk123, or @allredj for support.
Common spurious failures:
- the cbmc commit has disappeared in the mean time (e.g. in a force-push)
- the author is not in the list of contributors (e.g. first-time contributors).
Will submit tests with this one then followup for constant prop (that patch will literally just pass through the predicate argument, since different constant prop users have different needs) I wouldn't say this does summarisation, rather it lets you tell ait that you don't intend to make full use of the random access (by program point) interface it provides, therefore it can optimise to avoid storing those states you're not interested in if doing so isn't important for whatever fix point algorithm it uses. A user who was generating summaries would presumably analyse a function at a time and pass a predicate saying they only care about the end instruction. Other cases I've had need for so far:
|
Interesting and worthwhile although I think it will break the patch set
that I've been working on for a while now to be able to add history
sensitivity. It's a little late for me to try the integration now but
would it be possible for us to work on getting both working together?
|
Got a link to a branch? |
I've got some alarm bells going off here for me: given that this field has seen decades of research, is this an implementation of something that has been published before, or is it possibly leading to a publication? Or should the solution much rather be in using sharing maps that @danpoe is working on? |
I'd be very surprised if you could publish what to my mind is a fairly obvious optimisation. Dan's sharing maps are a great tool for when you do want to be able to access the result at every program point, and the program point results are closely related such that you can say "result at loc n+1 = result at loc n WITH x <- y" some or most of the time. On the other hand if you just don't need a result at most program points you can skip much of the complexity and do it like this. Re: published before, I think you could argue goto-symex is using a pretty similar idea (working in-place for the most part and copying it only when necessary) apart from the fact that it doesn't cater for back-edges due to its unrolling approach. |
@smowton : working on it but running SC^2 today and presenting at SMT tomorrow so it might not be today. @tautschnig : as I see it, sharing maps are orthogonal to this optimisation. They help reduce the amount of memory used to store domains; this reduces the number of domains stored. As for publication ... if you know of a good source of published material on the actual practicalities of implementing an abstract interpreter I'd be interested to read it. The thing I've been working on I am hoping to publish. |
Typically, an abstract interpreter stores the abstract state only at CFG cut points (eg loop heads and call sites of recursive functions), minimally required to decide when you have reached the fixed point (we don't have that optimisation right now - maybe @martin-cs has it in his implementation). Additionally, you might want to remember the abstract state in certain locations of interest (I think that's the intention of @smowton). |
I somewhat approximate the cut points as "anything with multiple predecessors" because we lack a loop analysis, but otherwise yes that's what I'm doing. |
28c372d
to
6b7b504
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR failed Diffblue compatibility checks (cbmc commit: 6b7b504).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/78711776
Status will be re-evaluated on next push.
Please contact @peterschrammel, @thk123, or @allredj for support.
Common spurious failures:
- the cbmc commit has disappeared in the mean time (e.g. in a force-push)
- the author is not in the list of contributors (e.g. first-time contributors).
6b7b504
to
d7b4b91
Compare
My hesitation remains: the code and interface are becoming more complicated, and that's in a way that is non-standard. If adopting the terminology of "cut points" some of this is remedied. With regard to @martin-cs' comment: yes, I acknowledge that there is a difference between number-of-domains-stored vs amount-of-storage-required-for-domains, but the net effect to the end user isn't actually different? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR failed Diffblue compatibility checks (cbmc commit: d7b4b91).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/78716828
Status will be re-evaluated on next push.
Please contact @peterschrammel, @thk123, or @allredj for support.
Common spurious failures:
- the cbmc commit has disappeared in the mean time (e.g. in a force-push)
- the author is not in the list of contributors (e.g. first-time contributors).
d7b4b91
to
82f11c9
Compare
This allows users of the `ait` abstract interpretation framework to specify that only certain states are needed after its fixpoint computation is finished. For example, perhaps a particular analysis, or a particular user of that analysis, only needs to know the results at END_FUNCTION instructions. In this case when a state does not need to be retained, and only has a single successor, that state will be altered in place and passed to its successor instruction using std::move, permitting an analysis to optimise away much (potentially all) state copying. In the best case (a long, straight-line function whose state grows with every instruction, such as a constant propagator reading "int x = 1; int y = 2; int z = 3; ...") this reduces analysis space and time complexity from quadratic to linear.
82f11c9
to
a87ef74
Compare
@peterschrammel @tautschnig @martin-cs updated this to (a) deal with the interprocedural edges correctly, and (b) include a reasonably thorough unit test. As structured it should be unable to break any existing users, since per default it retains all states. @tautschnig I'm overapproximating what constitutes a cut-point, but I can add comments to that effect if you like? Re: sharing-map vs. this solution, this is just easier to implement, both for me and for the author / user of an @peterschrammel I'm actually retaining state even at non-loop convergences (e.g. an if/else convergence) because I can't generally depend on instructions being visited in toplogical order due to backward GOTOs that aren't loop back-edges -- is there a formal way to describe that policy? I was thinking of invariants that need to be true for this to work: I think I just need that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR failed Diffblue compatibility checks (cbmc commit: 82f11c9).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/78722174
Status will be re-evaluated on next push.
Please contact @peterschrammel, @thk123, or @allredj for support.
Common spurious failures:
- the cbmc commit has disappeared in the mean time (e.g. in a force-push)
- the author is not in the list of contributors (e.g. first-time contributors).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR failed Diffblue compatibility checks (cbmc commit: a87ef74).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/78722466
Status will be re-evaluated on next push.
Please contact @peterschrammel, @thk123, or @allredj for support.
Common spurious failures:
- the cbmc commit has disappeared in the mean time (e.g. in a force-push)
- the author is not in the list of contributors (e.g. first-time contributors).
(If @tautschnig and I have reached an impasse, @peterschrammel @martin-cs could you provide tie-breaking opinions?) |
To break dependencies and avoid cluttering of existing interfaces, would it be possible to introduce your optimisation as an extra layer (ie extend rather than modify) so that existing analyses may derive from the interface that offers the optimisation or the original one? |
On Thu, 2018-07-12 at 14:58 +0000, Chris Smowton wrote:
(If @tautschnig and I have reached an impasse, @peterschrammel
@martin-cs could you provide tie-breaking opinions?)
@peterschrammel and I discussed this at lunch time. From my point of
view we concluded the following:
*. We need to be careful with the interface; we don't want to wind up
with ait in the same mess as bmct.
*. The functionality is useful and worth having but must be optional.
*. It is not clear what the best way of integrating this is.
I think I volunteered (now I am not running a workshop) to try to
integrate this in my patch set.
|
@peterschrammel @martin-cs This functionality is already optional and default-off: the must-retain-state predicate is supplied as a constructor parameter that defaults to null (== retain all states, old behaviour). Implementing a move-merge operator on your domain is also optional (and orthogonal -- the combinations (move-merge, retain-all-states) and (copy-merge, retain-some-states) are both faster than the default (copy-merge, retain-all-states). |
@peterschrammel I'm not convinced inheriting from |
@smowton : I realise this is time critical / blocking for you but I have run out of time this weekend to work on this. I'm going to start skipping sessions tomorrow until I have some code for you to look at. |
Thanks @peterschrammel this is on my immediate TODO anyway... |
@martin-cs had any chance to work on history-sensitivity lately? |
On Wed, 2019-03-06 at 16:10 +0000, Chris Smowton wrote:
@martin-cs had any chance to work on history-sensitivity lately?
Actually yes! More time than I have had to catch up on my CPROVER
correspondence any way! Getting the history framework done and merged
is now my primary development task. @tautschnig has been very helpful
in shaping my thoughts and I think I now have a way of getting all of
the benefits of dropping domains but also allowing on-the-fly
recomputation so the that users of ait are none the wiser. I'm going
to refactor the current history sensitive framework over the next
little while and was then thinking I might ping you, Peter, Daniel P,
Chris R and Thomas for opinions.
|
Closing this PR as it appears that no conclusion on how to progress/merge this work was both reached and followed through upon. If you believe this is erroneous please reopen and update. |
Just for conclusion. The history patches did get merged and there is #2572 (comment) interface in the storage. To implement what @chrisr-diffblue originally asked for, one would simply need to:
If you prune every location apart from the last then you will get the biggest saving without loosing accuracy. If you prune every location with one inbound edge (apart from first and last) then you should avoid recomputing anything if it is called again. If you are feeling really really fancy then you can hook |
This allows users of the
ait
abstract interpretation framework to specify that only certain states are needed afterits fixpoint computation is finished. For example, perhaps a particular analysis, or a particular user of that analysis,
only needs to know the results at END_FUNCTION instructions. In this case when a state does not need to be retained, and
only has a single successor, that state will be altered in place and passed to its successor instruction using std::move,
permitting an analysis to optimise away much (potentially all) state copying.
In the best case (a long, straight-line function whose state grows with every instruction, such as a constant propagator
reading "int x = 1; int y = 2; int z = 3; ...") this reduces analysis space and time complexity from quadratic to linear.