-
Notifications
You must be signed in to change notification settings - Fork 273
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[depends: #472] Variable Sensitivity Domain #708
[depends: #472] Variable Sensitivity Domain #708
Conversation
Before this gets merged, @martin-cs wanted to modify how the command line arguments are handled to mean we don't have to break backwards compatibility when more sensitivities are added. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Preliminary review of changes to relevant files for first PR.
|
||
#include "abstract_enviroment.h" | ||
|
||
#define DEBUG |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't #define debug
|
||
#define DEBUG | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Too many empty lines.
\*******************************************************************/ | ||
|
||
abstract_object_pointert abstract_environmentt::eval( | ||
const exprt &e, const namespacet &ns) const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use single letter variable names.
|
||
\*******************************************************************/ | ||
|
||
#include <iostream> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This include shouldn't be here.
|
||
std::stringstream merge_message; | ||
|
||
ait<constant_propagator_domaint> ai; // TODO - remove! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove!
else | ||
{ | ||
return sharing_ptrt<struct_abstract_objectt>( | ||
new struct_abstract_objectt(type(), false, true)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again this is wrong - we should return top unless we are bottom
throw "exceptions not handled"; | ||
|
||
case OTHER: | ||
// throw "other"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@martin-cs I had to comment this out early on as there are other instructions inside the CPROVER_Initailize I think - I suppose this is OK?
locationt to) | ||
{ | ||
#ifdef DEBUG | ||
std::cout << "merging " << from->source_location.get_line() << "->" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should use source location to be consistent with other debug output
return abstract_state.is_top(); | ||
} | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Too many empty lines
bool variable_sensitivity_domaint::ai_simplify( | ||
exprt &condition, const namespacet &ns, const bool lhs) const | ||
{ | ||
if (lhs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To clarify the scope of this should be moved into a separate function ai_simplify_lhs
The following tests are failing and need fixing:
|
Just reading the list of commits, there seems to be a lot of room for merging commits. Probably addressing the regression test failures is the first order of business, but then really a cleanup of those commits is due. |
Fixed or disabled all the linting errors bar one - which was in a multi-line define so can't put the @tautschnig Before I start squashing, would you like this as separate PRs as suggested above? |
All the tests fail on Alpine Linux for an as of yet unknown reason. |
+#include <iostream>
+
+bool abstract_environmentt::merge(const abstract_environmentt &env)
+{
+ // Use the sharing_map's "iterative over all differences" functionality
+ // This should give a significant performance boost
+ // We can strip down to just the things that are in both
+
+ // for each entry in the incoming environment we need to either add it
+ // if it is new, or merge with the existing key if it is not present
+
+ std::stringstream merge_message;
+
+ ait<constant_propagator_domaint> ai; // TODO - remove!
Remove!
Yes!
> +
+ case RETURN:
+ throw "return instructions are depreciated";
+
+ case START_THREAD:
+ case END_THREAD:
+ case ATOMIC_BEGIN:
+ case ATOMIC_END:
+ throw "threading not supported";
+
+ case THROW:
+ case CATCH:
+ throw "exceptions not handled";
+
+ case OTHER:
+// throw "other";
@martin-cs I had to comment this out early on as there are other instructions inside the CPROVER_Initailize I think - I suppose this is OK?
So there are a number of issues here...
1. Can we assume that return, throw/catch, function pointers,
vectors/complex, etc. have been removed? I think this has to be case by
case.
1.A. return : cheap and exact to remove and later replace, makes the
analysis significantly simpler. I think we assume that this has been
done. Thus we should throw an exception if we find one and the
exception / code should document this.
1.B. throw/catch : This one is a bit more tricky as the removal process
would really benefit from being able to run abstract interpretation,
see:
#510
So I think, ideally, we should handle throw/catch. We should probably
create an enhancement issue for this and make sure the code says "this
should be supported one day ... but isn't now".
1.C. function pointers : Same argument as 1.B. the issues are already
raised:
https://github.com/diffblue/cbmc-toyota/issues/31 (workaround)
https://github.com/diffblue/cbmc-toyota/issues/30 (fix)
1.D. vectors/complex : I feel these are more like 1.A. so no action is
required (other than calling remove_vectors in goto-analyze which we
already do).
1.E. Concurrency ... should be handled in a different way. We detect
concurrent programs and use the right ait<> variant. Once we've done:
https://github.com/diffblue/cbmc-toyota/issues/85
we should be able to use this domain in that case but still start and
end thread should not need to be supported.
2. What should we do about unsupported instruction types? Do we ignore
it so that we are robust to changes in other parts of the code base, or
do we throw an exception. I'm inclined to say we ignore OTHER as it is
the 'escape hatch' for this kind of functionality but to throw an
exception for any other case. But I'm open to suggestions for
alternatives.
Ideally we should probably "capture" this "design logic" in the code.
+bool variable_sensitivity_domaint::merge(
+ const variable_sensitivity_domaint &b,
+ locationt from,
+ locationt to)
+{
+ #ifdef DEBUG
+ std::cout << "merging " << from->source_location.get_line() << "->"
This should use source location to be consistent with other debug output
Yes; consistency is output is good.
> + ns - the namespace
+ lhs - is the expression on the left hand side
+
+ Outputs: True if simplified the condition. False otherwise. condition
+ will be updated with the simplified condition if it has worked
+
+ Purpose: Use the information in the domain to simplify the expression
+ with respect to the current location. This may be able to
+ reduce some values to constants.
+
+\*******************************************************************/
+
+bool variable_sensitivity_domaint::ai_simplify(
+ exprt &condition, const namespacet &ns, const bool lhs) const
+{
+ if (lhs)
To clarify the scope of this should be moved into a separate function `ai_simplify_lhs`
Perhaps we should continue this as part of
https://github.com/diffblue/cbmc-toyota/issues/108
|
On Tue, 2017-03-28 at 02:28 -0700, Thomas Kiley wrote:
<snip>
**Overview**
The variable sensitivity domain intends to replace all non-relational domains for goto-analyzer,
I might phrase this slightly differently. There is quite a lot of
infrastructure that one needs to implement a non-relational domain
correctly. For example, what you do about fields, setting things to top
when you write to an unknown pointer, element-wise merging, etc. The
vast majority of that is independent of what per-variable abstraction
you are using. The variable sensitivity domain is an implementation of
this infrastructure.
We have implemented constants and will implement intervals, in addition
to the existing domains and goto-analyze will probably be using these in
preference to the older ones but we didn't start out with the goal of
replacing them.
allowing flexibility in sensitivity at a very high grain of control.
(and thus a fair performance / accuracy trade-off)
There is some documentation in the header of `variable_sensitivity_domain.h`.
and elsewhere.
**Known limitations**
- Despite the flexibility of the architecture, not many options are implemented. Hopefully these will be coming soon (things like interval domain, squashed array etc)
and should be able to be implemented with minor / non-intrusive patches.
- Functions sensitivity is not implemented (though if this is needed it is basically done so I can easily add to this PR once it is).
Context sensitivity is something we will have to handle at the ait level
though.
- Soundness on top pointers, out of bounds writing to arrays
(and more generally, what we do about undefined behaviour).
**Controversial elements**
- Double dispatch in the merge_state method
- Each class requires hand written copy constructor
|
The failure on Alpine Linux doesn't seem to be strictly related to this PR, I have created a PR exhibiting the broken behaviour: #726 This is also broken on Windows, but apart from this problem this appears to be working correctly. |
Today I will attempt to clean up the commit history, let me know @tautschnig if you want this split up, but I am assuming not. |
On Thu, 2017-03-30 at 02:33 -0700, Thomas Kiley wrote:
Today I will attempt to clean up the commit history, let me know @tautschnig if you want this split up, but I am assuming not.
Splitting makes sense for unrelated changes. All of these are related
so a sequence is probably best.
|
Just as a progress update, I am getting there with the squashing, I am down to <100 reasonable commits which almost all compile, still to do is:
Then it should be good for a proper review - I'm still happy to split into functional parts to speed up the review process. |
35886b2
to
de4ffeb
Compare
Hopefully that is a sufficiently squashed commit history - I will now extract the changes that are part of the test PR |
@martin-cs You are looking at the first commit - this way madness lies! This is addressed in: Cleaned up the variable sensitivity framework later down. No reasonable way to squash this into the first commit, too much changed in the meantime whose history would be bad to lose. |
On Mon, 2017-04-03 at 07:28 -0700, Thomas Kiley wrote:
@martin-cs You are looking at the first commit - this way madness lies! This is addressed in: Cleaned up the variable sensitivity framework later down. No reasonable way to squash this into the first commit, too much changed in the meantime whose history would be bad to lose.
It's worse than that; I'm responding to e-mail with minimal contextual
information. Also we should get the ducks out of here.
|
de4ffeb
to
6560460
Compare
I have now done all the rebasing - once #472 is merged I will rebase onto master. The real diff is this: https://github.com/diffblue/cbmc/pull/708/files/93b9278..6560460 (I've updated the comment at the top). This should be good for a proper review at this stage. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple of errata in the tests
@@ -1,9 +1,9 @@ | |||
CORE | |||
main.c | |||
--constants --simplify out.gb | |||
--variable --simplify out.gb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like I missed swapping 16, 17 of these tests.
@@ -0,0 +1,119 @@ | |||
#include <assert.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this file even be checked in, probably not?
6560460
to
65fd967
Compare
Rebased on current version of #472 (and dropped tests that have been merged into master). Updated real diff: https://github.com/diffblue/cbmc/pull/708/files/8ed388d..65fd967 |
@tautschnig What can I do to help get this reviewed? The linting errors are not real but also not disable-able due to being in a preprocessor macro. The failures on AppVeyor and on Alpine Linux due to #726. I can apply a fix to this by replacing asserts with CPROVER_asserts or I can start digging into #726 if that is what this is waiting for. I can attempt to break this up into smaller PRs if it is too large to review (alternatively, I can advise on an order to review the files in). |
A couple of thoughts:
Right now, it's a bit hard to see where to start, as neither the list of commits nor the full list of changes would provide just the view on this particular aspect. If you can post an updated link that has just the changes for variable-sensitivity that would be very helpful! |
@tautschnig - thanks
Yup though at the moment this is out of my/Dans hands
Only bug fixes to this feature which seems nonsensical to split out.
I have to the best of my ability squashed the commits. As to where to start, this is the only sensible smaller diff I can provide which takes the feature up to point before including sensitivity for arrays, structs and pointers. Perhaps I should sit down with someone in person (e.g. @peterschrammel) and talk through it? Ideally Martin's documentation should provide a guide to these classes: |
{ | ||
assert(!is_bottom); | ||
typedef std::function<abstract_object_pointert(const exprt &)> eval_handlert; | ||
std::map<irep_idt, eval_handlert> handlers= |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do accept that the code below is correct, and maybe it's even beautiful. But it's probably quite a bit longer than a sequence of 5 if
/else if
statements, isn't it? The overhead of constructing the map might also outweigh the performance gains of doing log-N lookups instead of going through the cascade of ifs.
bool abstract_environmentt::assign( | ||
const exprt &expr, const abstract_object_pointert value) | ||
{ | ||
exprt s = expr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Throughout this function (and possibly the entire PR): no space around =
, ==
, !=
{ | ||
// Attempting to assign to something unreasonable | ||
// Your goto-program is broken | ||
throw "die_horribly"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't a useful error message. Indeed an assert
may be more appropriate.
} | ||
} | ||
|
||
symbol_exprt symbol_expr=to_symbol_expr(s); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you genuinely need a copy, or would a reference do?
// write_member which will attempt to update the abstract object for the | ||
// relevant member. This modified abstract object is returned and this | ||
// is inserted back into the map | ||
std::map<irep_idt,stacion_functiont> handlers= |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my above comment for those handler maps.
\*******************************************************************/ | ||
|
||
abstract_object_pointert array_abstract_objectt::read_index( | ||
const abstract_environmentt &env, const index_exprt &index) const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the index
parameter is never used here I don't quite know what this function is supposed to do, even in a more precise implementation. The same holds for write_index
.
|
||
virtual exprt to_constant (void) const; | ||
|
||
virtual void output( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this one have an output
function, when others don't?
pointer_abstract_objectt::pointer_abstract_objectt(const typet &t): | ||
abstract_objectt(t) | ||
{ | ||
assert(t.id()==ID_pointer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment regarding arrays above.
break; | ||
|
||
case RETURN: | ||
throw "return instructions are depreciated"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: typo "depreciated" -> "deprecated" (I'm not actually sure they are, though)
@@ -43,6 +43,7 @@ class optionst; | |||
"(reachable-functions)" \ | |||
"(intervals)(show-intervals)" \ | |||
"(non-null)(show-non-null)" \ | |||
"(variable)" \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somewhere there should be an update to a --help
output.
As I seem to have been commenting on some outdated version based on the link to a smaller diff: @thk123 would you mind scrutinising my recent comment even when marked "outdated", please? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moving outdated comments to new review.
* | ||
* 1. Having different abstractions for the variable in different | ||
* parts of the program. | ||
* 2. Allowing different domains to write to ambigious locations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ambiguous
bool top; | ||
|
||
// The one exception is merge_state in descendent classes, which needs this | ||
void make_top(void) { top = true; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove space.
} | ||
} | ||
|
||
symbol_exprt symbol_expr=to_symbol_expr(s); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you genuinely need a copy, or would a reference do?
\*******************************************************************/ | ||
|
||
abstract_object_pointert abstract_environmentt::abstract_object_factory( | ||
const typet type, const namespacet &ns, bool top, bool bottom) const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is type
passed by value?
\*******************************************************************/ | ||
|
||
abstract_object_pointert abstract_environmentt::abstract_object_factory( | ||
const typet type, const exprt &e, const namespacet &ns) const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is type
passed by value.
t(expr.type()), bottom(false), top(true) | ||
{} | ||
|
||
abstract_objectt::~abstract_objectt() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing comment header. (Though possibly this just shouldn't exist?
const abstract_object_pointert op, bool &out_any_modifications) const | ||
{ | ||
assert(t==op->t); | ||
internal_abstract_object_pointert m= |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will be a candidate for std::unique_ptr - @reuk may be able to advise.
#include <analyses/variable-sensitivity/abstract_object.h> | ||
|
||
|
||
class abstract_valuet:public abstract_objectt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Conceptually this is the idea of a primitive to mirror the array, pointer union and struct abstract objects. Me and @martin-cs discussed this, the conclusion to keep it was based on some of the transform resolution code would end up in here.
array_abstract_objectt::array_abstract_objectt(const typet &t): | ||
abstract_objectt(t) | ||
{ | ||
assert(t.id()==ID_array); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To keep the constructor consistent so can use in the abstract object factory: https://github.com/diffblue/cbmc/pull/708/files#diff-376bc05dbdc3369030faf77361786f1fR100
"(unreachable-instructions)(unreachable-functions)" \ | ||
"(reachable-functions)" \ | ||
"(intervals)(show-intervals)" \ | ||
"(non-null)(show-non-null)" | ||
"(non-null)(show-non-null)" \ | ||
"(variable)" \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somewhere there should be an update to a --help output.
Irritatingly, because they are "outdated", I can't reply inline, sorry. The "outdated" ones that I agree with I have moved to a new review on the latest code. I have left the ones that aren't outdated that I agree with and will do. Below are outdated ones I've not been able to reply to inline:
My view was this made it easier to see which cases were handled. However I can swap it to
This one has been fixed, the linter has no errors. I found one other but I think this is otherwise not an issue any more.
Yup - changed in later commit: https://github.com/diffblue/cbmc/pull/708/files#diff-7260d207dbd5603305c9641edbcd53b5R213
Resolved
I'm not sure I understand what action you want me to take on this?
Unless I am mistaken, don't we need the full header in the cpp file?
Fixed
Fixed: https://github.com/diffblue/cbmc/pull/708/files#diff-487d940b8f9e1c792514613569de94d4R190
Yup addressed in this commit: e293073
I am deeply uncomfortable with this code as it is confusing without a long analysis. The reason for the define and virtual is to do double dispatch on the two types being merged. The logic being it finds the meet of the two classes, allowing for different sensitivities to combine. In practise we only support merging of the same type at type at the moment. An alternative that @martin-cs was opposed to was having some kind of separate data structure not tied to the class hierarchy which allowed looking up merges for two arbitrary abstract objects.
These provide the interface for sensitive abstract objects to read from or write to an array at an index. Perhaps looking at the constant_array_sensitivity_object implementations of these methods will make it clearer: https://github.com/diffblue/cbmc/pull/708/files#diff-5a47f0c5c6b007edfa773ce075ad1863R207
The base class implements the least sensitive output (TOP/BOTTOM) if a derived class provides greater sensitivity (for example, this class provides TOP/one value/BOTTOM) so it overrides output so that it can output this individual value. The array/pointer/struct classes only provide an interface, they do not increase the sensitivity of the object, and so do not need to override the base class. |
[This was about havoc.] I'd suggest to remove
I'd suggest a chat with @reuk to devise a good solution. |
Modify the merge to return shared pointers by replacing returns of this with returns of `shared_from_this`. This simplifies the code for dealing with when need to modify the result in dervied classes as we are no longer responsible for freeing the pointer returned by the super class.
The base merge should always result in either top or bottom (it is up for dervied classes to clear the top flag if they are not top). Correctly use the base class merge when appropriate in merging constant values.
Refactored the merge maps method to use std::intersection as there was a bug with empty maps caught by the unit tests.
Use the CPROVER_assert as behaviour is consistent across platforms.
Removed some unnecessary debug information from the tests
In release builds, the top/bottom status of the enviroment was undefined. This ensures the enviroment is set to top (i.e. an entry point).
Previously each abstract object that implemented merge had to correctly redirect merging top with anything or merging with a bottom. This is now handled directly in the root merge by checking the three cases where the base merge must be used. If this is bottom must still be handled by derived classes as we must create an abstract object of the same type.
At some point this commit needs backdating as the ai_domain interface has changed under our feet.
92a84dc
to
8601e81
Compare
Rebased onto @martin-cs's goto-analyzer-5 branch, this is not doxygenated yet (that will come). I'll update this PR next week with new links to real diffs etc |
79070d2
to
8601e81
Compare
Resolved by #5472 |
Beginning to integrate the "Variable Sensitivity Domain". This is a big change that was developed over a few months. I am prepared to split this into separate PRs to make it easier to review, but thought I'd start with everything so a) can see where it is going, b) if splitting is not required, that saves me some work!
This PR depends on the work on goto-analyze #472
This PR has unit tests so is dependent on #874 ✅
This PR includes the tests which Owen was working on, but these can go in as an independent PR #566
✅
Real Diff: https://github.com/diffblue/cbmc/pull/708/files/a8d47ce..1e23402 (excluding changes in the catch commit)
Overview
There is quite a lot of infrastructure that one needs to implement a non-relational domain
correctly. For example, what you do about fields, setting things to top when you write to an unknown pointer, element-wise merging, etc. The vast majority of that is independent of what per-variable abstraction you are using. The variable sensitivity domain is an implementation of this infrastructure.
We have implemented constants and will implement intervals, in addition to the existing domains and goto-analyze will probably be using these in preference to the older ones.
There is some documentation in the header of
variable_sensitivity_domain.h
among other places.Known limitations
Controversial elements
Splitting this up
As mentioned above I could split this into the following PRs (with some work as there is some intermingling of changes):
Tests (#566) should go in first if going in separately.
Core framework (just the base classes giving only the top/bottom abstraction for all types) and has all the tests turned off.
Then all of the following PRs would be independent of each other and would just enable a new abstract object:
Before I do this, it probably still makes sense to review in this manner, i.e. review the following files first before moving on to the abstract objects: