-
Notifications
You must be signed in to change notification settings - Fork 273
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't phi-merge uninitialised objects #2168
Don't phi-merge uninitialised objects #2168
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm struggling to understand why this proposal is the adequate solution for the problem. From #2121 it seems the problem is one of the points-to analysis not being sufficiently smart. I'll comment more over there. (Edit: That was me being dense.)
With this PR the commit message doesn't have information that would enable a future reader of this commit to understand what's going on. Please make a habit of writing good commit messages.
Edit 2: We also need tests outside Java.
@@ -174,7 +174,7 @@ ($$$$$$$$$) | |||
binmode $fh; | |||
my $whole_file = <$fh>; | |||
$whole_file =~ s/\r\n/\n/g; | |||
my $is_match = $whole_file =~ /$result/; | |||
my $is_match = $whole_file =~ /$result/m; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great, but could it please be put in a separate commit? The explanation in the PR is absolutely necessary to understand this, but will never make it to the commit history at this stage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for factoring it out into a separate commit. Could you please also add the text from the PR to body of the commit message? It's really informative, and should be preserved!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, sorry, fixed. Still getting fully used to git.
71c61fe
to
f73be05
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, some small changes
Additional tests that would be good (in each case for both C and Java): check that locals, static locals (namespaced globals), normal globals and dynamic allocations all behave as expected.
@@ -0,0 +1,30 @@ | |||
public class Main { | |||
|
|||
public int phiMerge(Boolean trigger) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sp s/propogation/propagation (in filename and elsewhere)
Main.class | ||
--function Main.phiMerge --show-vcc | ||
activate-multi-line-match | ||
^.*dynamic_object2#2 == \{ \.@class_identifier="java::Main\$Temp",\s*\.@lock=false \}$ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should probably check for the other dynamic object too (the instance of Temp
and the instance of Ephemeral
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Decided to change the logic to make it more robust, it now just makes sure there's no gen zero objects in guards, doesn't matter where they are.
@@ -429,6 +429,14 @@ void goto_symext::phi_function( | |||
rhs=goto_state_rhs; | |||
else if(goto_state.guard.is_false()) | |||
rhs=dest_state_rhs; | |||
else if(goto_state.level2_current_count(l1_identifier) == 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add comments briefly explaining why these checks are here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a comment block at the top of the if.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
@@ -184,6 +184,12 @@ void goto_symext::symex_allocate( | |||
else | |||
throw "failed to zero initialize dynamic object"; | |||
} | |||
else | |||
{ | |||
exprt nondet = build_symex_nondet(object_type); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Point of interest: you've copied the case for x = NONDET(int)
rather than the case from code_declt("x", java_int_type())
for example. Check how their behaviour differs if at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't see any major deviations - the code isn't identical but the intentions seem the same and most of the code in decl is run in assign at some point too. One thing to note is that assign propagates the nondet as a constant, while the decl code flat-out disables any propagation, would that be an issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'll only disable it for the time being, but it's interesting. I'm not quite well informed enough to give an authoritative answer -- I suggest running this by @peterschrammel and post his answer here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd argue that this is what #35 is about, and you're doing parts of that here, even though the full value only unfolds once you break down nondet symbols into their individual fields.
53f8d95
to
1cce880
Compare
Added a new set of tests and made some minor revisions to the code and commits in regards to comments (sans talking to peter). I'm not well versed in C so if there's a potential condition I've missed just say. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As expressed in a few places by reviewers: more comments and documentation (commit messages) needed, but otherwise looks good to me.
@@ -174,7 +174,7 @@ ($$$$$$$$$) | |||
binmode $fh; | |||
my $whole_file = <$fh>; | |||
$whole_file =~ s/\r\n/\n/g; | |||
my $is_match = $whole_file =~ /$result/; | |||
my $is_match = $whole_file =~ /$result/m; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for factoring it out into a separate commit. Could you please also add the text from the PR to body of the commit message? It's really informative, and should be preserved!
@@ -184,6 +184,12 @@ void goto_symext::symex_allocate( | |||
else | |||
throw "failed to zero initialize dynamic object"; | |||
} | |||
else | |||
{ | |||
exprt nondet = build_symex_nondet(object_type); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd argue that this is what #35 is about, and you're doing parts of that here, even though the full value only unfolds once you break down nondet symbols into their individual fields.
d03e11f
to
2ba33cf
Compare
return obj.val; | ||
} | ||
|
||
private Ephemeral local; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't really a local -- in Java we can't really have a local object at all (but we can have a local reference or pointer). Really the one labelled dynamic
is using a local variable to store the pointer, and this one is using a member field to store it.
return local.val; | ||
} | ||
|
||
private static Ephemeral staticLocal; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, staticField
would be more accurate
^SIGNAL=0$ | ||
-- | ||
^.*: (dynamic_object|new_tmp[0-9]+(@[0-9]+)?)[0-9]+#0\) | ||
^.*\? (dynamic_object|new_tmp[0-9]+(@[0-9]+)?)[0-9]+#0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In test.desc files a second --
delimits comments. You should add one and then comment on what these regexes are looking for.
Format example:
CORE
Main.class
--other --arguments
Must find this
Also must find this
--
Musn't find this
Nor this
--
This is a comment
This is another comment
assert(*obj == 20); | ||
} | ||
|
||
int *staticLocal; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In C we really can have local objects. For all the cases other than dynamic
, don't use malloc, instead declare the object on the stack, for example int local;
Static locals are possible in C, and are effectively globals that are only accessible in that function. For example,
int f() {
static int staticvar = 100;
return staticvar++;
}
void main() {
assert(f() == 100); // Passes
assert(f() == 101); // Passes
}
The final missing case for C is a "real" global. For example:
int global;
int main() {
global = 1234;
...
}
} | ||
else | ||
{ | ||
*obj = 20; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe also add a test where one branch is missing and the merge must happen.
bcc5211
to
c1a4f70
Compare
void dynamicAllocationUninitialized(int trigger) | ||
{ | ||
int *obj; | ||
obj = (int*)malloc(sizeof(int)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For symmetry with the Java case this should only be allocated on one branch (i.e. we should allocate within the if
/ else
)
This is looking much improved, just commented with one more nitpick, let me know when this is ready for re-review |
2df875b
to
7ba294d
Compare
This change means that if a generation zero symex variable is seen it can be assumed to not have been allocated at any point and still have its default values. We use this knowledge to then not add a guard on a phi merge that has a gen zero on its lhs or rhs, instead just simply assigning the other side directly.
This is a change to the multi-line match option to enable traditional regex multi-line matching. The m flag causes ^ and $ to actually match correctly on new line/carriage returns, where before the match was actually on the entire file so ^ was just beginning of file $ was the end. This also fixes a few tests' failure condition where this subtle difference wasn't noticed and '^warning: ignoring' is used.
7ba294d
to
6faf376
Compare
Looks great, merge once tests are passing I say |
Made all requested changes, I'll fix any issues that the CI flags up but right now it looks like some weird timeout issues, so I'll just re-run it until it gives me something useful or passes. |
https://www.traviscistatus.com/ says there are problems with macOS, so it might take a little longer... (all other jobs have passed here). |
da61186 Merge pull request diffblue#2182 from diffblue/trace-debug 5acf313 Merge pull request diffblue#2176 from mgudemann/fix/glucose_build_g++ e82dca0 formatting of debug output of traces f891f37 Merge pull request diffblue#2168 from JohnDumbell/bugfix/phi_merge_uninitialized 6faf376 Enable 'm' flag on regex for multi-line tests 646cf29 Add nondet assignment to non-zero'd allocations in symex 213db5f Remove trailing `;` from namespace closing bracket git-subtree-dir: cbmc git-subtree-split: da61186
This is the work mentioned in the last comment on #2121, making it so that a generation 0 of a symex variable means it's never been initialised in both Java and C. The approach is on an allocate to assign a nondet to it if it isn't zero-init'd to begin with, and then if we see a gen zero then to not consider it during conditionals on phi merges.
Along with these is a small change to test.pl's multi-line match option to enable traditional regex multi-line matching. The m flag causes ^ and $ to actually match correctly on new line/carriage returns, where before the match was actually on the entire file so ^ was just beginning of file $ was the end. This also fixes a few tests' failure condition where this subtle difference wasn't noticed and '^warning: ignoring' is used.