Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't phi-merge uninitialised objects #2168

Merged

Conversation

JohnDumbell
Copy link
Contributor

This is the work mentioned in the last comment on #2121, making it so that a generation 0 of a symex variable means it's never been initialised in both Java and C. The approach is on an allocate to assign a nondet to it if it isn't zero-init'd to begin with, and then if we see a gen zero then to not consider it during conditionals on phi merges.

Along with these is a small change to test.pl's multi-line match option to enable traditional regex multi-line matching. The m flag causes ^ and $ to actually match correctly on new line/carriage returns, where before the match was actually on the entire file so ^ was just beginning of file $ was the end. This also fixes a few tests' failure condition where this subtle difference wasn't noticed and '^warning: ignoring' is used.

Copy link
Collaborator

@tautschnig tautschnig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm struggling to understand why this proposal is the adequate solution for the problem. From #2121 it seems the problem is one of the points-to analysis not being sufficiently smart. I'll comment more over there. (Edit: That was me being dense.)

With this PR the commit message doesn't have information that would enable a future reader of this commit to understand what's going on. Please make a habit of writing good commit messages.

Edit 2: We also need tests outside Java.

@@ -174,7 +174,7 @@ ($$$$$$$$$)
binmode $fh;
my $whole_file = <$fh>;
$whole_file =~ s/\r\n/\n/g;
my $is_match = $whole_file =~ /$result/;
my $is_match = $whole_file =~ /$result/m;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, but could it please be put in a separate commit? The explanation in the PR is absolutely necessary to understand this, but will never make it to the commit history at this stage.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for factoring it out into a separate commit. Could you please also add the text from the PR to body of the commit message? It's really informative, and should be preserved!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, sorry, fixed. Still getting fully used to git.

@JohnDumbell JohnDumbell force-pushed the bugfix/phi_merge_uninitialized branch from 71c61fe to f73be05 Compare May 9, 2018 10:37
@JohnDumbell JohnDumbell changed the title [SEC-370] Don't phi-merge uninitialised objects Don't phi-merge uninitialised objects May 9, 2018
Copy link
Contributor

@smowton smowton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, some small changes

Additional tests that would be good (in each case for both C and Java): check that locals, static locals (namespaced globals), normal globals and dynamic allocations all behave as expected.

@@ -0,0 +1,30 @@
public class Main {

public int phiMerge(Boolean trigger) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sp s/propogation/propagation (in filename and elsewhere)

Main.class
--function Main.phiMerge --show-vcc
activate-multi-line-match
^.*dynamic_object2#2 == \{ \.@class_identifier="java::Main\$Temp",\s*\.@lock=false \}$
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably check for the other dynamic object too (the instance of Temp and the instance of Ephemeral)

Copy link
Contributor Author

@JohnDumbell JohnDumbell May 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Decided to change the logic to make it more robust, it now just makes sure there's no gen zero objects in guards, doesn't matter where they are.

@@ -429,6 +429,14 @@ void goto_symext::phi_function(
rhs=goto_state_rhs;
else if(goto_state.guard.is_false())
rhs=dest_state_rhs;
else if(goto_state.level2_current_count(l1_identifier) == 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add comments briefly explaining why these checks are here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment block at the top of the if.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@@ -184,6 +184,12 @@ void goto_symext::symex_allocate(
else
throw "failed to zero initialize dynamic object";
}
else
{
exprt nondet = build_symex_nondet(object_type);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Point of interest: you've copied the case for x = NONDET(int) rather than the case from code_declt("x", java_int_type()) for example. Check how their behaviour differs if at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't see any major deviations - the code isn't identical but the intentions seem the same and most of the code in decl is run in assign at some point too. One thing to note is that assign propagates the nondet as a constant, while the decl code flat-out disables any propagation, would that be an issue?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'll only disable it for the time being, but it's interesting. I'm not quite well informed enough to give an authoritative answer -- I suggest running this by @peterschrammel and post his answer here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd argue that this is what #35 is about, and you're doing parts of that here, even though the full value only unfolds once you break down nondet symbols into their individual fields.

@JohnDumbell JohnDumbell force-pushed the bugfix/phi_merge_uninitialized branch 2 times, most recently from 53f8d95 to 1cce880 Compare May 9, 2018 17:40
@JohnDumbell
Copy link
Contributor Author

Added a new set of tests and made some minor revisions to the code and commits in regards to comments (sans talking to peter). I'm not well versed in C so if there's a potential condition I've missed just say.

Copy link
Collaborator

@tautschnig tautschnig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As expressed in a few places by reviewers: more comments and documentation (commit messages) needed, but otherwise looks good to me.

@@ -174,7 +174,7 @@ ($$$$$$$$$)
binmode $fh;
my $whole_file = <$fh>;
$whole_file =~ s/\r\n/\n/g;
my $is_match = $whole_file =~ /$result/;
my $is_match = $whole_file =~ /$result/m;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for factoring it out into a separate commit. Could you please also add the text from the PR to body of the commit message? It's really informative, and should be preserved!

@@ -184,6 +184,12 @@ void goto_symext::symex_allocate(
else
throw "failed to zero initialize dynamic object";
}
else
{
exprt nondet = build_symex_nondet(object_type);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd argue that this is what #35 is about, and you're doing parts of that here, even though the full value only unfolds once you break down nondet symbols into their individual fields.

@JohnDumbell JohnDumbell force-pushed the bugfix/phi_merge_uninitialized branch 4 times, most recently from d03e11f to 2ba33cf Compare May 10, 2018 10:16
return obj.val;
}

private Ephemeral local;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't really a local -- in Java we can't really have a local object at all (but we can have a local reference or pointer). Really the one labelled dynamic is using a local variable to store the pointer, and this one is using a member field to store it.

return local.val;
}

private static Ephemeral staticLocal;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, staticField would be more accurate

^SIGNAL=0$
--
^.*: (dynamic_object|new_tmp[0-9]+(@[0-9]+)?)[0-9]+#0\)
^.*\? (dynamic_object|new_tmp[0-9]+(@[0-9]+)?)[0-9]+#0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In test.desc files a second -- delimits comments. You should add one and then comment on what these regexes are looking for.

Format example:

CORE
Main.class
--other --arguments
Must find this
Also must find this
--
Musn't find this
Nor this
--
This is a comment
This is another comment

assert(*obj == 20);
}

int *staticLocal;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In C we really can have local objects. For all the cases other than dynamic, don't use malloc, instead declare the object on the stack, for example int local;

Static locals are possible in C, and are effectively globals that are only accessible in that function. For example,

int f() {
  static int staticvar = 100;
  return staticvar++;
}

void main() {
  assert(f() == 100); // Passes
  assert(f() == 101); // Passes
}

The final missing case for C is a "real" global. For example:

int global;

int main() {
  global = 1234;
  ...
}

}
else
{
*obj = 20;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe also add a test where one branch is missing and the merge must happen.

@JohnDumbell JohnDumbell force-pushed the bugfix/phi_merge_uninitialized branch 2 times, most recently from bcc5211 to c1a4f70 Compare May 10, 2018 15:02
void dynamicAllocationUninitialized(int trigger)
{
int *obj;
obj = (int*)malloc(sizeof(int));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For symmetry with the Java case this should only be allocated on one branch (i.e. we should allocate within the if / else)

@smowton
Copy link
Contributor

smowton commented May 11, 2018

This is looking much improved, just commented with one more nitpick, let me know when this is ready for re-review

@JohnDumbell JohnDumbell force-pushed the bugfix/phi_merge_uninitialized branch 2 times, most recently from 2df875b to 7ba294d Compare May 11, 2018 14:28
This change means that if a generation zero symex variable is seen it can be assumed to not have been allocated at any point and still have its default values. We use this knowledge to then not add a guard on a phi merge that has a gen zero on its lhs or rhs, instead just simply assigning the other side directly.
This is a change to the multi-line match option to enable traditional regex multi-line matching. The m flag causes ^ and $ to actually match correctly on new line/carriage returns, where before the match was actually on the entire file so ^ was just beginning of file $ was the end. This also fixes a few tests' failure condition where this subtle difference wasn't noticed and '^warning: ignoring' is used.
@JohnDumbell JohnDumbell force-pushed the bugfix/phi_merge_uninitialized branch from 7ba294d to 6faf376 Compare May 11, 2018 15:11
@smowton
Copy link
Contributor

smowton commented May 11, 2018

Looks great, merge once tests are passing I say

@JohnDumbell
Copy link
Contributor Author

Made all requested changes, I'll fix any issues that the CI flags up but right now it looks like some weird timeout issues, so I'll just re-run it until it gives me something useful or passes.

@tautschnig
Copy link
Collaborator

https://www.traviscistatus.com/ says there are problems with macOS, so it might take a little longer... (all other jobs have passed here).

@tautschnig tautschnig merged commit f891f37 into diffblue:develop May 12, 2018
NathanJPhillips pushed a commit to NathanJPhillips/cbmc that referenced this pull request Aug 22, 2018
da61186 Merge pull request diffblue#2182 from diffblue/trace-debug
5acf313 Merge pull request diffblue#2176 from mgudemann/fix/glucose_build_g++
e82dca0 formatting of debug output of traces
f891f37 Merge pull request diffblue#2168 from JohnDumbell/bugfix/phi_merge_uninitialized
6faf376 Enable 'm' flag on regex for multi-line tests
646cf29 Add nondet assignment to non-zero'd allocations in symex
213db5f Remove trailing `;` from namespace closing bracket

git-subtree-dir: cbmc
git-subtree-split: da61186
@JohnDumbell JohnDumbell deleted the bugfix/phi_merge_uninitialized branch October 23, 2018 16:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants