Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Goto programs validation #3191

Merged
merged 13 commits into from
Mar 4, 2019
Merged

Conversation

sonodtt
Copy link
Contributor

@sonodtt sonodtt commented Oct 17, 2018

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

2019-03-01
NB Will squash prior to merge.
Fairly major changes after review by Kroening.

  1. check for entry-point now disabled
  2. checks for presence of sourcelocation and code.sourcelocation now disabled.
  3. goto_modelt::validate now has default values for validation_mode and goto_model_validation_options

2019-02-27
There will be a follow up PR to this to re-enable the checks for an entry-point ("validate_goto_modelt::entry_point_exists()")
This is temporarily disabled as some (<6) regression tests have no entry point, and in some
cases this is intentional (e.g. testing json-ui warnings)


2018-12-21
The first two commits are not new. Changes to reviews have been put into separate commits - but the last commit is the most consequential as it deals with @daniel.poetzl’s request for a change of location and I break the source location tests out into a new unit test file.

• Relevant to this PR (the bigger picture) are also the remarks on
#751 & more recently on #3188
These are not addressed directly here as such, but I am flagging these as context/reminders for reviewers/ going forward.
In particular:
#3188 (comment)
There are several (hopefully nested but maybe not) ideas of what a "valid" goto-program is that are implicitly assumed by different parts of the code. Part of the point of all of this validity checking was to map out, clarify and correct these. Parts of the code assume that there are no RETURN instructions, parts assume that there are and things like the last instruction (apart from END_INSTRUCTION) should be return. I was asking how your validity checks will cope with this. It seems there are two options:

  • The caller specifies which kind of goto-program they want to check for.
  • The check reports some auxiliary flags and the caller then checks these separately.

@peterschrammel added:
There are several kinds of Goto program that (should) form a subset chain.
The transformations in process_Goto_program() go through this chain in several feature reduction passes.
Certain analyses may require a specific kind of Goto program.
So, 'valid' depends on the stage of processing. We should make this explicit.

const irep_idt &identifier =
to_symbol_expr(function_call.function()).get_identifier();

DATA_CHECK(function_map.find(identifier) != function_map.end(), "");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fill the comment (also below).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thanks!

DATA_CHECK(
fun.second.body.instructions.back().is_end_function(),
"last instruction should be of end function type");
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the semicolon necessary?

Copy link
Contributor Author

@sonodtt sonodtt Oct 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

erm..nope! Well spotted.

@sonodtt sonodtt force-pushed the goto_programs-validation branch from f3410eb to 81732d2 Compare October 17, 2018 11:54
@sonodtt sonodtt requested a review from danpoe October 17, 2018 13:09
@sonodtt sonodtt force-pushed the goto_programs-validation branch from 81732d2 to 1de2516 Compare October 17, 2018 13:46
Copy link
Collaborator

@martin-cs martin-cs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right direction but not finished yet.

class test_for_function_addresst : public const_expr_visitort
{
private:
const irep_idt match_id{ID_address_of};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

??? Why not just use ID_address_of directly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

if(!test_for_function_address.identifiers.empty())
{
for(auto &identifier : test_for_function_address.identifiers)
DATA_CHECK(function_map.find(identifier) != function_map.end(),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work for pre-linking goto-programs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martin-cs
That got us thinking... and I rewrote the following check (which now has the comments below)
/// Check that goto-programs that have a body end with an END_FUNCTION
/// instruction.
///
/// NB functions that are only declared, and not used will not appear in the
/// function map. Functions that are declared only and used to e.g.
/// initialise a function pointer will have no body.
void check_last_instruction();

@karkhaz
Copy link
Collaborator

karkhaz commented Nov 7, 2018

@tautschnig suggested that I chime in here: instructionts have a source_location and also a code->source_location(), but these are often inconsistent (sometimes one of them is nil). I'm wondering whether it should be an invariant that those two fields are always identical. I've been using the below code in various places, but it would be nice if there were a consistency check for this sort of thing:

const source_locationt &sloc_of(const goto_programt::const_targett &ins) const
  {
    return ins->code.source_location().is_not_nil()
             ? ins->code.source_location()
             : ins->source_location;
  }

  source_locationt &sloc_of(const goto_programt::const_targett &ins)
  {
    return const_cast<source_locationt &>(
      static_cast<const dispatch_loop_detectort &>(*this).sloc_of(ins));
  }

  const source_locationt &sloc_of(const goto_programt::instructiont &ins) const
  {
    return ins.code.source_location().is_not_nil() ? ins.code.source_location()
                                                   : ins.source_location;
  }

  source_locationt &sloc_of(const goto_programt::instructiont &ins)
  {
    return const_cast<source_locationt &>(
      static_cast<const dispatch_loop_detectort &>(*this).sloc_of(ins));
  }

tautschnig added a commit that referenced this pull request Nov 9, 2018
Skeleton for well-formedness checking of goto models [blocks: #3118, #3147, #3188, #3191, #3187, #3193]
@sonodtt sonodtt force-pushed the goto_programs-validation branch 4 times, most recently from 9ccd64d to 1cb3da1 Compare November 21, 2018 10:04
@sonodtt
Copy link
Contributor Author

sonodtt commented Nov 21, 2018

@karkhaz Many thanks for bringing this up. A check is now in place - not currently active as the code needs patching to ensure both locations are set. I'll raise an issue for that.

@sonodtt sonodtt force-pushed the goto_programs-validation branch 4 times, most recently from f56e3c5 to 91ace58 Compare November 21, 2018 14:56
@sonodtt sonodtt assigned martin-cs and unassigned martin-cs Feb 27, 2019
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 1fb1129).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/102487466

goto_model.validate(
validation_modet::INVARIANT,
goto_model_validation_optionst{
goto_model_validation_optionst::set_optionst::all_true});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given how often that all_true option is used, simply make that the default?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same for INVARIANT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Also for INVARIANT.

vm,
code.source_location().is_not_nil(),
"each instruction code field, must have non nil source location");

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the comma

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that not every instruction has a code member, so this will fail.

Copy link
Contributor Author

@sonodtt sonodtt Mar 1, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action taken (after team chat). In commit message (reproduced here in case it helps):

Removed previously disabled checks :
Not every instruction has a code member - so removed checks that both
instruction sourcelocation and code sourcelocation are set and identical
Remove also remaining check that every instruction have a non-nil
sourcelocation as this would have to be optional (if enabled fails
many regression tests). This also simplifies considerably the overall
validation pass (removes much passing around of the options structure).

/// - lhs of every \ref code_function_callt instruction is nil
/// - all return types are void (of both calls and functions themselves)
void check_returns_removed();

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please note #4266 on the return type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed in validate_goto_model.cpp

    DATA_CHECK(
      vm,
      goto_function.type.return_type().id() == ID_empty,
      "functions must have empty return type");

{
const code_function_callt &function_call =
to_code_function_call(instr.code);
DATA_CHECK(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use instr.get_function_call().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

if(instr.is_function_call())
{
const auto &function_call = to_code_function_call(instr.code);
DATA_CHECK(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

// check functions that are called
if(instr.is_function_call())
{
const auto &function_call = to_code_function_call(instr.code);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

auto &instructions = goto_function.body.instructions;

goto_programt::instructiont instruction;
instruction.make_end_function();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instructiont::make_end_function is deprecated -- use goto_programt::make_end_function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

goto_programt::instructiont instruction;
instruction.make_end_function();
instructions.push_back(instruction);
instructions.back().source_location.id("id_any_valid_id");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The source_locationt ireps don't have an ID.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

// required as goto_function.validate checks (if a function has a body) that
// the last instruction of a function body marks the function's end.
goto_programt::instructiont end_function_instruction;
end_function_instruction.make_end_function();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Sonny Martin added 4 commits March 1, 2019 14:05
Removed previously disabled checks:
Not every instruction has a code member - so removed checks that both 
instruction sourcelocation and code sourcelocation are set and identical
Remove also remaining check that every instruction have a non-nil 
sourcelocation as this would have to be optional (if enabled fails 
many regression tests). This also simplifies considerably the overall 
validation pass (removes much passing around of the options structure).

Removes check on function return type - this will be preserved (diffblue#4266)

goto_model::validate now has default parameters.

Minor fixes.
Removed previously disabled checks:
Not every instruction has a code member - so removed checks that both 
instruction sourcelocation and code sourcelocation are set and identical
Remove also remaining check that every instruction have a non-nil 
sourcelocation as this would have to be optional (if enabled fails 
many regression tests). This also simplifies considerably the overall 
validation pass (removes much passing around of the options structure).

Removes check on function return type - this will be preserved (diffblue#4266)

goto_model::validate now has default parameters.

Minor fixes.
Although additional checks may hold, they have their own unit
tests and it is these tests that should fail when these checks fail.
@sonodtt sonodtt force-pushed the goto_programs-validation branch from 73e94b4 to 6ba2cf4 Compare March 1, 2019 14:50
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 6ba2cf4).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/102836757

Copy link
Collaborator

@martin-cs martin-cs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a start this is reasonable but there are many invariants it doesn't check (whether variables follow the DECL - use - DEAD cycle, scoping of variables, whether exprt's are well formed, etc.). I presume these will be in follow-up patches.

bool function_pointer_calls_removed = true;
bool check_returns_removed = true;
bool check_called_functions = true;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is intended to be comprehensive right now, it should also check for vector and complex. If this is intended to be a general framework then it's not super urgent.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its a general framework right now - everyone is invited to fill in the additional checks :-)

@tautschnig tautschnig merged commit 2bb84ad into diffblue:develop Mar 4, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.