Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Warn user when unsound flags are enabled. #6479

Merged
merged 6 commits into from
Mar 1, 2023

Conversation

NlightNFotis
Copy link
Contributor

@NlightNFotis NlightNFotis commented Nov 25, 2021

Hello,

This is a draft PR that enables warning a user when --reachability-slice and --full-slice
are used in cbmc or goto-instrument.

It also adds a small doc for explaining the rationale for this, and how the warnings look.

For more discussion on this, please have a look at #6480

@codecov
Copy link

codecov bot commented Nov 25, 2021

Codecov Report

Patch coverage: 100.00% and project coverage change: -0.01 ⚠️

Comparison is base (796fb38) 78.50% compared to head (2f723d5) 78.50%.

❗ Current head 2f723d5 differs from pull request most recent head fde9ac5. Consider uploading reports for the commit fde9ac5 to get more accurate results

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #6479      +/-   ##
===========================================
- Coverage    78.50%   78.50%   -0.01%     
===========================================
  Files         1670     1670              
  Lines       191679   191677       -2     
===========================================
- Hits        150479   150468      -11     
- Misses       41200    41209       +9     
Impacted Files Coverage Δ
src/cbmc/cbmc_parse_options.cpp 80.29% <100.00%> (+0.14%) ⬆️
.../goto-instrument/goto_instrument_parse_options.cpp 71.73% <100.00%> (+0.09%) ⬆️
src/util/type.h 98.68% <0.00%> (-1.32%) ⬇️
src/ansi-c/c_typecheck_expr.cpp 76.14% <0.00%> (-0.58%) ⬇️
src/util/pointer_expr.h 96.72% <0.00%> (-0.03%) ⬇️
src/util/c_types.h 100.00% <0.00%> (ø)
src/util/std_types.h 95.82% <0.00%> (ø)
src/cpp/cpp_template_type.h 52.94% <0.00%> (+13.81%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@feliperodri feliperodri added the aws Bugs or features of importance to AWS CBMC users label Nov 29, 2021
@NlightNFotis NlightNFotis marked this pull request as ready for review February 9, 2022 14:07
Copy link
Collaborator

@tautschnig tautschnig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for kicking off work to annotate options and document what's going on. For "reachability-slice," however, I'd rather see us merge the fix.

@@ -952,6 +952,8 @@ bool cbmc_parse_optionst::process_goto_program(

if(options.get_bool_option("reachability-slice"))
{
log.warning() << "WARNING: Unsound option --reachability-slice"
<< messaget::eom;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This option isn't fundamentally unsound, it's just that #6505 needs to be merged to fix a bug.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above PR has been merged, so I don't think this applies any longer.

Copy link
Contributor

@TGWDB TGWDB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per the discussion we had in #6480 I thought the conclusion was to be explicit in how an option can be "unsound" as either Spurious counter example to an assertion or Wrong proof to an assertion. Can we update the messages and documentation here to say what can go wrong.

@NlightNFotis NlightNFotis force-pushed the warn_unsound_flags branch 2 times, most recently from 93aa9cd to 3c16b8d Compare March 7, 2022 10:10
@NlightNFotis
Copy link
Contributor Author

Hi @tautschnig, some updates with regard to this:

  1. I have removed the --reachability-slice mark as requested.
  2. I have explored the behaviour here a bit, and can report the following:

If we enable any of the slice flags on a global level for a regressions test-suite run, for example --full-slice for all tests under regression/cbmc/ we do get a number of failures that appear to be unexpected (to me at least).

The failures appear to be flipping of verification results (from success to failure), and tend to refer to values not present in the trace. I have noticed that what happens in these cases is that we are asserting against values that appear to be assigned the result of a function call, but the slice options appear to eliminate the function body.

This is what happens in the case of a --full-slice, and which I've done some digging into. Test failures for similar reasons appear to be the result of other flags activated on a global basis, (for instance, --reachability-slice or --slice-formula) but I'm not sure if it's the exact same behaviour driving this.

I have done a number of quick fix-like experiments, as well as a few other structural changes to the full-slice but couldn't get anything to work properly (by this I mean passing all the tests), which signals a lack of understanding of the slicing internals from my end. But this appears to be genuine wrong behaviour (I expect any slicer to effectively preserve at least the direct path to an assignment).

@NlightNFotis
Copy link
Contributor Author

I have left a more detailed comment outlining the current issue in #260 .

@NlightNFotis
Copy link
Contributor Author

Hi @tautschnig, I have changed this to now also mark --reachability-slice as unsound, based on comments made during one of our regular catchup meetings.

Can you please confirm this is the case?

Could you also have another look at this at your earliest convenience, given that this is blocked by you?

@NlightNFotis
Copy link
Contributor Author

Hi @tautschnig, ping here - is this possible to review this so that we can get it merged?

@tautschnig
Copy link
Collaborator

Hi @tautschnig, I have changed this to now also mark --reachability-slice as unsound, based on comments made during one of our regular catchup meetings.

Can you please confirm this is the case?

Could you also have another look at this at your earliest convenience, given that this is blocked by you?

With apologies for the late response: is it still the case that --reachability-slice is knowingly unsound/producing wrong results? I am aware that --full-slice has issues and marking that one "unsound" appears fine.

@feliperodri feliperodri added the soundness Soundness bug? Review and add "aws" if it is, or remove "soundness" if it isn't. label Feb 13, 2023
Copy link
Collaborator

@tautschnig tautschnig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can merge this as is, but I am wondering whether we might want to speak of "Unexpectedly" unsound options? Examples of transformations that incur unsoundness are --partial-loops (may yield spurious counterexamples) or --aggressive-slicer (may yield both spurious counterexamples as well as wrong proof). Both of these, AFAIK, work exactly as their documentation suggests, but do yield those unsound results.

Copy link
Collaborator

@jimgrundy jimgrundy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hang on... there are options beyond the ones documented here that are known to cause unsoundness? Surely those must be documented here. It's almost worse than the present situation to go ahead as is, because if I now believe the tool will warn me when I ask for something unsound and I don't get a warning then I'll be even more convinced of the validity of the results I get.

@tautschnig
Copy link
Collaborator

Hang on... there are options beyond the ones documented here that are known to cause unsoundness? Surely those must be documented here. It's almost worse than the present situation to go ahead as is, because if I now believe the tool will warn me when I ask for something unsound and I don't get a warning then I'll be even more convinced of the validity of the results I get.

@jimgrundy I have now added a further commit as discussed out of band.

Copy link
Collaborator

@jimgrundy jimgrundy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like these changes even more, but, since I'm asking

  • It feels now like there should be a heading and short section on options that are unsound by design (which probably isn't the most catchy name for them). If we could give an example or two of these + state that these options will give a warning when used.
    Do we know how many such options there are? If we don't know or there are lots then we could defer adding the warning until we tackle soundness more completely, but if there if there are just a couple could we sneak the warnings in now?

Use heading levels that match the nesting level within the overall
index.
We only ended up with reasonable output as the next line was a newline.
@tautschnig
Copy link
Collaborator

I like these changes even more, but, since I'm asking

* It feels now like there should be a heading and _short_ section on options that are unsound by design (which probably isn't the most catchy name for them). If we could give an example or two of these + state that these options will give a warning when used.

Done.

  Do we know how many such options there are? If we don't know or there are lots then we could defer adding the warning until we tackle soundness more completely, but if there if there are just a couple could we sneak the warnings in now?

I believe I have covered all of them for CBMC. JBMC has a more elaborate environment generation facility, which might have its own share of risks. Perhaps @peterschrammel can comment on what I've covered so far?

tautschnig and others added 3 commits February 28, 2023 13:50
Neither CBMC nor JBMC actually interpret this option, despite it being
advertised in help output (and man pages). Conversely, goto-instrument
has support for `--reachability-slice-fb`, but it wasn't available on
the command line.
Also warn about options that aren't unsound by design, but are
experimental in nature.
Some use of CBMC or JBMC can yield verification results that must not be
taken at face value.

Co-authored-by: Michael Tautschnig <[email protected]>
We concluded that *unsound* in this context is a proxy for the following two behaviors
that we want to avoid:

* **A spurious counter example to an assertion**, which means that the tool may report
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

counterexample

@peterschrammel peterschrammel removed their assignment Mar 1, 2023
@NlightNFotis NlightNFotis merged commit 557f4bf into diffblue:develop Mar 1, 2023
@NlightNFotis NlightNFotis deleted the warn_unsound_flags branch March 1, 2023 10:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
aws Bugs or features of importance to AWS CBMC users soundness Soundness bug? Review and add "aws" if it is, or remove "soundness" if it isn't.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants