minimum changes needed in order to use FxHashSet and FxHashMap in `prusti-viper` #817

Pointerbender · 2022-01-02T21:10:29Z

This PR increases the use of FxHashSet and FxHashMap in the prusti-viper module (and a little in vir, vir_gen and prusti_interface to make it compile).

The purpose of this PR is two-fold:

It helps the effort for issue Use FxHashSet/FxHashMap #711.
It removes all the non-determinism in generating Viper programs.

In several places, the Viper programs were not generated deterministically, specifically at:

Re-numbering the loan indices obtained from Polonius.
Orderings of the folding/unfolding statements for permissions to fields of ADTs.
Orderings of statements for expiring borrows.

These sources of non-determinism are also an issue for PR #799, which relies on computing the (deterministic) hashes of Viper programs in order to cache the verification results. This causes the cache to be under-utilized due to different program hashes for the same program between compilations. By default, std::collections::{HashMap, HashSet} uses a randomized seed for every run to avoid DDOS attacks. FxHashMap and FxHashSet are also randomly sorted, but these are not randomly seeded every run, thus the collection orders are consistent between runs.

…per`

fpoli · 2022-01-04T09:16:43Z

prusti-interface/src/environment/loops.rs

@@ -10,7 +10,7 @@ use crate::environment::procedure::BasicBlockIndex;
 use rustc_middle::mir;
 use rustc_middle::mir::visit::Visitor;
 use rustc_data_structures::graph::dominators::Dominators;
-use std::collections::{HashMap, HashSet};
+use rustc_hash::{FxHashMap as HashMap, FxHashSet as HashSet};


I'd prefer to use FxHashMap and FxHashSet in the code; the diff gets bigger but the code would be more readable IMO (e.g. it's not necessary to jump to the top of the file to check if HashMap is really HashMap or FxHashMap).

Is there any reason to use rustc_hash::{FxHashMap, FxHashSet rather than rustc_data_structures::fx::{FxHashMap, FxHashSet}?

Thanks for the review!

I'd prefer to use FxHashMap and FxHashSet in the code; the diff gets bigger but the code would be more readable IMO (e.g. it's not necessary to jump to the top of the file to check if HashMap is really HashMap or FxHashMap).

I had also considered this, but initially decided against it for two reasons:

The bigger diff has the potential to cause major merge conflicts for other very large PRs that are currently in progress. E.g. Cache verification of unchanged methods #799, Refactoring: more VIR layers for nicer core proof encoding. #805 and probably others, too. I was worried that it would not be great "git etiquette" and that it might frustrate others when they get such a major merge conflict 😄

The FxHashMap and FxHashSet are just type aliases to std::collections::{HashMap, HashSet}. The last type parameter is different, though. Fx uses BuildHasherDefault<FxHasher> and std uses RandomState as the default type parameter. This means the APIs are identical for all but 2 methods: HashMap::with_capacity and HashMap::new are only defined for HashMaps that have the RandomState type parameter (which is the reason I had to replace HashMap::new() with HashMap::default() everywhere). Since the API surface belongs to one and the same HashMap/HashSet, the times a developer needs to double check at the top would be minimal.

However, if it's still better for the maintainability to replace all instances with FxHashMap / FxHashSet everywhere, I'd be happy to make to change. I just wanted to run my motivation by you first to be sure that making the change will have the desired outcome :)

Is there any reason to use rustc_hash::{FxHashMap, FxHashSet rather than rustc_data_structures::fx::{FxHashMap, FxHashSet}?

Good catch, there is no reason, I only just realized that the rustc_data_structures crate is provided by the compiler (it also depends on the rustc_hash crate, which is also provided by the compiler) when the rustc_private feature flag is enabled. I'll update this! Would it be acceptable to add #![feature(rustc_private)] to the vir and viper creates, too, or should we link it through Cargo.toml instead? It looks like that feature flag won't be stabilized anytime soon, so it might be better to avoid it where it's not needed for other reasons.

Thanks!

I'm not the one working on the other PRs, but the conflicts seem okay to me. Thanks for thinking at it :) Changing HashMap::new to FxHashMap::default vs HashMap::default should roughly be the same. There will be additional type declarations to be fixed, but they shouldn't be too many.

The API is the same, yes, but when looking at a PR it's easier to spot HashMap and comment "consider using FxHashMap because it's faster and more deterministic" rather than checking the use ... that might be or not in the diff. I tried to quantify this practice: rust-lang/rust, rust-lang/rust-analyzer and rust-lang/clippy never import FxHashMap/Set as ..., while other smaller crates do. I would stick with the convention that the compiler developers use.

Great, that works for me. I'll make the changes. Thanks! :)

vakaras · 2022-01-04T10:52:19Z

Would it be acceptable to add #![feature(rustc_private)] to the vir and viper creates, too, or should we link it through Cargo.toml instead?

I would prefer to avoid #![feature(rustc_private)] if possible.

fpoli · 2022-01-04T12:17:58Z

Would it be acceptable to add #![feature(rustc_private)] to the vir and viper creates, too, or should we link it through Cargo.toml instead?

I would prefer to avoid #![feature(rustc_private)] if possible.

In that case, feel free to ignore my suggestion of using rustc_data_structures. If the FxHash* types provided by the two crates are not known to be equal, converting between them sounds painful.

JonasAlaif · 2022-01-06T08:32:28Z

This is awesome, thanks! I agree with @fpoli that this kind of merge conflict is fine, and with @vakaras comment. I'll merge the current state of this PR into #799 for now, and then once more when this is merged into master.

Pointerbender · 2022-01-06T17:41:29Z

I just pushed all the requested changes :) This is ready to merge pending CI and peer review.

I would prefer to avoid #![feature(rustc_private)] if possible.

I managed to avoid adding extra #![feature(rustc_private)].

In that case, feel free to ignore my suggestion of using rustc_data_structures. If the FxHash* types provided by the two crates are not known to be equal, converting between them sounds painful.

I ended up sticking with the rustc_hash crate in Cargo.toml. Indeed the compiler complains about the types not being equal. The only place a conversion between rustc_hash::FxHashMap and rustc_data_structures::fx::FxHashMap happens is in prusti-interface/src/environment/polonius_info.rs. This was already there, but I called out the full type for rustc_data_structures::fx::FxHashMap where it's used to be able to easily distinguish its use.

This is awesome, thanks! I agree with @fpoli that this kind of merge conflict is fine, and with @vakaras comment. I'll merge the current state of this PR into #799 for now, and then once more when this is merged into master.

Thanks! By the way, the caching works really well, I've been playing around with it for the last week or so and I'm very happy with the extra productivity gained with this new feature. Thank you very much for adding it!

Pointerbender · 2022-01-06T18:53:15Z

The CI test for prusti-tests/tests/verify/pass/arrays/selection_sort.rs failed with 2 errors:

[Prusti: verification error] loop invariant might not hold after a loop iteration that preserves the loop condition.
Prusti failed to finish within 180 seconds. It finished in 418.550898266s.

I can't reproduce the failure on my local machine (also Ubuntu, running ./x.py test pass/arrays/selection_sort.rs finishes successfully in 2m46s). I think this may have been a spurious timeout issue due to Z3 starting off in the wrong direction of the search space. If so, then simply re-running the CI will hopefully make it pass the next time around. I don't have a restart button on my end, could anyone please restart the CI for this PR? :) Thanks!

Pointerbender · 2022-01-06T20:09:35Z

The same test failed again due to the same timeout error on Ubuntu, looks like the determinism is working well now, haha. Is there a way to set a different (fixed) Z3 random seed for a particular test?

vakaras · 2022-01-06T21:19:40Z

The same test failed again due to the same timeout error on Ubuntu, looks like the determinism is working well now, haha. Is there a way to set a different (fixed) Z3 random seed for a particular test?

That should be possible, at least in theory. However, the right thing to do here would be to nail down why exactly the test fails and fix the problem (even though hunting performance bugs is insanely annoying). Typically such large performance variations are caused by matching loops. @fpoli If I am not mistaken you had used AxiomProfiler recently, would you mind adding some brief instructions on how to use it?

Regarding merging this PR, personally, I would be also fine with ignoring the failing test and opening an issue that the test is broken.

fpoli · 2022-01-07T09:03:34Z

The same test failed again due to the same timeout error on Ubuntu, looks like the determinism is working well now, haha. Is there a way to set a different (fixed) Z3 random seed for a particular test?

That should be possible, at least in theory. However, the right thing to do here would be to nail down why exactly the test fails and fix the problem (even though hunting performance bugs is insanely annoying).

The test suite supports the // compile-flags: annotation, using which it should be possible to pass extra arguments to Silicon via -Pextra_verifier_args=... (if setting vector of strings is supported by the -P machinery; I'm not sure). One of Silicon's arguments allows to pass extra arguments to Z3, which can be used to set its various seeds. However, it would be very fragile and it would break whenever we upgrade Silicon or Z3. I agree with Vytautas that we should either fix the performance issue or disable the test.

Typically such large performance variations are caused by matching loops. @fpoli If I am not mistaken you had used AxiomProfiler recently, would you mind adding some brief instructions on how to use it?

I didn't; my logs were too big (my timeout was 30 minutes) and the tool kept crashing on Linux. However, this failing test in Prusti seems much smaller than that. The readme has instructions on how to run the tool and there is a working Docker image that can be used even from Mac. To use the profiler, see here; they link to the paper that describes the tool. The "Help" menu of the tool also contains useful instructions.

Alternatively, Z3's smt.qi.profile=true flag is very promising. Its output is explained here and with a bit of processing I used it to see how many times each quantifier is instantiated. You need to obtain first from Silicon the smt2 file with the commands sent to Z3, making sure that you can reproduce the performance issue just by running Z3 on it.

Pointerbender · 2022-01-07T09:14:21Z

Thanks for the tips! Here is the plan of approach:

Create a separate issue for tracking down the performance regression for the selection_sort.rs test case (I'll reference the exact commit where it's failing from that issue in case we want to reproduce it in the future).
Merge master into my branch to resolve the recent merge conflicts.
Ignore the selection_sort.rs test case for now so that we can investigate it separately.

I also have an idea how we could detect (potential) matching loops automatically. Such a feature would greatly improve end-user experience and could save us from hours of fun debugging :) I'll see if I can open an issue in the next week or so that describes it in detail in order to solicit some feedback!

… case, see issue viperproject#819

Pointerbender · 2022-01-07T10:33:21Z

I created issue #819 to track the timeout issue and pushed the latest changes. When the CI passes this should be good to merge!

fpoli · 2022-01-07T11:50:22Z

Awesome, thanks!

Pointerbender added 2 commits January 2, 2022 21:30

minimum changes in order to use FxHashSet and FxHashMap in `prusti-vi…

904a6a1

…per`

rustfmt and clippy fixes

f715e86

Pointerbender mentioned this pull request Jan 2, 2022

Cache verification of unchanged methods #799

Merged

fpoli reviewed Jan 4, 2022

View reviewed changes

Pointerbender added 6 commits January 6, 2022 17:23

use FxHashSet and FxHashMap in prusti-interface instead of an alias

2cbaaa5

use FxHashMap and FxHashSet in prusti-viper instead of an alias

4d28873

undo FxHash aliases for viper and vir crates

7ef2b8e

use original naming for Fx instead of alias in vir/defs/

b8433c5

removed last alias from polonius_info

97e17a7

rustfmt and clippy fixes

7e259b8

Pointerbender mentioned this pull request Jan 7, 2022

verify/pass/arrays/{merge,selection_sort}.rs timeout issue #819

Open

Pointerbender added 2 commits January 7, 2022 11:14

Merge branch 'master' into issue-711

940d307

ignore prusti-tests/tests/verify/pass/arrays/selection_sort.rs test…

6126957

… case, see issue viperproject#819

fpoli merged commit 46ed335 into viperproject:master Jan 7, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

minimum changes needed in order to use FxHashSet and FxHashMap in `prusti-viper` #817

minimum changes needed in order to use FxHashSet and FxHashMap in `prusti-viper` #817

Pointerbender commented Jan 2, 2022

fpoli Jan 4, 2022 •

edited

Loading

Pointerbender Jan 4, 2022

fpoli Jan 4, 2022

Pointerbender Jan 4, 2022

vakaras commented Jan 4, 2022

fpoli commented Jan 4, 2022 •

edited

Loading

JonasAlaif commented Jan 6, 2022

Pointerbender commented Jan 6, 2022

Pointerbender commented Jan 6, 2022

Pointerbender commented Jan 6, 2022

vakaras commented Jan 6, 2022

fpoli commented Jan 7, 2022

Pointerbender commented Jan 7, 2022

Pointerbender commented Jan 7, 2022

fpoli commented Jan 7, 2022

minimum changes needed in order to use FxHashSet and FxHashMap in prusti-viper #817

minimum changes needed in order to use FxHashSet and FxHashMap in prusti-viper #817

Conversation

Pointerbender commented Jan 2, 2022

fpoli Jan 4, 2022 • edited Loading

Choose a reason for hiding this comment

Pointerbender Jan 4, 2022

Choose a reason for hiding this comment

fpoli Jan 4, 2022

Choose a reason for hiding this comment

Pointerbender Jan 4, 2022

Choose a reason for hiding this comment

vakaras commented Jan 4, 2022

fpoli commented Jan 4, 2022 • edited Loading

JonasAlaif commented Jan 6, 2022

Pointerbender commented Jan 6, 2022

Pointerbender commented Jan 6, 2022

Pointerbender commented Jan 6, 2022

vakaras commented Jan 6, 2022

fpoli commented Jan 7, 2022

Pointerbender commented Jan 7, 2022

Pointerbender commented Jan 7, 2022

fpoli commented Jan 7, 2022

minimum changes needed in order to use FxHashSet and FxHashMap in `prusti-viper` #817

minimum changes needed in order to use FxHashSet and FxHashMap in `prusti-viper` #817

fpoli Jan 4, 2022 •

edited

Loading

fpoli commented Jan 4, 2022 •

edited

Loading