-
Notifications
You must be signed in to change notification settings - Fork 109
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
minimum changes needed in order to use FxHashSet and FxHashMap in prusti-viper
#817
Conversation
@@ -10,7 +10,7 @@ use crate::environment::procedure::BasicBlockIndex; | |||
use rustc_middle::mir; | |||
use rustc_middle::mir::visit::Visitor; | |||
use rustc_data_structures::graph::dominators::Dominators; | |||
use std::collections::{HashMap, HashSet}; | |||
use rustc_hash::{FxHashMap as HashMap, FxHashSet as HashSet}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to use FxHashMap
and FxHashSet
in the code; the diff gets bigger but the code would be more readable IMO (e.g. it's not necessary to jump to the top of the file to check if HashMap
is really HashMap
or FxHashMap
).
Is there any reason to use rustc_hash::{FxHashMap, FxHashSet
rather than rustc_data_structures::fx::{FxHashMap, FxHashSet}
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review!
I'd prefer to use FxHashMap and FxHashSet in the code; the diff gets bigger but the code would be more readable IMO (e.g. it's not necessary to jump to the top of the file to check if HashMap is really HashMap or FxHashMap).
I had also considered this, but initially decided against it for two reasons:
- The bigger diff has the potential to cause major merge conflicts for other very large PRs that are currently in progress. E.g. Cache verification of unchanged methods #799, Refactoring: more VIR layers for nicer core proof encoding. #805 and probably others, too. I was worried that it would not be great "git etiquette" and that it might frustrate others when they get such a major merge conflict 😄
- The
FxHashMap
andFxHashSet
are just type aliases tostd::collections::{HashMap, HashSet}
. The last type parameter is different, though. Fx usesBuildHasherDefault<FxHasher>
andstd
usesRandomState
as the default type parameter. This means the APIs are identical for all but 2 methods:HashMap::with_capacity
andHashMap::new
are only defined for HashMaps that have theRandomState
type parameter (which is the reason I had to replaceHashMap::new()
withHashMap::default()
everywhere). Since the API surface belongs to one and the sameHashMap
/HashSet
, the times a developer needs to double check at the top would be minimal.
However, if it's still better for the maintainability to replace all instances with FxHashMap
/ FxHashSet
everywhere, I'd be happy to make to change. I just wanted to run my motivation by you first to be sure that making the change will have the desired outcome :)
Is there any reason to use rustc_hash::{FxHashMap, FxHashSet rather than rustc_data_structures::fx::{FxHashMap, FxHashSet}?
Good catch, there is no reason, I only just realized that the rustc_data_structures
crate is provided by the compiler (it also depends on the rustc_hash
crate, which is also provided by the compiler) when the rustc_private
feature flag is enabled. I'll update this! Would it be acceptable to add #![feature(rustc_private)]
to the vir
and viper
creates, too, or should we link it through Cargo.toml
instead? It looks like that feature flag won't be stabilized anytime soon, so it might be better to avoid it where it's not needed for other reasons.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not the one working on the other PRs, but the conflicts seem okay to me. Thanks for thinking at it :) Changing HashMap::new
to FxHashMap::default
vs HashMap::default
should roughly be the same. There will be additional type declarations to be fixed, but they shouldn't be too many.
The API is the same, yes, but when looking at a PR it's easier to spot HashMap
and comment "consider using FxHashMap because it's faster and more deterministic" rather than checking the use ...
that might be or not in the diff. I tried to quantify this practice: rust-lang/rust
, rust-lang/rust-analyzer
and rust-lang/clippy
never import FxHashMap/Set as ...
, while other smaller crates do. I would stick with the convention that the compiler developers use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, that works for me. I'll make the changes. Thanks! :)
I would prefer to avoid |
In that case, feel free to ignore my suggestion of using |
I just pushed all the requested changes :) This is ready to merge pending CI and peer review.
I managed to avoid adding extra
I ended up sticking with the
Thanks! By the way, the caching works really well, I've been playing around with it for the last week or so and I'm very happy with the extra productivity gained with this new feature. Thank you very much for adding it! |
The CI test for
I can't reproduce the failure on my local machine (also Ubuntu, running |
The same test failed again due to the same timeout error on Ubuntu, looks like the determinism is working well now, haha. Is there a way to set a different (fixed) Z3 random seed for a particular test? |
That should be possible, at least in theory. However, the right thing to do here would be to nail down why exactly the test fails and fix the problem (even though hunting performance bugs is insanely annoying). Typically such large performance variations are caused by matching loops. @fpoli If I am not mistaken you had used AxiomProfiler recently, would you mind adding some brief instructions on how to use it? Regarding merging this PR, personally, I would be also fine with ignoring the failing test and opening an issue that the test is broken. |
The test suite supports the
I didn't; my logs were too big (my timeout was 30 minutes) and the tool kept crashing on Linux. However, this failing test in Prusti seems much smaller than that. The readme has instructions on how to run the tool and there is a working Docker image that can be used even from Mac. To use the profiler, see here; they link to the paper that describes the tool. The "Help" menu of the tool also contains useful instructions. Alternatively, Z3's |
Thanks for the tips! Here is the plan of approach:
I also have an idea how we could detect (potential) matching loops automatically. Such a feature would greatly improve end-user experience and could save us from hours of fun debugging :) I'll see if I can open an issue in the next week or so that describes it in detail in order to solicit some feedback! |
I created issue #819 to track the timeout issue and pushed the latest changes. When the CI passes this should be good to merge! |
Awesome, thanks! |
This PR increases the use of
FxHashSet
andFxHashMap
in theprusti-viper
module (and a little invir
,vir_gen
andprusti_interface
to make it compile).The purpose of this PR is two-fold:
In several places, the Viper programs were not generated deterministically, specifically at:
These sources of non-determinism are also an issue for PR #799, which relies on computing the (deterministic) hashes of Viper programs in order to cache the verification results. This causes the cache to be under-utilized due to different program hashes for the same program between compilations. By default,
std::collections::{HashMap, HashSet}
uses a randomized seed for every run to avoid DDOS attacks.FxHashMap
andFxHashSet
are also randomly sorted, but these are not randomly seeded every run, thus the collection orders are consistent between runs.