Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

automatic shrinking of hash table capacity is very expensive #17645

Closed
thestinger opened this issue Sep 30, 2014 · 9 comments
Closed

automatic shrinking of hash table capacity is very expensive #17645

thestinger opened this issue Sep 30, 2014 · 9 comments
Labels
I-slow Issue: Problems and improvements with respect to performance of generated code.

Comments

@thestinger
Copy link
Contributor

This interacts poorly with with_capacity since it will just end up shrinking the allocation immediately. It's very widely used since FromIter uses it with the iterator size hint. Even in code that's not pre-allocating capacity, the resize strategy results in many reallocations as the hash table shrinks and then grows again. Allocator improvements are possible but this is always going to be extremely expensive for huge collections on most platforms.

@thestinger thestinger added A-libs I-slow Issue: Problems and improvements with respect to performance of generated code. labels Sep 30, 2014
@Gankra
Copy link
Contributor

Gankra commented Sep 30, 2014

If you ask for a hashmap with_capacity, it will record that number as a minimum and never shrink the map below that.

@thestinger
Copy link
Contributor Author

I think it's trying to be too clever. It will always have problems reminiscent of the segmented stack thrashing issues. It would be better to just have the caller use shrink_to_fit as is done for vectors. There's a lot of value in having simple and predictable performance characteristics. Implementing jemalloc/jemalloc#134 would reduce the problem smaller for huge hash tables but I think it's still too aggressive as a default.

@Gankra
Copy link
Contributor

Gankra commented Sep 30, 2014

cc @pczarn

@jfager
Copy link
Contributor

jfager commented Sep 30, 2014

Automatic shrinking is pretty surprising behavior, and only seems to be hinted at in the rustdoc via the reserve method. +1 to stop doing it.

@mitsuhiko
Copy link
Contributor

+1 on explicit calls for more predictable performance. I was quite surprised to learn that this is what it does currently.

@arthurprs
Copy link
Contributor

+1 same as @mitsuhiko

@pczarn
Copy link
Contributor

pczarn commented Oct 10, 2014

I don't mind explicit calls. This is a simple approach. ResizePolicy would allow opt-in automatic shrinking in the future.

Just keep in mind that the capacity of a hash table affects the performance of accesses, even though it's more predicable. (Is it?)

@pczarn
Copy link
Contributor

pczarn commented Jan 14, 2015

Fixed by #18770.

@Gankra Gankra closed this as completed Jan 14, 2015
@Gankra
Copy link
Contributor

Gankra commented Jan 14, 2015

🎊

lnicola pushed a commit to lnicola/rust that referenced this issue Jul 28, 2024
Fix path resolution for child mods of those expanded by `include!`

Child modules wouldn't use the correct candidate paths due to a branch that doesn't seem to be doing what it's intended to do. Removing the branch fixes the problem and all existing test cases pass.

Having no knowledge of how any of this works, I believe this fixes rust-lang#17645. Using another test that writes the included mod directly into `lib.rs` instead, I found the difference can be traced to the candidate files we use to look up mods. A separate branch for if the file comes from an `include!` macro doesn't take into account the original mod we're contained within:

```rust
None if file_id.macro_file().map_or(false, |it| it.is_include_macro(db.upcast())) => {
    candidate_files.push(format!("{}.rs", name.display(db.upcast())));
    candidate_files.push(format!("{}/mod.rs", name.display(db.upcast())));
}
```

I'm not sure why this branch exists. Tracing the branch back takes us to 3bb9efb but it doesn't say *why* the branch was added. The test case that was added in this commit passes with the branch removed, so I think it's just superfluous at this point.
RalfJung pushed a commit to RalfJung/rust that referenced this issue Aug 1, 2024
Fix path resolution for child mods of those expanded by `include!`

Child modules wouldn't use the correct candidate paths due to a branch that doesn't seem to be doing what it's intended to do. Removing the branch fixes the problem and all existing test cases pass.

Having no knowledge of how any of this works, I believe this fixes rust-lang#17645. Using another test that writes the included mod directly into `lib.rs` instead, I found the difference can be traced to the candidate files we use to look up mods. A separate branch for if the file comes from an `include!` macro doesn't take into account the original mod we're contained within:

```rust
None if file_id.macro_file().map_or(false, |it| it.is_include_macro(db.upcast())) => {
    candidate_files.push(format!("{}.rs", name.display(db.upcast())));
    candidate_files.push(format!("{}/mod.rs", name.display(db.upcast())));
}
```

I'm not sure why this branch exists. Tracing the branch back takes us to 3bb9efb but it doesn't say *why* the branch was added. The test case that was added in this commit passes with the branch removed, so I think it's just superfluous at this point.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
I-slow Issue: Problems and improvements with respect to performance of generated code.
Projects
None yet
Development

No branches or pull requests

6 participants