-
Notifications
You must be signed in to change notification settings - Fork 13.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
complete btree rewrite #17334
complete btree rewrite #17334
Conversation
I also tossed in quick-n-dirty implementation of my Entry API RFC since that landed while I was working on this. I'm considering some minor revisions to that API with respect to function signatures, so that might get changed. Broad strokes should stay the same, though. |
Last commit builds/tests perfectly! Bench results:
Better than TreeMap in almost every category, with the inefficient representation! \o/ |
/// Takes a Vec, and splits the last `amount` elements into a new one | ||
fn split<T>(left: &mut Vec<T>, amount: uint) -> Vec<T> { | ||
// FIXME(Gankro): gross gross gross gross | ||
// Vec should probably have a method for this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you looked at the code produced for this method? Because of the manual copy and double reverse, I wouldn't be surprised if this slowed down inserts measurably.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it's a FIXME for a reason. I'm not sure there's anything I can really do, short of adding a method to Vec, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gereeter I found a working solution using unsafe code. Yay!
Deprecates the `find_or_*` family of "internal mutation" methods on `HashMap` in favour of the "external mutation" Entry API as part of RFC 60. Part of rust-lang#17320, although this still needs to be done on `TreeMap` and `BTree`. Work on `TreeMap` is TBD, and `BTree`'s work is part of the complete rewrite in rust-lang#17334. The implemented API deviates from the API described in the RFC in two key places: * `VacantEntry.set` yields a mutable reference to the inserted element to avoid code duplication where complex logic needs to be done *regardless* of whether the entry was vacant or not. * `OccupiedEntry.into_mut` was added so that it is possible to return a reference into the map beyond the lifetime of the Entry itself, providing functional parity to `VacantEntry.set`. [breaking-change]
pub fn entry<'a>(&'a mut self, key: K) -> Entry<'a, K, V> { | ||
// same basic logic of `swap` and `pop` | ||
let mut visit_stack = Vec::with_capacity(self.depth); | ||
let mut found = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any reason for using an explicit found
flag instead of just returning where you currently break
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, a node (and therefore self) is mutably borrowed inside the loop. You have to break out to then borrow self into the returned structs. Although maybe there's a better way I don't see?
// (ptr, cap, size). This could also have cache benefits for very small nodes, as keys | ||
// could bleed into edges and vals. | ||
// | ||
// However doing this would require *a lot* of code to reimplement all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Nitpick: not that much, since these are presumably fixed sized and don't need all the dynamic modification/helper functions.)
let mut stack = self.stack; | ||
|
||
// Remove the key-value pair from the leaf, check if the node is underfull, and then | ||
// promptly forget the leaf and ptr to avoid ownership issues |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the 'leaf and ptr' talking about here exactly?
@huonw I believe I've addressed your latest batch of issues (in "more fixups"). Rebased, and am currently building locally to verify everything is a-okay. |
@huonw Ah, I didn't notice one of your comment batches until now. Addressed. |
As far as adding I believe that @pcwalton is currently working on, or has submitted a PR for, fixing up I suggest we transition this API toward deref after that lands. Thoughts? |
Fine by me. |
b576db6
to
e1957ae
Compare
Replaces BTree with BTreeMap and BTreeSet, which are completely new implementations. BTreeMap's internal Node representation is particularly inefficient at the moment to make this first implementation easy to reason about and fairly safe. Both collections are also currently missing some of the tooling specific to sorted collections, which is planned as future work pending reform of these APIs. General implementation issues are discussed with TODOs internally Perf results on x86_64 Linux: test treemap::bench::find_rand_100 ... bench: 76 ns/iter (+/- 4) test treemap::bench::find_rand_10_000 ... bench: 163 ns/iter (+/- 6) test treemap::bench::find_seq_100 ... bench: 77 ns/iter (+/- 3) test treemap::bench::find_seq_10_000 ... bench: 115 ns/iter (+/- 1) test treemap::bench::insert_rand_100 ... bench: 111 ns/iter (+/- 1) test treemap::bench::insert_rand_10_000 ... bench: 996 ns/iter (+/- 18) test treemap::bench::insert_seq_100 ... bench: 486 ns/iter (+/- 20) test treemap::bench::insert_seq_10_000 ... bench: 800 ns/iter (+/- 15) test btree::map::bench::find_rand_100 ... bench: 74 ns/iter (+/- 4) test btree::map::bench::find_rand_10_000 ... bench: 153 ns/iter (+/- 5) test btree::map::bench::find_seq_100 ... bench: 82 ns/iter (+/- 1) test btree::map::bench::find_seq_10_000 ... bench: 108 ns/iter (+/- 0) test btree::map::bench::insert_rand_100 ... bench: 220 ns/iter (+/- 1) test btree::map::bench::insert_rand_10_000 ... bench: 620 ns/iter (+/- 16) test btree::map::bench::insert_seq_100 ... bench: 411 ns/iter (+/- 12) test btree::map::bench::insert_seq_10_000 ... bench: 534 ns/iter (+/- 14) BTreeMap still has a lot of room for optimization, but it's already beating out TreeMap on most access patterns. [breaking-change]
Eugh, didn't know that stuff wasn't picked up by make check-collections. Fixed. |
Replaces BTree with BTreeMap and BTreeSet, which are completely new implementations. BTreeMap's internal Node representation is particularly inefficient at the moment to make this first implementation easy to reason about and fairly safe. Both collections are also currently missing some of the tooling specific to sorted collections, which is planned as future work pending reform of these APIs. General implementation issues are discussed with TODOs internally [breaking-change] Still waiting on compilation/test/bench stuff locally, but the edit-distance on any errors should be very small at this point. This is ready to be reviewed.
Replaces BTree with BTreeMap and BTreeSet, which are completely new implementations.
BTreeMap's internal Node representation is particularly inefficient at the moment to
make this first implementation easy to reason about and fairly safe. Both collections
are also currently missing some of the tooling specific to sorted collections, which
is planned as future work pending reform of these APIs. General implementation issues
are discussed with TODOs internally
[breaking-change]
Still waiting on compilation/test/bench stuff locally, but the edit-distance on any errors should be very small at this point. This is ready to be reviewed.