-
Notifications
You must be signed in to change notification settings - Fork 275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add frozen support to roaring64 #688
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CI failure (VS-clang) is unrelated. This PR looks good.
/** | ||
* Creates a readonly bitmap that is a view of the given buffer. The buffer | ||
* should be created with `roaring64_bitmap_frozen_serialize()`, and must be | ||
* aligned by 64 bytes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is 64 byte alignment really required? It seems 8 byte (64 bit) alignment is more likely
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can tell from here, yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SLieve I am not sure that it is required. I think it is preferred (for performance).
This adds the index of the next free node into a newly freed node, or `capacity` if there are no more free indices. This significantly speeds up finding the next free index, which is important for add+remove workloads. Benchmarks Old: ------------------------------------------------------------------ Benchmark Time CPU Iterations ------------------------------------------------------------------ r64InsertRemoveRandom/0 127 ns 127 ns 5461079 r64InsertRemoveRandom/1 31633 ns 31604 ns 24028 r64InsertRemoveRandom/2 30782 ns 30769 ns 21859 r64InsertRemoveRandom/3 31985 ns 31969 ns 21558 r64InsertRemoveRandom/4 356 ns 356 ns 1962694 r64InsertRemoveRandom/5 28972 ns 28962 ns 21366 r64InsertRemoveRandom/6 30632 ns 30623 ns 22682 r64InsertRemoveRandom/7 448 ns 448 ns 1601550 r64InsertRemoveRandom/8 32506 ns 32495 ns 21591 r64InsertRemoveRandom/9 689 ns 689 ns 1002237 cppInsertRemoveRandom/0 131 ns 131 ns 5319673 cppInsertRemoveRandom/1 16106 ns 16104 ns 43632 cppInsertRemoveRandom/2 3881 ns 3881 ns 180087 cppInsertRemoveRandom/3 3582 ns 3582 ns 171298 cppInsertRemoveRandom/4 403 ns 402 ns 1666697 cppInsertRemoveRandom/5 993 ns 993 ns 706038 cppInsertRemoveRandom/6 4039 ns 4038 ns 172421 cppInsertRemoveRandom/7 469 ns 469 ns 1440197 cppInsertRemoveRandom/8 1454 ns 1454 ns 633551 cppInsertRemoveRandom/9 654 ns 654 ns 1091588 setInsertRemoveRandom/0 1944 ns 1943 ns 368926 setInsertRemoveRandom/1 1955 ns 1953 ns 404931 setInsertRemoveRandom/2 1911 ns 1910 ns 358466 setInsertRemoveRandom/3 1953 ns 1951 ns 362351 setInsertRemoveRandom/4 2104 ns 2102 ns 321387 setInsertRemoveRandom/5 1944 ns 1943 ns 354836 setInsertRemoveRandom/6 1835 ns 1835 ns 359099 setInsertRemoveRandom/7 1970 ns 1968 ns 372625 setInsertRemoveRandom/8 1894 ns 1892 ns 355456 setInsertRemoveRandom/9 1659 ns 1659 ns 355902 New: ------------------------------------------------------------------ Benchmark Time CPU Iterations ------------------------------------------------------------------ r64InsertRemoveRandom/0 128 ns 128 ns 5614266 r64InsertRemoveRandom/1 935 ns 935 ns 739679 r64InsertRemoveRandom/2 916 ns 916 ns 739944 r64InsertRemoveRandom/3 936 ns 936 ns 690708 r64InsertRemoveRandom/4 368 ns 368 ns 1957642 r64InsertRemoveRandom/5 1141 ns 1140 ns 592505 r64InsertRemoveRandom/6 1139 ns 1138 ns 657840 r64InsertRemoveRandom/7 481 ns 481 ns 1434967 r64InsertRemoveRandom/8 1447 ns 1446 ns 484463 r64InsertRemoveRandom/9 721 ns 721 ns 1017456 cppInsertRemoveRandom/0 134 ns 134 ns 5524804 cppInsertRemoveRandom/1 15616 ns 15608 ns 47666 cppInsertRemoveRandom/2 3855 ns 3854 ns 180265 cppInsertRemoveRandom/3 3809 ns 3808 ns 183595 cppInsertRemoveRandom/4 412 ns 412 ns 1695708 cppInsertRemoveRandom/5 1012 ns 1011 ns 713501 cppInsertRemoveRandom/6 3410 ns 3409 ns 199214 cppInsertRemoveRandom/7 474 ns 474 ns 1496740 cppInsertRemoveRandom/8 1421 ns 1420 ns 465868 cppInsertRemoveRandom/9 564 ns 564 ns 1148076 setInsertRemoveRandom/0 1956 ns 1956 ns 351283 setInsertRemoveRandom/1 1959 ns 1958 ns 355766 setInsertRemoveRandom/2 1886 ns 1885 ns 357406 setInsertRemoveRandom/3 1905 ns 1904 ns 355235 setInsertRemoveRandom/4 1945 ns 1944 ns 364599 setInsertRemoveRandom/5 1902 ns 1902 ns 350312 setInsertRemoveRandom/6 1907 ns 1906 ns 346962 setInsertRemoveRandom/7 1937 ns 1936 ns 356168 setInsertRemoveRandom/8 1881 ns 1880 ns 341472 setInsertRemoveRandom/9 1962 ns 1961 ns 350643
src/art/art.c
Outdated
if (art->root != CROARING_ART_NULL_REF) { | ||
art->root = art_move_node_to_shrink(art, art->root); | ||
art_shrink_at(art, art->root); | ||
} | ||
return art_shrink_node_arrays(art); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this works. I need to think more to verify, but imagine the case where we have (all nodes of size 4):
Capacity: 8
Root: idx 5
Free List: 6 -> 4 -> 8
The root will not be moved (because it's before first_free
). At some point, the item at idx 7 will be moved into slot 6. But the item at 5 cannot be moved into free slot 4, because we already visited the root.
This avoids a bug in the following scenario: art->leaves = [2,0,x] art->first_free[leaf_type] = 1 Where `2` and `0` are pointers to the next free index, and `x` is an occupied leaf. In this case, if `art_shrink_to_fit` was called, then we would have the following result: art->leaves = [2,x,0] art->first_free[leaf_type] = 0 This is not fully shrunken, and therefore wrong. Sorting the free indices fixes this scenario. Before `art_shrink_to_fit`: art->leaves = [1,2,x] art->first_free[leaf_type] = 0 After `art_shrink_to_fit`: art->leaves = [x,2,3] art->first_free[leaf_type] = 1
This PR makes the following changes:
uint64_t
as the value of the ART rather than a pointer, which allows the ART to self-contain its serialization code.memcpy
are limited and the bitmap can mostly act as a view into a buffer.std::set
. These benchmarks focus more on the higher-order bits than the current microbenchmarks, and thus are a better test case for 64-bit bitmaps.This is a large change so there is lots of room for mistakes, please let me know what can be improved.