Add frozen support to roaring64 #688

SLieve · 2025-01-11T16:18:18Z

This PR makes the following changes:

Change the internals of both ART and roaring64 to be based on indices into arrays rather than pointers.
Use uint64_t as the value of the ART rather than a pointer, which allows the ART to self-contain its serialization code.
Add support for frozen (de)serialization. This is done by serializing the arrays of objects and indices into them, which means that during deserialization the calls to memcpy are limited and the bitmap can mostly act as a view into a buffer.
Add some benchmarks using synthetic data for roaring64 / roaring 64 C++ / std::set. These benchmarks focus more on the higher-order bits than the current microbenchmarks, and thus are a better test case for 64-bit bitmaps.

This is a large change so there is lots of room for mistakes, please let me know what can be improved.

lemire

The CI failure (VS-clang) is unrelated. This PR looks good.

Dr-Emann · 2025-01-12T02:24:28Z

include/roaring/roaring64.h

+/**
+ * Creates a readonly bitmap that is a view of the given buffer. The buffer
+ * should be created with `roaring64_bitmap_frozen_serialize()`, and must be
+ * aligned by 64 bytes.


Is 64 byte alignment really required? It seems 8 byte (64 bit) alignment is more likely

As far as I can tell from here, yes.

@SLieve I am not sure that it is required. I think it is preferred (for performance).

include/roaring/roaring64.h

src/art/art.c

This adds the index of the next free node into a newly freed node, or `capacity` if there are no more free indices. This significantly speeds up finding the next free index, which is important for add+remove workloads. Benchmarks Old: ------------------------------------------------------------------ Benchmark Time CPU Iterations ------------------------------------------------------------------ r64InsertRemoveRandom/0 127 ns 127 ns 5461079 r64InsertRemoveRandom/1 31633 ns 31604 ns 24028 r64InsertRemoveRandom/2 30782 ns 30769 ns 21859 r64InsertRemoveRandom/3 31985 ns 31969 ns 21558 r64InsertRemoveRandom/4 356 ns 356 ns 1962694 r64InsertRemoveRandom/5 28972 ns 28962 ns 21366 r64InsertRemoveRandom/6 30632 ns 30623 ns 22682 r64InsertRemoveRandom/7 448 ns 448 ns 1601550 r64InsertRemoveRandom/8 32506 ns 32495 ns 21591 r64InsertRemoveRandom/9 689 ns 689 ns 1002237 cppInsertRemoveRandom/0 131 ns 131 ns 5319673 cppInsertRemoveRandom/1 16106 ns 16104 ns 43632 cppInsertRemoveRandom/2 3881 ns 3881 ns 180087 cppInsertRemoveRandom/3 3582 ns 3582 ns 171298 cppInsertRemoveRandom/4 403 ns 402 ns 1666697 cppInsertRemoveRandom/5 993 ns 993 ns 706038 cppInsertRemoveRandom/6 4039 ns 4038 ns 172421 cppInsertRemoveRandom/7 469 ns 469 ns 1440197 cppInsertRemoveRandom/8 1454 ns 1454 ns 633551 cppInsertRemoveRandom/9 654 ns 654 ns 1091588 setInsertRemoveRandom/0 1944 ns 1943 ns 368926 setInsertRemoveRandom/1 1955 ns 1953 ns 404931 setInsertRemoveRandom/2 1911 ns 1910 ns 358466 setInsertRemoveRandom/3 1953 ns 1951 ns 362351 setInsertRemoveRandom/4 2104 ns 2102 ns 321387 setInsertRemoveRandom/5 1944 ns 1943 ns 354836 setInsertRemoveRandom/6 1835 ns 1835 ns 359099 setInsertRemoveRandom/7 1970 ns 1968 ns 372625 setInsertRemoveRandom/8 1894 ns 1892 ns 355456 setInsertRemoveRandom/9 1659 ns 1659 ns 355902 New: ------------------------------------------------------------------ Benchmark Time CPU Iterations ------------------------------------------------------------------ r64InsertRemoveRandom/0 128 ns 128 ns 5614266 r64InsertRemoveRandom/1 935 ns 935 ns 739679 r64InsertRemoveRandom/2 916 ns 916 ns 739944 r64InsertRemoveRandom/3 936 ns 936 ns 690708 r64InsertRemoveRandom/4 368 ns 368 ns 1957642 r64InsertRemoveRandom/5 1141 ns 1140 ns 592505 r64InsertRemoveRandom/6 1139 ns 1138 ns 657840 r64InsertRemoveRandom/7 481 ns 481 ns 1434967 r64InsertRemoveRandom/8 1447 ns 1446 ns 484463 r64InsertRemoveRandom/9 721 ns 721 ns 1017456 cppInsertRemoveRandom/0 134 ns 134 ns 5524804 cppInsertRemoveRandom/1 15616 ns 15608 ns 47666 cppInsertRemoveRandom/2 3855 ns 3854 ns 180265 cppInsertRemoveRandom/3 3809 ns 3808 ns 183595 cppInsertRemoveRandom/4 412 ns 412 ns 1695708 cppInsertRemoveRandom/5 1012 ns 1011 ns 713501 cppInsertRemoveRandom/6 3410 ns 3409 ns 199214 cppInsertRemoveRandom/7 474 ns 474 ns 1496740 cppInsertRemoveRandom/8 1421 ns 1420 ns 465868 cppInsertRemoveRandom/9 564 ns 564 ns 1148076 setInsertRemoveRandom/0 1956 ns 1956 ns 351283 setInsertRemoveRandom/1 1959 ns 1958 ns 355766 setInsertRemoveRandom/2 1886 ns 1885 ns 357406 setInsertRemoveRandom/3 1905 ns 1904 ns 355235 setInsertRemoveRandom/4 1945 ns 1944 ns 364599 setInsertRemoveRandom/5 1902 ns 1902 ns 350312 setInsertRemoveRandom/6 1907 ns 1906 ns 346962 setInsertRemoveRandom/7 1937 ns 1936 ns 356168 setInsertRemoveRandom/8 1881 ns 1880 ns 341472 setInsertRemoveRandom/9 1962 ns 1961 ns 350643

src/art/art.c

src/roaring64.c

Dr-Emann · 2025-01-14T06:29:45Z

src/art/art.c

+    if (art->root != CROARING_ART_NULL_REF) {
+        art->root = art_move_node_to_shrink(art, art->root);
+        art_shrink_at(art, art->root);
+    }
+    return art_shrink_node_arrays(art);


I don't think this works. I need to think more to verify, but imagine the case where we have (all nodes of size 4):

Capacity: 8
Root: idx 5
Free List: 6 -> 4 -> 8

The root will not be moved (because it's before first_free). At some point, the item at idx 7 will be moved into slot 6. But the item at 5 cannot be moved into free slot 4, because we already visited the root.

This avoids a bug in the following scenario: art->leaves = [2,0,x] art->first_free[leaf_type] = 1 Where `2` and `0` are pointers to the next free index, and `x` is an occupied leaf. In this case, if `art_shrink_to_fit` was called, then we would have the following result: art->leaves = [2,x,0] art->first_free[leaf_type] = 0 This is not fully shrunken, and therefore wrong. Sorting the free indices fixes this scenario. Before `art_shrink_to_fit`: art->leaves = [1,2,x] art->first_free[leaf_type] = 0 After `art_shrink_to_fit`: art->leaves = [x,2,3] art->first_free[leaf_type] = 1

SLieve added 5 commits January 11, 2025 15:35

Array-backed ART

b8aac30

Array-backed r64

04e5eea

ART serialization

ac3d16f

r64 frozen serialization

6c2eb79

Synthetic benchmarks for r64

c669fb0

SLieve requested review from lemire and Dr-Emann January 11, 2025 16:18

lemire approved these changes Jan 11, 2025

View reviewed changes

Dr-Emann reviewed Jan 12, 2025

View reviewed changes

SLieve added 3 commits January 12, 2025 13:47

Address review comments

fcbe947

Add random insert / remove benchmark

9ae9221

SLieve force-pushed the frozen2 branch from 973a643 to 23a8441 Compare January 12, 2025 19:06

Dr-Emann reviewed Jan 14, 2025

View reviewed changes

SLieve marked this pull request as draft January 26, 2025 19:55

SLieve force-pushed the frozen2 branch from 50aecc1 to 8dc1d11 Compare January 26, 2025 20:26

SLieve added 2 commits January 26, 2025 20:33

Minor cleanups to ART and r64 internals

145616c

SLieve force-pushed the frozen2 branch from 8dc1d11 to 145616c Compare January 26, 2025 20:33

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add frozen support to roaring64 #688

Add frozen support to roaring64 #688

SLieve commented Jan 11, 2025

lemire left a comment

Dr-Emann Jan 12, 2025

SLieve Jan 12, 2025

lemire Jan 13, 2025

Dr-Emann Jan 14, 2025 •

edited

Loading

Add frozen support to roaring64 #688

Are you sure you want to change the base?

Add frozen support to roaring64 #688

Conversation

SLieve commented Jan 11, 2025

lemire left a comment

Choose a reason for hiding this comment

Dr-Emann Jan 12, 2025

Choose a reason for hiding this comment

SLieve Jan 12, 2025

Choose a reason for hiding this comment

lemire Jan 13, 2025

Choose a reason for hiding this comment

Dr-Emann Jan 14, 2025 • edited Loading

Choose a reason for hiding this comment

Dr-Emann Jan 14, 2025 •

edited

Loading