feat: teach ALPArray to store validity only in the encoded array & other minor changes #2053
Conversation
previously: #1951
Benchmarks: random_access
Benchmarks: datafusion (Table of Results)
Benchmarks: Clickbench (Table of Results)
encodings/alp/src/alp/compress.rs
Outdated
let (encoded, exceptional_positions) = T::chunked_encode(values.as_slice::<T>(), exponents);

let encoded_array = PrimitiveArray::new(encoded, values.validity()).into_array();
let exceptional_positions = match values.logical_validity() {
could we add like 1-2 comments in this section just to make it clear what is going on
LMK what you think, I added a comment above this line.
Force-pushed from b5b44e0 to ca17b75
@danking you should make sure you run with RUSTFLAGS='-C target-cpu=native'; we had to axe it from our repo cargo config since it messed up cross-compilation
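For reference, the removed cargo config entry would have looked roughly like this (the exact prior contents of the repo's config are an assumption):

# .cargo/config.toml (illustrative; the removed entry may have differed)
[build]
rustflags = ["-C", "target-cpu=native"]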
Benchmarks: compress (Table of Results)
The patches are now always non-nullable. This required PrimitiveArray::patch to gracefully handle non-nullable patches when the array itself is nullable. I modified the benchmarks to include patch-manipulation time, but note that the test data has no patches, so the benchmarks mostly measure the overhead of `is_valid`. If we had test data where the invalid positions contained exceptional values, I would expect a modest improvement in both compression and decompression time.
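As a rough, self-contained sketch of the trimming idea (not the crate's actual API: the validity mask is simplified to Option<&[bool]> and trim_exceptions is a made-up name):

// Keep only the exceptional positions that are logically valid.
fn trim_exceptions(exceptional_positions: &[u64], validity: Option<&[bool]>) -> Vec<u64> {
    match validity {
        // Non-nullable: every exceptional position needs a patch.
        None => exceptional_positions.to_vec(),
        // Nullable: positions that land on nulls never need patching, so
        // drop them; the surviving patches can then be stored non-nullable.
        Some(mask) => exceptional_positions
            .iter()
            .copied()
            .filter(|&pos| mask[pos as usize])
            .collect(),
    }
}

Once nulls are filtered out, the patch values themselves never need a validity buffer, which is what allows the patches to be stored non-nullable.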
Force-pushed from 0351f19 to e5709ef
encodings/alp/src/alp/mod.rs
Outdated
if let Some(fill_value) =
    Self::first_non_patched_encoded_value(&encoded, &patch_indices)
just use unwrap_or_default() to fall back to zero
the original code also does this: https://github.com/cwida/ALP/blob/main/include/alp/encoder.hpp#L379-L385
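For illustration, a self-contained sketch of the suggestion (first_non_patched here is a stand-in with an assumed shape; the PR's first_non_patched_encoded_value may differ):

// Find the first encoded value whose position is not patched.
fn first_non_patched(encoded: &[i64], patch_indices: &[u64]) -> Option<i64> {
    encoded
        .iter()
        .enumerate()
        .find(|(i, _)| !patch_indices.contains(&(*i as u64)))
        .map(|(_, &v)| v)
}

fn main() {
    let encoded = vec![7, 8, 9];
    let patch_indices = vec![0u64, 1, 2]; // every position is patched
    // unwrap_or_default() falls back to i64::default(), i.e. zero.
    let fill_value = first_non_patched(&encoded, &patch_indices).unwrap_or_default();
    assert_eq!(fill_value, 0);
}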
encoded_output.extend(chunk.iter().map(|v| {
    // Encode, then round-trip decode to detect values that do not survive.
    let encoded = unsafe { T::encode_single_unchecked(*v, exp) };
    let decoded = T::decode_single(encoded, exp);
    // Branchless count of positions that will need a patch.
    let neq = (decoded != *v) as usize;
    chunk_patch_count += neq;
    encoded
}));
This loop structure is actually pretty important for performance.
You can refer to the original code, where this is separated out into two loops:
Loop 1: calculate and materialize the encoded + decoded vectors
Loop 2: find the exceptions by looping through the zipped vectors to find mismatches
https://github.com/cwida/ALP/blob/main/include/alp/encoder.hpp#L338-L346
https://github.com/cwida/ALP/blob/main/include/alp/encoder.hpp#L370-L376
I'm not sure I personally find the new version on the right more readable, but I want to make sure we keep a similar structure that the branch predictor likes.
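A hedged sketch of that two-loop shape, with plain f64 helpers standing in for T::encode_single_unchecked / T::decode_single (the real ALP encoding derives an exponent/factor pair per chunk and is more involved):

fn encode_single(v: f64, factor: f64) -> i64 {
    (v * factor).round() as i64
}

fn decode_single(enc: i64, factor: f64) -> f64 {
    enc as f64 / factor
}

fn encode_chunk(chunk: &[f64], factor: f64) -> (Vec<i64>, usize) {
    // Loop 1: materialize the encoded and round-tripped vectors.
    let encoded: Vec<i64> = chunk.iter().map(|&v| encode_single(v, factor)).collect();
    let decoded: Vec<f64> = encoded.iter().map(|&e| decode_single(e, factor)).collect();

    // Loop 2: find the exceptions by zipping the two vectors and counting
    // mismatches. Keeping the comparison out of loop 1 keeps that loop
    // branch-free, which is the structure the branch predictor likes.
    let patch_count = decoded.iter().zip(chunk).filter(|(d, v)| d != v).count();

    (encoded, patch_count)
}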
Force-pushed from e0e2528 to c2580f0
The original change of this PR trims invalid values from the patches and makes the patches' validity either AllValid (for nullable arrays) or NonNullable.
Benchmarks on the latest commit:
The parameter is (number of elements, fraction patched, fraction valid).
Any ratio greater than 1.1 or less than 0.9 is marked with ***.
Benchmarks before reverting to develop's chunking code
[1] This PR seems to perform about the same, except when compressing really large f64 arrays. The PR that introduced chunking, #924, reported substantially larger time reductions (~5 ms of 29 ms) than this increase of ~1 ms (of 17 ms).