Attributes can be defined as nullable. Nullable attributes require a "validity vector" buffer for both read and write queries, similar to how var-sized attributes require an additional "offsets" buffer. Both fixed and var-sized attributes may be nullable.

Using the C API, attributes must be set nullable before adding them to the schema, e.g.:

```
tiledb_attribute_t* attr;
tiledb_attribute_alloc(ctx, "my_attr", TILEDB_INT32, &attr);
tiledb_attribute_set_nullable(ctx, attr, 1 /* nullable */);

tiledb_array_schema_t* array_schema;
tiledb_array_schema_alloc(ctx, TILEDB_DENSE, &array_schema);
tiledb_array_schema_add_attribute(ctx, array_schema, attr);
```

Write queries require a validity vector (bytemap) for nullable attributes. In the example below, the values "200" and "300" are null. These values may or may not be written to disk; TileDB may treat them as garbage.

```
int32_t buffer[] = {100, 200, 300, 400};
uint64_t buffer_size = sizeof(buffer);
uint8_t buffer_validity[] = {1, 0, 0, 1};
uint64_t buffer_validity_size = sizeof(buffer_validity);
tiledb_query_set_buffer_nullable(
    ctx, query, "my_attr",
    buffer, &buffer_size,
    buffer_validity, &buffer_validity_size);
```

Overview:
- Format version bumped from 6 to 7.
- Validity vector buffers are written to their own tile, separate from the value tile, similar to how offset buffers are written to their own tile.
- Currently, the "validity vector" is a bytemap in all usage (APIs, in-memory, and on-disk). In the future, we would like to store the validity vector as a bitmap in-memory and on-disk, while allowing the user to choose an API that uses either a bitmap or a bytemap.
- A new, internal `ValidityVector` class has been introduced to store the validity vector in-memory. This may seem extraneous because it wraps a simple buffer, but this will change in the future when we support bitmaps.
- Similar to the existing "sm.memory_budget" and "sm.memory_budget_var" config parameters, there is now a "sm.memory_budget_validity" for budgeting the validity vector buffers.
- Similar to offset tiles, validity tiles have their own compressor that is independent of the user-defined attribute filter. I have tentatively chosen RLE compression.
- C/C++ APIs have been added.
- The `QueryBuffer` class has been moved from `misc/query_buffer.h` to `query/query_buffer.h` because it now depends on `query/validity_vector`, which is outside of the `misc` directory.
- Many of the internal classes are now nullable-aware (`Reader`, `Writer`, `Query`, `FilterPipeline`, `Subarray`, `SubarrayPartitioner`).

Co-authored-by: Joe Maley <[email protected]>
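
The overview mentions possibly storing the validity vector as a bitmap rather than a bytemap in the future. As a rough illustration of the difference between the two layouts, here is a minimal sketch in plain C. The helper names (`bitmap_size`, `bytemap_to_bitmap`, `bitmap_is_valid`) and the least-significant-bit-first ordering are assumptions made for this example; they are not part of the TileDB API or of this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: number of bytes needed to hold n validity bits. */
static size_t bitmap_size(size_t n) {
  return (n + 7) / 8;
}

/* Hypothetical helper: pack a bytemap (one byte per cell, nonzero = valid)
 * into a bitmap (one bit per cell). Bit order is an assumption of this
 * sketch: cell i maps to bit (i % 8) of byte (i / 8). */
static void bytemap_to_bitmap(
    const uint8_t* bytemap, size_t n, uint8_t* bitmap) {
  assert(bytemap != NULL && bitmap != NULL);
  memset(bitmap, 0, bitmap_size(n));
  for (size_t i = 0; i < n; ++i) {
    if (bytemap[i] != 0)
      bitmap[i / 8] |= (uint8_t)(1u << (i % 8));
  }
}

/* Hypothetical helper: returns 1 if cell i is valid (non-null), else 0. */
static int bitmap_is_valid(const uint8_t* bitmap, size_t i) {
  return (bitmap[i / 8] >> (i % 8)) & 1u;
}
```

With the bytemap `{1, 0, 0, 1}` from the write example, the four validity bytes would pack into a single byte under this layout, which is the space saving a bitmap representation would offer.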
joe maley and Joe Maley authored Nov 10, 2020
1 parent 0319f02, commit a7fd8d6
Showing 53 changed files with 6,291 additions and 500 deletions.