Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C API attribute handle #3982

Merged
merged 4 commits into from
Aug 1, 2023
Merged

C API attribute handle #3982

merged 4 commits into from
Aug 1, 2023

Conversation

eric-hughes-tiledb
Copy link
Contributor

Change C API attribute handle to new class tiledb_attribute_handle_t. Move all declarations and definitions of attribute-section API functions into tiledb/api/c_api/attribute/. Handle class is a complete facade around Attribute; C API functions do not access such objects directly. Add exhaustive argument checking in a new attribute handle unit test runner.

Rewrite remaining functions within tiledb.cc that use attributes. Supporting this, reworked a few things in serialization and schema evolution involving attribute arguments to use shared_ptr rather than *.

Removed sanity_check for attributes, replacing it with ensure_attribute_is_valid.

Change class Attribute in some small but textually-significant ways: removed some dead code (including a default constructor) and inlined a number of simple functions. More significantly, remove a non-standard Attribute constructor and replace it with one in standard form. Reworked a number of tests that were relying on this constructor.

Add support in class ArraySchema to return shared_ptr to its internal attributes rather than plain pointers. This simplifies the C API function that return the attributes of a schema.

Create testsupport_capi_datatype.h to hold the INVALID_DATATYPE() function for argument testing.

Fix a defect where tiledb_attribute_alloc was not checking for valid datatype and was able to create invalid attributes. Add null argument checks in most attribute-section functions. All but a few were sources of unrecognized defects; those few were checking for nullptr themselves already.


TYPE: IMPROVEMENT
DESC: C API attribute handle

@eric-hughes-tiledb
Copy link
Contributor Author

Note: This branch is based on eh/capi-handle-dimension rather than dev. #3947 should merge before this one.

@eric-hughes-tiledb eric-hughes-tiledb marked this pull request as draft March 21, 2023 19:04
@eric-hughes-tiledb eric-hughes-tiledb force-pushed the eh/capi-handle-attribute branch 2 times, most recently from b8686f5 to 6144329 Compare March 22, 2023 14:23
@eric-hughes-tiledb
Copy link
Contributor Author

Rebased to current dev

@eric-hughes-tiledb eric-hughes-tiledb force-pushed the eh/capi-handle-attribute branch from 6144329 to 5389f05 Compare March 23, 2023 00:38
@eric-hughes-tiledb eric-hughes-tiledb marked this pull request as ready for review March 24, 2023 13:17
@ihnorton ihnorton requested a review from bekadavis9 March 24, 2023 13:23
@eric-hughes-tiledb
Copy link
Contributor Author

The new functions ArraySchema::shared_attribute (in two signatures) are tested in a subsequent PR #4004. We've had inadequate support for inside-the-API testing, and I chose to leave the tests out of this PR and to combine them with some new support classes.

Copy link
Contributor

@bekadavis9 bekadavis9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New Handle looks great.
Just a couple clarification questions, mostly documentation nits

fill_value_validity_ = attr->fill_value_validity_;
order_ = attr->order_;
Attribute::Attribute(const Attribute& attr) {
name_ = attr.name();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't / can't you use an initializer list here?

Copy link
Contributor Author

@eric-hughes-tiledb eric-hughes-tiledb Apr 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Certainly could. I didn't change it in this PR just to minimize the number of changes and avoid analyzing what the old code had been doing.

The result is that we don't need anything but the default copy constructor.

/**
* Returns a constant pointer to the selected attribute (nullptr if it
* does not exist).
*/
const Attribute* attribute(const std::string& name) const;

/**
* Returns a constant pointer to the selected attribute if found. Returns an
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: shared pointer

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

/** It maps each attribute name to the corresponding attribute object.
* Lifespan is maintained by the shared_ptr in attributes_. */
std::unordered_map<std::string, const Attribute*> attribute_map_;
struct attribute_reference {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Add Doxygen comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added one. I was on the fence about whether it was really needed.

* Invariant: The number of entries in `attribute_map_` is the same as the
* number of entries in `attributes_`
*/
std::unordered_map<std::string, attribute_reference> attribute_map_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At face value, I'm not sure I understand the need for this change and/or the addition of attribute_reference

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change supports the addition of shared_attribute by name, which in turns supports the C API function that returns attributes from an array schema. The attribute returned through the API uses a shared_ptr in its handle. Using shared_attribute means that we can copy the pointer rather than copying the attribute object.

We need to retrieve attributes both by name and by index. Using the new structure for the range of the attribute map means we can have one unordered_map structure rather than two of them.

*
* @section DESCRIPTION
*
* This file declares the configuration section of the C API for TileDB.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: attribute

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

tiledb_datatype_t* type) TILEDB_NOEXCEPT;

/**
* Sets the nullability of an attribute.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Retrieves

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

TBH I didn't read the copy-paste from tiledb.h that appears here.

*
* @section DESCRIPTION
*
* This file defines C API functions for the config section.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: attribute

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

)
gather_sources(${SOURCES})

commence(object_library capi_attribute_stub)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be a stub because of its dependency on capi_filter_list_stub (or in general, a dependency on a stub)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because of its dependency on capi_filter_list_stub

This. I put the work in to create the StorageManager stub because it was a cyclic dependency at the bottom of a dependency tree. (context depends on it.) One of the consequences of it is that most of the API object modules must be stub modules. When the stub goes away eventually, we'll unwind all of this and make non-stub versions of everything.

@@ -0,0 +1,49 @@
/**
* @file tiledb/api/c_api_test_support/context/testsupport_capi_context.h
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: testsupport_capi_datatype

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

*
* @section DESCRIPTION
*
* This file defines test support classes for the context section of the C API.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: datatype

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor Author

@eric-hughes-tiledb eric-hughes-tiledb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the close reading.

)
gather_sources(${SOURCES})

commence(object_library capi_attribute_stub)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because of its dependency on capi_filter_list_stub

This. I put the work in to create the StorageManager stub because it was a cyclic dependency at the bottom of a dependency tree. (context depends on it.) One of the consequences of it is that most of the API object modules must be stub modules. When the stub goes away eventually, we'll unwind all of this and make non-stub versions of everything.

*
* @section DESCRIPTION
*
* This file defines C API functions for the config section.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

*
* @section DESCRIPTION
*
* This file declares the configuration section of the C API for TileDB. It
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

* @param attr The TileDB attribute to be created.
* @return `TILEDB_OK` for success and `TILEDB_OOM` or `TILEDB_ERR` for error.
*
* @note The default number of values per cell is 1.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a caller is explicitly setting a value, it doesn't matter what the default is, although the code author might not know.

The default is certainly relevant here, where the API object is first created.

tiledb_filter_list_t* filter_list) TILEDB_NOEXCEPT;

/**
* Sets the number of values per cell for an attribute. If this is not
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's OK to repeat relevant facts in function documentation. Otherwise it moves toward the kind of documentation where programmers have to read everything to get anything done.

@@ -0,0 +1,49 @@
/**
* @file tiledb/api/c_api_test_support/context/testsupport_capi_context.h
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

/**
* Returns a constant pointer to the selected attribute (nullptr if it
* does not exist).
*/
const Attribute* attribute(const std::string& name) const;

/**
* Returns a constant pointer to the selected attribute if found. Returns an
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

/** It maps each attribute name to the corresponding attribute object.
* Lifespan is maintained by the shared_ptr in attributes_. */
std::unordered_map<std::string, const Attribute*> attribute_map_;
struct attribute_reference {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added one. I was on the fence about whether it was really needed.

* Invariant: The number of entries in `attribute_map_` is the same as the
* number of entries in `attributes_`
*/
std::unordered_map<std::string, attribute_reference> attribute_map_;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change supports the addition of shared_attribute by name, which in turns supports the C API function that returns attributes from an array schema. The attribute returned through the API uses a shared_ptr in its handle. Using shared_attribute means that we can copy the pointer rather than copying the attribute object.

We need to retrieve attributes both by name and by index. Using the new structure for the range of the attribute map means we can have one unordered_map structure rather than two of them.

fill_value_validity_ = attr->fill_value_validity_;
order_ = attr->order_;
Attribute::Attribute(const Attribute& attr) {
name_ = attr.name();
Copy link
Contributor Author

@eric-hughes-tiledb eric-hughes-tiledb Apr 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Certainly could. I didn't change it in this PR just to minimize the number of changes and avoid analyzing what the old code had been doing.

The result is that we don't need anything but the default copy constructor.

Copy link
Contributor

@bekadavis9 bekadavis9 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@eric-hughes-tiledb eric-hughes-tiledb force-pushed the eh/capi-handle-attribute branch from 81cbd63 to 76e352a Compare July 24, 2023 16:37
@eric-hughes-tiledb eric-hughes-tiledb force-pushed the eh/capi-handle-attribute branch from 76e352a to 81ccdf7 Compare July 25, 2023 11:20
@eric-hughes-tiledb eric-hughes-tiledb force-pushed the eh/capi-handle-attribute branch from 81ccdf7 to 7ac0d66 Compare July 31, 2023 16:31
Add new experimental attribute header to C API header list.
@ihnorton ihnorton merged commit b146b1d into dev Aug 1, 2023
@ihnorton ihnorton deleted the eh/capi-handle-attribute branch August 1, 2023 03:32
davisp pushed a commit to davisp/TileDB that referenced this pull request Sep 6, 2023
Change C API attribute handle to new `class tiledb_attribute_handle_t`. Move all declarations and definitions of attribute-section API functions into `tiledb/api/c_api/attribute/`. Handle class is a complete facade around `Attribute`; C API functions do not access such objects directly. Add exhaustive argument checking in a new attribute handle unit test runner.

Rewrite remaining functions within `tiledb.cc` that use attributes. Supporting this, reworked a few things in serialization and schema evolution involving attribute arguments to use `shared_ptr` rather than `*`.

Removed `sanity_check` for attributes, replacing it with `ensure_attribute_is_valid`.

Change `class Attribute` in some small but textually-significant ways: removed some dead code (including a default constructor) and inlined a number of simple functions. More significantly, remove a non-standard `Attribute` constructor and replace it with one in standard form. Reworked a number of tests that were relying on this constructor.

Add support in `class ArraySchema` to return `shared_ptr` to its internal attributes rather than plain pointers. This simplifies the C API function that return the attributes of a schema.

Create `testsupport_capi_datatype.h` to hold the `INVALID_DATATYPE()` function for argument testing.

Fix a defect where `tiledb_attribute_alloc` was not checking for valid datatype and was able to create invalid attributes. Add null argument checks in most attribute-section functions. All but a few were sources of unrecognized defects; those few were checking for `nullptr` themselves already.

---
TYPE: IMPROVEMENT
DESC: C API attribute handle
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants