Dataset inspection functionality (#291)
Add dataset inspection capabilities, including the ability to:
1. Get a list of metadata field names
2. Get the type of a metadata field
3. Get a list of tensor names
4. Get the type of a tensor

[ committed by @billschereriii ]
[ reviewed by @MattToast @mellis13 @ashao ]
billschereriii authored Jan 24, 2023
1 parent 3be28e5 commit 36c6c26
Showing 20 changed files with 891 additions and 29 deletions.
3 changes: 3 additions & 0 deletions doc/changelog.rst
Expand Up @@ -12,6 +12,7 @@ This section details changes made in the development branch that have not yet be

Description

- Add support for inspection of tensors and metadata inside datasets
- Add support for user-directed logging for Python clients, using Client, Dataset, or LogContext logging methods
- Add support for user-directed logging for C and Fortran clients without a Client or Dataset context
- Additional error reporting for connections to and commands run against Redis databases
Expand All @@ -25,6 +26,7 @@ Description

Detailed Notes

- Added support for retrieval of names and types of tensors and metadata inside datasets (PR291_)
- Added support for user-directed logging for Python clients via {Client, Dataset, LogContext}.{log_data, log_warning, log_error} methods (PR289_)
- Added support for user-directed logging without a Client or Dataset context to C and Fortran clients via _string() methods (PR288_)
- Added logging to capture transient errors that arise in the _run() and _connect() methods of the Redis and RedisCluster classes (PR287_)
Expand All @@ -40,6 +42,7 @@ Detailed Notes
- Implemented support for Unix Domain Sockets, including refactorization of server address code, test cases, and check-in tests. (PR252_)
- A new make target `make lib-with-fortran` now compiles the Fortran client and dataset into its own library which applications can link against (PR245_)

.. _PR291: https://github.com/CrayLabs/SmartRedis/pull/291
.. _PR289: https://github.com/CrayLabs/SmartRedis/pull/289
.. _PR288: https://github.com/CrayLabs/SmartRedis/pull/288
.. _PR287: https://github.com/CrayLabs/SmartRedis/pull/287
46 changes: 46 additions & 0 deletions include/c_dataset.h
Expand Up @@ -224,6 +224,52 @@ SRError get_meta_strings(void* dataset,
size_t* n_strings,
size_t** lengths);

/*!
* \brief Retrieve the names of tensors in the DataSet
* \param dataset The dataset to use for this operation
* \param data Receives an array of tensor names
* \param n_strings Receives the number of strings returned in \p data
* \param lengths Receives an array containing the lengths of the strings
* returned in \p data
* \return Returns SRNoError on success or an error code on failure
*/
SRError get_tensor_names(
void* dataset, char*** data, size_t* n_strings, size_t** lengths);

/*!
* \brief Retrieve the data type of a Tensor in the DataSet
* \param dataset The dataset to use for this operation
* \param name The name of the tensor (null-terminated string)
* \param name_len The length in bytes of the tensor name
* \param ttype Receives the type for the specified tensor
* \return Returns SRNoError on success or an error code on failure
*/
SRError get_tensor_type(
void* dataset, const char* name, size_t name_len, SRTensorType* ttype);

/*!
* \brief Retrieve the names of all metadata fields in the DataSet
* \param dataset The dataset to use for this operation
* \param data Receives an array of metadata field names
* \param n_strings Receives the number of strings returned in \p data
* \param lengths Receives an array containing the lengths of the strings
* returned in \p data
* \return Returns SRNoError on success or an error code on failure
*/
SRError get_metadata_field_names(
void* dataset, char*** data, size_t* n_strings, size_t** lengths);

/*!
* \brief Retrieve the data type of a metadata field in the DataSet
* \param dataset The dataset to use for this operation
* \param name The name of the metadata field (null-terminated string)
* \param name_len The length in bytes of the metadata field name
* \param mdtype Receives the type for the specified metadata field
* \return Returns SRNoError on success or an error code on failure
*/
SRError get_metadata_field_type(
void* dataset, const char* name, size_t name_len, SRMetaDataType* mdtype);

#ifdef __cplusplus
}
#endif
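
The C declarations above return names through three output parameters: an array of strings, a count, and an array of lengths. A minimal sketch of how a caller might consume that triple follows. It does not link against SmartRedis: `fake_get_names` is a hypothetical stand-in that fills the same three outputs from static storage, and its integer return value only imitates the success/failure convention of `SRError`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for get_tensor_names(): it fills the same three
// output parameters, but from static storage rather than a real DataSet.
static int fake_get_names(char*** data, std::size_t* n_strings,
                          std::size_t** lengths)
{
    static char name_a[] = "pressure";
    static char name_b[] = "velocity";
    static char* names[] = {name_a, name_b};
    static std::size_t lens[] = {8, 8};
    if (data == nullptr || n_strings == nullptr || lengths == nullptr)
        return 1; // imitates a non-success error code
    *data = names;
    *n_strings = 2;
    *lengths = lens;
    return 0; // imitates SRNoError
}

// Copy each (pointer, length) pair into a std::string. Using the explicit
// length means the copy does not depend on null termination.
std::vector<std::string> collect_names()
{
    char** data = nullptr;
    std::size_t n_strings = 0;
    std::size_t* lengths = nullptr;
    std::vector<std::string> out;
    if (fake_get_names(&data, &n_strings, &lengths) != 0)
        return out;
    assert(data != nullptr && lengths != nullptr);
    for (std::size_t i = 0; i < n_strings; i++)
        out.emplace_back(data[i], lengths[i]);
    return out;
}
```

Copying the names into caller-owned strings is a deliberate choice in this sketch: the header comments state that the returned memory is only valid until the DataSet is destroyed.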
73 changes: 60 additions & 13 deletions include/dataset.h
Expand Up @@ -275,6 +275,23 @@ class DataSet : public SRObject
*/
void clear_field(const std::string& field_name);

/*!
* \brief Retrieve the name of the DataSet
* \returns The name of the DataSet
*/
std::string get_name() const { return _dsname; }

/*!
* \brief Change the name for the DataSet
* \param name The name for the DataSet
*/
void set_name(std::string name) {
    if (name.length() > 0)
        _dsname = name;
    else
        throw SRParameterException("Name must be non-zero length");
}

/*!
* \brief Retrieve the names of tensors in the DataSet
* \returns The names of the tensors in the DataSet
Expand All @@ -283,16 +300,53 @@ class DataSet : public SRObject
std::vector<std::string> get_tensor_names();

/*!
* \brief Retrieve the name of the DataSet
* \returns The name of the DataSet
* \brief Retrieve tensor names from the DataSet.
* \details The memory of the data pointer is valid until the
* DataSet is destroyed.
* \param data Receives an array of tensor names
* \param n_strings Receives the number of tensor names
* \param lengths Receives an array of the lengths of the tensor names
* \throw SmartRedis::Exception if tensor name retrieval fails
*/
std::string get_name() const { return _dsname; }
void get_tensor_names(char**& data,
size_t& n_strings,
size_t*& lengths);

/*!
* \brief Change the name for the DataSet
* \param name The name for the DataSet
* \brief Retrieve the data type of a Tensor in the DataSet
* \param name The name of the tensor
* \returns The data type for the tensor
* \throw SmartRedis::Exception if tensor type retrieval fails
*/
SRTensorType get_tensor_type(const std::string& name);

/*!
* \brief Retrieve the names of all metadata fields in the DataSet
* \returns A vector of metadata field names
*/
std::vector<std::string> get_metadata_field_names();

/*!
* \brief Retrieve metadata field names from the DataSet.
* \details The memory of the data pointer is valid until the
* DataSet is destroyed.
* \param data Receives an array of metadata field names
* \param n_strings Receives the number of metadata field names
* \param lengths Receives an array of the lengths of the metadata
* field names
* \throw SmartRedis::Exception if metadata field name retrieval fails
*/
void set_name(std::string name) { _dsname = name; }
void get_metadata_field_names(char**& data,
size_t& n_strings,
size_t*& lengths);

/*!
* \brief Retrieve the data type of a metadata field in the DataSet
* \param name The name of the metadata field
* \returns The data type for the metadata field
* \throw SmartRedis::Exception if metadata field type retrieval fails
*/
SRMetaDataType get_metadata_field_type(const std::string& name);

friend class Client;
friend class PyDataset;
Expand Down Expand Up @@ -335,13 +389,6 @@ class DataSet : public SRObject
*/
const_tensor_iterator tensor_cend();

/*!
* \brief Retrieve the data type of a Tensor in the DataSet
* \param name The name of the tensor
* \returns The data type for the tensor
*/
SRTensorType get_tensor_type(const std::string& name);

/*!
* \brief Returns a vector of std::pair with
* the field name and the field serialization
37 changes: 37 additions & 0 deletions include/metadata.h
Expand Up @@ -233,7 +233,44 @@ class MetaData
std::vector<std::pair<std::string, std::string>>
get_metadata_serialization_map();

/*!
* \brief Retrieve the type of a metadata field
* \param name The name of the field to check
* \throw KeyException if the name is not present
*/
SRMetaDataType get_field_type(const std::string& name);

/*!
* \brief Retrieve a vector of metadata field names
* \param skip_internal Omit internal items (such as .tensor_names)
* from the results
*/
std::vector<std::string> get_field_names(bool skip_internal = false);

/*!
* \brief Get metadata field names using a c-style
* interface
* \details This function allocates memory to
* return a pointer (via pointer reference "data")
* to the user and sets the value of n_strings to
* the number of strings in the field. Memory is also
* allocated to store the length of each string in the
* field, and the provided lengths pointer is pointed
* to this new memory. The memory for the strings and
* string lengths is valid until the MetaData object is
* destroyed.
* \param data A c-ptr pointed to newly allocated memory
* for the names
* \param n_strings The number of names returned
* \param lengths A size_t pointer pointed to newly allocated
* memory that stores the length of each string
* \param skip_internal Omit internal items (such as .tensor_names)
* from the results
*/
void get_field_names(char**& data,
size_t& n_strings,
size_t*& lengths,
bool skip_internal = false);
private:

/*!
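
The `skip_internal` flag on `get_field_names` is documented as omitting internal items such as `.tensor_names`. The sketch below illustrates one way such a filter could behave. It is not the SmartRedis implementation: the function name `filter_field_names` is hypothetical, and the rule that a leading `.` marks a field as internal is an assumption inferred from the `.tensor_names` example.

```cpp
#include <string>
#include <vector>

// Hypothetical filter: return field names, optionally dropping internal ones.
// Assumption: a name is "internal" when it begins with '.', as the
// documented example ".tensor_names" does.
std::vector<std::string> filter_field_names(
    const std::vector<std::string>& all_names, bool skip_internal = false)
{
    std::vector<std::string> out;
    for (const std::string& name : all_names) {
        const bool internal = !name.empty() && name.front() == '.';
        if (skip_internal && internal)
            continue; // caller asked to hide bookkeeping fields
        out.push_back(name);
    }
    return out;
}
```

With the default `skip_internal = false`, every name is returned unchanged, which matches the default shown in the declaration.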
26 changes: 26 additions & 0 deletions include/pydataset.h
Expand Up @@ -131,6 +131,32 @@ class PyDataset : public PySRObject
*/
py::list get_meta_strings(const std::string& name);

/*!
* \brief Retrieve the names of all tensors in the DataSet
* \returns A list of tensor names
*/
py::list get_tensor_names();

/*!
* \brief Retrieve the data type of a Tensor in the DataSet
* \param name The name of the tensor
* \returns The data type for the tensor
*/
std::string get_tensor_type(const std::string& name);

/*!
* \brief Retrieve the names of all metadata fields in the DataSet
* \returns A list of metadata field names
*/
py::list get_metadata_field_names();

/*!
* \brief Retrieve the data type of a metadata field in the DataSet
* \param name The name of the metadata field
* \returns The data type for the metadata field
*/
std::string get_metadata_field_type(const std::string& name);

/*!
* \brief Get the name of the PyDataset
* \returns std::string of the PyDataset name
110 changes: 110 additions & 0 deletions src/c/c_dataset.cpp
Expand Up @@ -337,3 +337,113 @@ SRError get_meta_strings(void* dataset,

return result;
}

// Retrieve the names of tensors in the DataSet
extern "C"
SRError get_tensor_names(
    void* dataset, char*** data, size_t* n_strings, size_t** lengths)
{
    SRError result = SRNoError;
    try
    {
        // Sanity check params
        SR_CHECK_PARAMS(dataset != NULL && data != NULL &&
                        n_strings != NULL && lengths != NULL);

        DataSet* d = reinterpret_cast<DataSet*>(dataset);
        d->get_tensor_names(*data, *n_strings, *lengths);
    }
    catch (const Exception& e) {
        SRSetLastError(e);
        result = e.to_error_code();
    }
    catch (...) {
        SRSetLastError(SRInternalException("Unknown exception occurred"));
        result = SRInternalError;
    }

    return result;
}

// Retrieve the data type of a Tensor in the DataSet
extern "C"
SRError get_tensor_type(
    void* dataset, const char* name, size_t name_len, SRTensorType* ttype)
{
    SRError result = SRNoError;
    try
    {
        // Sanity check params (name is dereferenced below, so check it too)
        SR_CHECK_PARAMS(dataset != NULL && name != NULL && ttype != NULL);

        DataSet* d = reinterpret_cast<DataSet*>(dataset);
        std::string tensor_name(name, name_len);
        SRTensorType tensor_type = d->get_tensor_type(tensor_name);
        *ttype = tensor_type;
    }
    catch (const Exception& e) {
        SRSetLastError(e);
        result = e.to_error_code();
    }
    catch (...) {
        SRSetLastError(SRInternalException("Unknown exception occurred"));
        result = SRInternalError;
    }

    return result;
}

// Retrieve the names of all metadata fields in the DataSet
extern "C"
SRError get_metadata_field_names(
    void* dataset, char*** data, size_t* n_strings, size_t** lengths)
{
    SRError result = SRNoError;
    try
    {
        // Sanity check params
        SR_CHECK_PARAMS(dataset != NULL && data != NULL &&
                        n_strings != NULL && lengths != NULL);

        DataSet* d = reinterpret_cast<DataSet*>(dataset);
        d->get_metadata_field_names(*data, *n_strings, *lengths);
    }
    catch (const Exception& e) {
        SRSetLastError(e);
        result = e.to_error_code();
    }
    catch (...) {
        SRSetLastError(SRInternalException("Unknown exception occurred"));
        result = SRInternalError;
    }

    return result;
}

// Retrieve the data type of a metadata field in the DataSet
extern "C"
SRError get_metadata_field_type(
    void* dataset, const char* name, size_t name_len, SRMetaDataType* mdtype)
{
    SRError result = SRNoError;
    try
    {
        // Sanity check params (name is dereferenced below, so check it too)
        SR_CHECK_PARAMS(dataset != NULL && name != NULL && mdtype != NULL);

        DataSet* d = reinterpret_cast<DataSet*>(dataset);
        std::string mdf_name(name, name_len);
        SRMetaDataType field_type = d->get_metadata_field_type(mdf_name);
        *mdtype = field_type;
    }
    catch (const Exception& e) {
        SRSetLastError(e);
        result = e.to_error_code();
    }
    catch (...) {
        SRSetLastError(SRInternalException("Unknown exception occurred"));
        result = SRInternalError;
    }

    return result;
}
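
Each C wrapper in this file follows the same shape: validate the pointers, call the C++ method inside a `try` block, and convert any exception into an error code so that nothing propagates across the C boundary. The sketch below reproduces that pattern in isolation. Every name in it (`DemoError`, `DemoException`, `lookup_type`, `demo_get_type`, `last_error`) is hypothetical and stands in for the SmartRedis types; it is an illustration of the pattern, not the library's code.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical error codes standing in for SRError values.
enum DemoError { DemoNoError = 0, DemoParameterError = 1, DemoInternalError = 2 };

// Hypothetical "last error" slot standing in for SRSetLastError().
static std::string last_error;

// Hypothetical exception type that knows its own error code.
struct DemoException : std::runtime_error {
    DemoError code;
    DemoException(DemoError c, const std::string& what)
        : std::runtime_error(what), code(c) {}
};

// A C++ function that may throw, standing in for a DataSet method.
static int lookup_type(const std::string& name)
{
    if (name.empty())
        throw DemoException(DemoParameterError, "empty name");
    if (name == "boom")
        throw std::logic_error("unexpected failure");
    return 42;
}

// C-callable wrapper: no exception may escape, so each one becomes a code.
extern "C" DemoError demo_get_type(
    const char* name, std::size_t name_len, int* type_out)
{
    DemoError result = DemoNoError;
    try {
        // Check every pointer that is dereferenced below
        if (name == nullptr || type_out == nullptr)
            throw DemoException(DemoParameterError, "null argument");
        *type_out = lookup_type(std::string(name, name_len));
    }
    catch (const DemoException& e) {
        last_error = e.what();
        result = e.code;
    }
    catch (...) {
        last_error = "Unknown exception occurred";
        result = DemoInternalError;
    }
    return result;
}
```

The catch-all handler is what keeps foreign exception types, such as the `std::logic_error` above, from unwinding into C or Fortran callers.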