Add best practices section, prepare 3.0.0 release #32

Merged (11 commits) on Jan 19, 2025
46 changes: 46 additions & 0 deletions source/best_practices.rst
@@ -0,0 +1,46 @@
When writing schemas in the HDMF Schema Language, including extensions, follow these best practices from the HDMF
development team. They help ensure correct behavior from the HDMF reference API and other APIs that we are aware of
(e.g., MatNWB).

1. Do not create schema that the HDMF reference API does not yet support. See `hdmf_support`_ for details.

2. Define new data types (``data_type_def``) at the root of the schema rather than nested within another data type
definition. Nested type definitions may in some cases lead to errors in HDMF. See `hdmf#511`_ and `hdmf#73`_.

3. Set the ``quantity`` key in the group/dataset spec where the type is included, not in the data type definition.
When a data type is included within another data type via ``data_type_inc`` and the ``quantity`` key is omitted there,
the default value of 1 is used. A ``quantity`` set in the data type definition is therefore meaningless
and confusing.

4. Set the ``name`` key in the group/dataset spec where the type is included, not in the data type definition,
unless you really want to require that all instances of the data type have that name. A mismatch between the name
in the data type definition and the name where the type is included can lead to unexpected behavior in the APIs.

5. Create a new data type when adding attributes/datasets/groups/links to an existing data type. See
`hdmf-schema-language#13`_ for details. Adding attributes/datasets/groups/links to an existing data type using
``data_type_inc`` is only partially supported by the APIs (for example, the validator may not check these added
fields), so this is discouraged until full, tested support is added.

6. When modifying the ``dtype``, ``shape``, or ``quantity`` of a data type via ``data_type_inc``, only restrict the
values relative to their original definitions. This ensures that the data types follow the object-oriented programming
principle of inheritance. For example, if type A has ``dtype: text`` and type B extends type A
(``data_type_def: B, data_type_inc: A``), then type B should not redefine ``dtype`` to be ``int``,
which is incompatible with the ``dtype`` of type A. The same idea holds if type A is included in another type
and a new type is not defined (just ``data_type_inc: A``).
In other words, all child types should be valid against the parent type. See `hdmf#321`_.

7. Avoid list values for the ``value`` and ``default_value`` keys (e.g., ``value: [0, 1, 2]``). They are not fully
supported in the official APIs, so their use is discouraged until full, tested support is added.

8. The names of data types or objects should use only the characters ``a-z``, ``A-Z``, ``0-9``, ``-``, ``_``, and
``.``. This helps ensure consistent behavior in the APIs across different storage backends and operating systems.
For example, Windows does not allow writing a group whose name contains ``:`` to a Zarr backend.
See `pynwb#1421`_ and `hdmf-zarr#219`_ for examples.
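
As a rough illustration of practices 2, 3, 4, 6, and 8 above, a schema file might be laid out as in the sketch below.
All type and object names (``Label``, ``ShortLabel``, ``LabeledContainer``, ``primary_label``) are hypothetical and
chosen only for this example. The sketch is not taken from any real schema.

.. code-block:: yaml

    # Hypothetical example; none of these type names come from a real schema.
    datasets:
    # Practice 2: data types are defined at the root of the schema, not nested.
    # Practices 3 and 4: no ``name`` or ``quantity`` in the type definition.
    - data_type_def: Label
      dtype: text
      doc: A free-text label.
    # Practice 6: the child type only restricts the parent definition.
    # dtype stays text; redefining it as int would be incompatible with Label.
    - data_type_def: ShortLabel
      data_type_inc: Label
      dtype: text
      shape: scalar
      doc: A single-value label.
    groups:
    - data_type_def: LabeledContainer
      doc: A container that holds labels.
      datasets:
      # Practices 3 and 4: name and quantity are set where the type is included.
      # Practice 8: the name uses only a-z, A-Z, 0-9, "-", "_", and ".".
      - data_type_inc: Label
        name: primary_label
        quantity: 1
        doc: The main label of this container.
      - data_type_inc: ShortLabel
        quantity: '*'
        doc: Any number of additional short labels.

Keeping ``name`` and ``quantity`` at the point of inclusion lets the same type be reused elsewhere with a different
name or count, while the type definitions stay free of settings that would be ignored or could conflict.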


.. _hdmf#511: https://github.com/hdmf-dev/hdmf/issues/511
.. _hdmf#73: https://github.com/hdmf-dev/hdmf/issues/73
.. _hdmf-schema-language#13: https://github.com/hdmf-dev/hdmf-schema-language/issues/13
.. _hdmf#321: https://github.com/hdmf-dev/hdmf/issues/321
.. _pynwb#1421: https://github.com/NeurodataWithoutBorders/pynwb/issues/1421
.. _hdmf-zarr#219: https://github.com/hdmf-dev/hdmf-zarr/issues/219
.. _hdmf_support: https://hdmf.readthedocs.io/en/stable/spec_language_support.html
1 change: 1 addition & 0 deletions source/index.rst
@@ -14,5 +14,6 @@ See the full language description, release notes, and credits below.
:caption: Table of Contents

description
best_practices
release_notes
credits
3 changes: 2 additions & 1 deletion source/release_notes.rst
@@ -2,7 +2,7 @@
Release Notes
=============

Version 3.0.0 (Upcoming)
Version 3.0.0 (January 21, 2025)
---------------------------------
* Deprecated the ``default_value`` key for datasets. This key is still supported for attributes.
* Deprecated ``linkable`` key for groups and datasets.
@@ -14,6 +14,7 @@ Version 3.0.0 (Upcoming)
* Updated ``datetime`` specification to allow a date with no time or timezone.
* Changed the meaning of the default shape ``shape: null`` from representing a scalar to representing any shape.
* Added special value for ``shape: scalar`` that represents a scalar.
* Added best practices section to documentation.
* Changed the meaning of ``dtype: int`` from ``int32`` to ``int8`` to allow for arbitrary
precision of numeric data when the minimum precision is not specified. Added
``dtype: uint`` which means uint8.