Consistent use of 'Zarr format 2 or 3' #2645

Merged 6 commits on Jan 6, 2025
2 changes: 1 addition & 1 deletion docs/index.rst
@@ -25,7 +25,7 @@ Zarr-Python

Zarr-Python is a Python library for reading and writing Zarr groups and arrays. Highlights include:

- * Specification support for both Zarr v2 and v3.
+ * Specification support for both Zarr format 2 and 3.
* Create and read from N-dimensional arrays using NumPy-like semantics.
* Flexible storage enables reading and writing from local, cloud and in-memory stores.
* High performance: Enables fast I/O with support for asynchronous I/O and multi-threading.
6 changes: 3 additions & 3 deletions docs/user-guide/extending.rst
@@ -10,8 +10,8 @@ Custom codecs
-------------

.. note::
- This section explains how custom codecs can be created for Zarr version 3 data. For Zarr
- version 2, codecs should subclass the
+ This section explains how custom codecs can be created for Zarr format 3 arrays. For Zarr
+ format 2, codecs should subclass the
`numcodecs.abc.Codec <https://numcodecs.readthedocs.io/en/stable/abc.html#numcodecs.abc.Codec>`_
base class and register through
`numcodecs.registry.register_codec <https://numcodecs.readthedocs.io/en/stable/registry.html#numcodecs.registry.register_codec>`_.
@@ -66,7 +66,7 @@ strongly recommended to prefix the codec identifier with a unique name. For exam
the codecs from ``numcodecs`` are prefixed with ``numcodecs.``, e.g. ``numcodecs.delta``.

.. note::
- Note that the extension mechanism for the Zarr version 3 is still under development.
+ Note that the extension mechanism for the Zarr format 3 is still under development.
Requirements for custom codecs including the choice of codec identifiers might
change in the future.

2 changes: 1 addition & 1 deletion docs/user-guide/v3_migration.rst
@@ -4,7 +4,7 @@
Zarr-Python 3 represents a major refactor of the Zarr-Python codebase. Some of the
goals motivating this refactor included:

- * adding support for the Zarr V3 specification (along with the Zarr V2 specification)
+ * adding support for the Zarr format 3 specification (along with the Zarr format 2 specification)
* cleaning up internal and user facing APIs
* improving performance (particularly in high latency storage environments like
cloud object stores)
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -380,7 +380,7 @@ filterwarnings = [
"ignore:The loop argument is deprecated since Python 3.8.*:DeprecationWarning",
"ignore:Creating a zarr.buffer.gpu.*:UserWarning",
"ignore:Duplicate name:UserWarning", # from ZipFile
- "ignore:.*is currently not part in the Zarr version 3 specification.*:UserWarning",
+ "ignore:.*is currently not part in the Zarr format 3 specification.*:UserWarning",
]
markers = [
"gpu: mark a test as requiring CuPy and GPU"
22 changes: 11 additions & 11 deletions src/zarr/api/asynchronous.py
@@ -198,7 +198,7 @@ async def consolidate_metadata(

if any(m.zarr_format == 3 for m in members_metadata.values()):
warnings.warn(
- "Consolidated metadata is currently not part in the Zarr version 3 specification. It "
+ "Consolidated metadata is currently not part in the Zarr format 3 specification. It "
"may not be supported by other zarr implementations and may change in the future.",
category=UserWarning,
stacklevel=1,
@@ -770,16 +770,16 @@ async def open_group(
Whether to use consolidated metadata.

By default, consolidated metadata is used if it's present in the
- store (in the ``zarr.json`` for Zarr v3 and in the ``.zmetadata`` file
- for Zarr v2).
+ store (in the ``zarr.json`` for Zarr format 3 and in the ``.zmetadata`` file
+ for Zarr format 2).

To explicitly require consolidated metadata, set ``use_consolidated=True``,
which will raise an exception if consolidated metadata is not found.

To explicitly *not* use consolidated metadata, set ``use_consolidated=False``,
which will fall back to using the regular, non consolidated metadata.

- Zarr v2 allowed configuring the key storing the consolidated metadata
+ Zarr format 2 allowed configuring the key storing the consolidated metadata
(``.zmetadata`` by default). Specify the custom key as ``use_consolidated``
to load consolidated metadata from a non-default key.

@@ -870,21 +870,21 @@ async def create(
Array shape.
chunks : int or tuple of ints, optional
The shape of the array's chunks.
- V2 only. V3 arrays should use `chunk_shape` instead.
+ Zarr format 2 only. Zarr format 3 arrays should use `chunk_shape` instead.
If not specified, default values are guessed based on the shape and dtype.
dtype : str or dtype, optional
NumPy dtype.
chunk_shape : int or tuple of ints, optional
The shape of the Array's chunks (default is None).
- V3 only. V2 arrays should use `chunks` instead.
+ Zarr format 3 only. Zarr format 2 arrays should use `chunks` instead.
chunk_key_encoding : ChunkKeyEncoding, optional
A specification of how the chunk keys are represented in storage.
- V3 only. V2 arrays should use `dimension_separator` instead.
+ Zarr format 3 only. Zarr format 2 arrays should use `dimension_separator` instead.
Default is ``("default", "/")``.
codecs : Sequence of Codecs or dicts, optional
An iterable of Codec or dict serializations of Codecs. The elements of
this collection specify the transformation from array values to stored bytes.
- V3 only. V2 arrays should use ``filters`` and ``compressor`` instead.
+ Zarr format 3 only. Zarr format 2 arrays should use ``filters`` and ``compressor`` instead.

If no codecs are provided, default codecs will be used:

@@ -895,7 +895,7 @@ async def create(
These defaults can be changed by modifying the value of ``array.v3_default_codecs`` in :mod:`zarr.core.config`.
compressor : Codec, optional
Primary compressor to compress chunk data.
- V2 only. V3 arrays should use ``codecs`` instead.
+ Zarr format 2 only. Zarr format 3 arrays should use ``codecs`` instead.

If neither ``compressor`` nor ``filters`` are provided, a default compressor will be used:

@@ -925,7 +925,7 @@ async def create(
for storage of both chunks and metadata.
filters : sequence of Codecs, optional
Sequence of filters to use to encode chunk data prior to compression.
- V2 only. If no ``filters`` are provided, a default set of filters will be used.
+ Zarr format 2 only. If no ``filters`` are provided, a default set of filters will be used.
These defaults can be changed by modifying the value of ``array.v2_default_filters`` in :mod:`zarr.core.config`.
cache_metadata : bool, optional
If True, array configuration metadata will be cached for the
@@ -942,7 +942,7 @@ async def create(
A codec to encode object arrays, only needed if dtype=object.
dimension_separator : {'.', '/'}, optional
Separator placed between the dimensions of a chunk.
- V2 only. V3 arrays should use ``chunk_key_encoding`` instead.
+ Zarr format 2 only. Zarr format 3 arrays should use ``chunk_key_encoding`` instead.
Default is ".".
write_empty_chunks : bool, optional
Deprecated in favor of the ``config`` keyword argument.
38 changes: 19 additions & 19 deletions src/zarr/api/synchronous.py
@@ -502,16 +502,16 @@ def open_group(
Whether to use consolidated metadata.

By default, consolidated metadata is used if it's present in the
- store (in the ``zarr.json`` for Zarr v3 and in the ``.zmetadata`` file
- for Zarr v2).
+ store (in the ``zarr.json`` for Zarr format 3 and in the ``.zmetadata`` file
+ for Zarr format 2).

To explicitly require consolidated metadata, set ``use_consolidated=True``,
which will raise an exception if consolidated metadata is not found.

To explicitly *not* use consolidated metadata, set ``use_consolidated=False``,
which will fall back to using the regular, non consolidated metadata.

- Zarr v2 allowed configuring the key storing the consolidated metadata
+ Zarr format 2 allows configuring the key storing the consolidated metadata
(``.zmetadata`` by default). Specify the custom key as ``use_consolidated``
to load consolidated metadata from a non-default key.

@@ -785,16 +785,16 @@ def create_array(
Iterable of filters to apply to each chunk of the array, in order, before serializing that
chunk to bytes.

- For Zarr v3, a "filter" is a codec that takes an array and returns an array,
+ For Zarr format 3, a "filter" is a codec that takes an array and returns an array,
and these values must be instances of ``ArrayArrayCodec``, or dict representations
of ``ArrayArrayCodec``.
If ``filters`` and ``compressors`` are not specified, then the default codecs for
- Zarr v3 will be used.
+ Zarr format 3 will be used.
These defaults can be changed by modifying the value of ``array.v3_default_codecs``
in :mod:`zarr.core.config`.
Use ``None`` to omit default filters.

- For Zarr v2, a "filter" can be any numcodecs codec; you should ensure that the
+ For Zarr format 2, a "filter" can be any numcodecs codec; you should ensure that the
the order if your filters is consistent with the behavior of each filter.
If no ``filters`` are provided, a default set of filters will be used.
These defaults can be changed by modifying the value of ``array.v2_default_filters``
@@ -804,32 +804,32 @@ def create_array(
List of compressors to apply to the array. Compressors are applied in order, and after any
filters are applied (if any are specified).

- For Zarr v3, a "compressor" is a codec that takes a bytestrea, and
- returns another bytestream. Multiple compressors my be provided for Zarr v3.
+ For Zarr format 3, a "compressor" is a codec that takes a bytestream, and
+ returns another bytestream. Multiple compressors my be provided for Zarr format 3.
If ``filters`` and ``compressors`` are not specified, then the default codecs for
- Zarr v3 will be used.
+ Zarr format 3 will be used.
These defaults can be changed by modifying the value of ``array.v3_default_codecs``
in :mod:`zarr.core.config`.
Use ``None`` to omit default compressors.

- For Zarr v2, a "compressor" can be any numcodecs codec. Only a single compressor may
- be provided for Zarr v2.
+ For Zarr format 2, a "compressor" can be any numcodecs codec. Only a single compressor may
+ be provided for Zarr format 2.
If no ``compressors`` are provided, a default compressor will be used.
These defaults can be changed by modifying the value of ``array.v2_default_compressor``
in :mod:`zarr.core.config`.
Use ``None`` to omit the default compressor.
serializer : dict[str, JSON] | ArrayBytesCodec, optional
Array-to-bytes codec to use for encoding the array data.
- Zarr v3 only. Zarr v2 arrays use implicit array-to-bytes conversion.
+ Zarr format 3 only. Zarr format 2 arrays use implicit array-to-bytes conversion.
If no ``serializer`` is provided, the `zarr.codecs.BytesCodec` codec will be used.
fill_value : Any, optional
Fill value for the array.
order : {"C", "F"}, optional
The memory of the array (default is "C").
- For Zarr v2, this parameter sets the memory order of the array.
- For Zarr v3, this parameter is deprecated, because memory order
- is a runtime parameter for Zarr v3 arrays. The recommended way to specify the memory
- order for Zarr v3 arrays is via the ``config`` parameter, e.g. ``{'config': 'C'}``.
+ For Zarr format 2, this parameter sets the memory order of the array.
+ For Zarr format 3, this parameter is deprecated, because memory order
+ is a runtime parameter for Zarr format 3 arrays. The recommended way to specify the memory
+ order for Zarr format 3 arrays is via the ``config`` parameter, e.g. ``{'config': 'C'}``.
If no ``order`` is provided, a default order will be used.
This default can be changed by modifying the value of ``array.order`` in :mod:`zarr.core.config`.
zarr_format : {2, 3}, optional
@@ -838,11 +838,11 @@ def create_array(
Attributes for the array.
chunk_key_encoding : ChunkKeyEncoding, optional
A specification of how the chunk keys are represented in storage.
- For Zarr v3, the default is ``{"name": "default", "separator": "/"}}``.
- For Zarr v2, the default is ``{"name": "v2", "separator": "."}}``.
+ For Zarr format 3, the default is ``{"name": "default", "separator": "/"}}``.
+ For Zarr format 2, the default is ``{"name": "v2", "separator": "."}}``.
dimension_names : Iterable[str], optional
The names of the dimensions (default is None).
- Zarr v3 only. Zarr v2 arrays should not use this parameter.
+ Zarr format 3 only. Zarr format 2 arrays should not use this parameter.
storage_options : dict, optional
If using an fsspec URL to create the store, these will be passed to the backend implementation.
Ignored otherwise.
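
Reviewer note: as a reading aid for the ``chunk_key_encoding`` defaults quoted in the ``create_array`` docstring above, here is a minimal sketch of how the two default encodings turn chunk grid coordinates into storage keys. It follows the chunk key encoding rules of the Zarr format 3 specification rather than Zarr-Python's own implementation. ``chunk_key`` is a hypothetical helper introduced only for illustration, and the keys it returns are relative to the array's own path in the store.

```python
def chunk_key(coords: tuple[int, ...], name: str, separator: str) -> str:
    """Build a chunk key from grid coordinates (hypothetical helper).

    name="default" is the Zarr format 3 default encoding: the literal
    prefix "c" followed by the coordinates, all joined by the separator.
    name="v2" is the Zarr format 2 style: coordinates joined by the
    separator, with "0" used for zero-dimensional arrays.
    """
    parts = [str(c) for c in coords]
    if name == "default":
        return separator.join(["c", *parts])
    if name == "v2":
        return separator.join(parts) if parts else "0"
    raise ValueError(f"unknown chunk key encoding: {name!r}")


# Defaults as stated in the docstring: ("default", "/") and ("v2", ".")
format3_key = chunk_key((1, 2), "default", "/")
format2_key = chunk_key((1, 2), "v2", ".")
```

With the defaults, the chunk at grid position (1, 2) is stored under ``c/1/2`` for Zarr format 3 and under ``1.2`` for Zarr format 2.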
4 changes: 2 additions & 2 deletions src/zarr/codecs/vlen_utf8.py
@@ -28,7 +28,7 @@
class VLenUTF8Codec(ArrayBytesCodec):
def __init__(self) -> None:
warn(
- "The codec `vlen-utf8` is currently not part in the Zarr version 3 specification. It "
+ "The codec `vlen-utf8` is currently not part in the Zarr format 3 specification. It "
"may not be supported by other zarr implementations and may change in the future.",
category=UserWarning,
stacklevel=2,
@@ -83,7 +83,7 @@ def compute_encoded_size(self, input_byte_length: int, _chunk_spec: ArraySpec) -
class VLenBytesCodec(ArrayBytesCodec):
def __init__(self) -> None:
warn(
- "The codec `vlen-bytes` is currently not part in the Zarr version 3 specification. It "
+ "The codec `vlen-bytes` is currently not part in the Zarr format 3 specification. It "
"may not be supported by other zarr implementations and may change in the future.",
category=UserWarning,
stacklevel=2,
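
Reviewer note: the warning text in the codecs and the ``filterwarnings`` entry in ``pyproject.toml`` have to change together, because the filter's message field is matched as a regular expression against the start of the warning text. Below is a minimal sketch of that coupling using only the standard library ``warnings`` module. It approximates what the pytest configuration does, on the assumption that pytest passes the message field through as a regex. ``is_suppressed`` is a hypothetical helper, not part of Zarr-Python.

```python
import warnings

# Message pattern as it reads after this change (from pyproject.toml).
FILTER_MESSAGE = ".*is currently not part in the Zarr format 3 specification.*"

NEW_MESSAGE = (
    "The codec `vlen-utf8` is currently not part in the Zarr format 3 specification. It "
    "may not be supported by other zarr implementations and may change in the future."
)
# The wording used before this change.
OLD_MESSAGE = NEW_MESSAGE.replace("Zarr format 3", "Zarr version 3")


def is_suppressed(message: str) -> bool:
    """Return True if a UserWarning with this text is dropped by the filter."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # baseline: record every warning
        # Filters added later take precedence, so this "ignore" wins on a match.
        warnings.filterwarnings("ignore", message=FILTER_MESSAGE, category=UserWarning)
        warnings.warn(message, category=UserWarning)
    return len(caught) == 0


new_is_suppressed = is_suppressed(NEW_MESSAGE)
old_is_suppressed = is_suppressed(OLD_MESSAGE)
```

Under these assumptions the updated filter silences the new wording but no longer matches the old one, which is why the ``pyproject.toml`` line is part of this change.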