Merge branch 'main' into doc/omero
giovp committed Jul 6, 2023
2 parents ef4e605 + b7359aa commit bb69bc5
Showing 12 changed files with 194 additions and 86 deletions.
1 change: 1 addition & 0 deletions .github/workflows/review.yml
@@ -21,6 +21,7 @@ jobs:
issue-number: ${{ github.event.pull_request.number }}
body: |
#### Automated Review URLs
* [Readthedocs](https://ngff--${{ github.event.pull_request.number }}.org.readthedocs.build/)
* [render latest/index.bs](http://api.csswg.org/bikeshed/?url=https://raw.githubusercontent.com/ome/ngff/${{ github.event.pull_request.head.sha }}/latest/index.bs)
* [diff latest modified](https://services.w3.org/htmldiff?doc1=https%3A%2F%2Fngff.openmicroscopy.org%2Flatest%2F&doc2=http%3A%2F%2Fapi.csswg.org%2Fbikeshed%2F%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fome%2Fngff%2F${{ github.event.pull_request.head.sha }}%2Flatest%2Findex.bs)
edit-mode: replace
2 changes: 2 additions & 0 deletions README.md
@@ -1,3 +1,5 @@
[![DOI](https://zenodo.org/badge/313652456.svg)](https://zenodo.org/badge/latestdoi/313652456)

# ome-ngff

[Next-generation file format (NGFF) specifications](https://ngff.openmicroscopy.org/latest/) for storing bioimaging data in the cloud.
52 changes: 51 additions & 1 deletion about/index.md
@@ -1,6 +1,56 @@
About
=====

Bioimaging science is at a crossroads. Currently, the drive to acquire more,
larger, and more precise spatial measurements is unfortunately at odds with our
ability to structure and share those measurements with others. Now more than
ever, during a global pandemic, we believe fervently that global, collaborative
discovery, as opposed to the post-publication, "data-on-request" mode of
operation, is the path forward. Bioimaging data should be shareable via open
and commercial cloud resources without the need to download entire datasets.

At the moment, that is not the norm. The plethora of data formats produced by
imaging systems is ill-suited to remote sharing. Individual scientists
typically lack the infrastructure they need to host these data themselves. When
they acquire images from elsewhere, time-consuming translations and data
cleaning are needed to interpret findings. Those same costs are multiplied when
gathering data into online repositories, where curator time can be the limiting
factor before publication is possible. Without a common effort, each lab or
resource is left building the tools it needs and maintaining that
infrastructure, often without dedicated funding.

This document defines a specification for bioimaging data that makes it
possible to convert proprietary formats into a common, cloud-ready one. Such
next-generation file formats lay out data so that individual portions, or
"chunks", of large data are referenceable, eliminating the need to download
entire datasets.
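The chunking idea can be sketched in plain Python: given a chunk shape, only the chunks overlapping a requested region ever need to be fetched. This is an illustrative sketch, not part of the specification; the function name is made up for the example.

```python
from itertools import product
from math import ceil

def chunks_for_region(region, chunk_shape):
    """Return the grid indices of every chunk overlapping `region`.

    `region` is a tuple of (start, stop) pairs, one per dimension;
    `chunk_shape` gives the chunk length along each dimension.
    """
    ranges = [
        range(start // c, ceil(stop / c))
        for (start, stop), c in zip(region, chunk_shape)
    ]
    return list(product(*ranges))

# Reading a 150x50 window from a large 2D image stored in 100x100
# chunks touches only two chunks, not the whole array:
print(chunks_for_region(((200, 350), (0, 50)), (100, 100)))
# → [(2, 0), (3, 0)]
```

A reader (or a cloud store client) then requests exactly those chunk keys and nothing else.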


Why "NGFF"?
-----------

A short description of what is needed for an imaging format is "a hierarchy
of n-dimensional (dense) arrays with metadata". This combination of features
is certainly provided by `HDF5`
from the [HDF Group](https://www.hdfgroup.org), which a number of
bioimaging formats do use. HDF5 and other large binary structures, however,
are ill-suited for storage in the cloud, where accessing individual chunks
of data by name, rather than seeking through a large file, is at the heart of
parallelization.

As a result, a number of formats have been developed more recently which provide
the basic data structure of an HDF5 file, but do so in a more cloud-friendly way.
In the [PyData](https://pydata.org/) community, the Zarr [[zarr]] format was developed
for easily storing collections of [NumPy](https://numpy.org/) arrays. In the
[ImageJ](https://imagej.net/) community, N5 [[n5]] was developed to work around
the limitations of HDF5 ("N5" was originally short for "Not-HDF5").
Both of these formats permit storing individual chunks of data either locally in
separate files or in cloud-based object stores as separate keys.
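As a sketch of that key-based layout: in Zarr v2 a chunk's grid index maps directly to a store key (by default joined with `.`; N5 uses a nested `path/…` style instead). The helper name below is invented for the illustration.

```python
def zarr_v2_chunk_key(chunk_index, dimension_separator="."):
    """Map a chunk's grid index to its store key, e.g. (2, 0) -> "2.0".

    In a directory store the key is a file name; in a cloud object
    store it is simply part of the object's key.
    """
    return dimension_separator.join(str(i) for i in chunk_index)

print(zarr_v2_chunk_key((2, 0)))       # → 2.0
print(zarr_v2_chunk_key((1, 3), "/"))  # → 1/3
```

Because each chunk is an independent file or object, readers can fetch chunks in parallel by name alone, with no seeking inside one large file.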

An [updated Zarr version (v3)](https://zarr-specs.readthedocs.io/)
is underway to unify the two similar specifications to provide a single binary
specification. See this [blog post](https://zarr.dev/blog/zep1-update/) for more information.

In addition to the next-generation file format (NGFF) [specifications](../specifications/index.md),
the pages listed below are intended to provide an overview of external resources available
for working with NGFF data.
@@ -12,7 +12,7 @@ The following pages are intended to provide an overview of the available resourc
* [Publications](../publications/index.md): List of publications referencing OME-NGFF or publishing
datasets in OME-Zarr.

Additionally, notes and recordings of the passt NGFF community calls are available:
Additionally, notes and recordings of the past NGFF community calls are available:

| Call | Date | Presenters | Forum thread | Notes |
|------|------|------------|--------------|-------|
27 changes: 14 additions & 13 deletions data/index.md
@@ -1,19 +1,20 @@
Data Resources
==============

| Catalog | Descriptions | Zarr Files | Size |
| ------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------- | ------------ | ------- |
| [BIA Samples](https://bit.ly/bia-ome-ngff-samples) | Hosting provided by EBI | 90 | 200GB |
| [CZB-Zebrahub](https://zebrahub.ds.czbiohub.org/imaging) | Hosting provided by czbiohub | 5 | 1.2TB |
| [Glencoe](https://glencoesoftware.com/ngff) | Hosting provided by Glencoe Software, Inc. | TBD | TBD |
| [DANDI](https://dandiarchive.org/dandiset/000108) ([identifiers.org][dandi2],[github][dandi3]) | Hosting provided by AWS Open Data Program | 3914 | 355TB |
| [EMBL-HD](https://mobie.github.io/specs/ngff.html) | Hosting provided by EMBL | 21 | TBD |
| [IDR Samples](https://idr.github.io/ome-ngff-samples/) | Hosting provided by EBI | 88 | 3TB |
| [Neural Dynamics](https://registry.opendata.aws/allen-nd-open-data/) | Hosting provided by AWS Open Data Program | 90 | 200TB |
| [Sanger](https://www.sanger.ac.uk/project/ome-zarr/) | Hosting provided by Sanger, UK | 10 | 1TB |
| [SpatialData](https://github.com/scverse/spatialdata-notebooks/tree/main/datasets) | Hosting provided by EMBL | 10 | 25GB |
| [webKnossos](https://zarr.webknossos.org) | Hosting provided by scalableminds GmbH | 69 | 70TB |
| [SSBD](https://ssbd.riken.jp/ssbd-ome-ngff-samples) | Hosting provided by SSBD | 12 | 196GB |
| Catalog | Hosting | Zarr Files | Size |
| ------------------------------------------------------------------------ | -----------------------------------------------------| ------------ | -------- |
| [BIA Samples](https://bit.ly/bia-ome-ngff-samples) | EBI | 90 | 200 GB |
| [Cell Painting Gallery](https://github.com/broadinstitute/cellpainting-gallery) | AWS Open Data Program | 136 | 20 TB |
| [CZB-Zebrahub](https://zebrahub.ds.czbiohub.org/imaging) | czbiohub | 5 | 1.2 TB |
| [DANDI](https://dandiarchive.org/dandiset/000108) ([identifiers.org][dandi2],[github][dandi3]) | AWS Open Data Program | 3914 | 355 TB |
| [Glencoe](https://glencoesoftware.com/ngff) | Glencoe Software, Inc. | 8 | 165 GB |
| [IDR Samples](https://idr.github.io/ome-ngff-samples/) | EBI | 88 | 3 TB |
| [MoBIE](https://mobie.github.io/specs/ngff.html) | EMBL-HD | 21 | 2 TB |
| [Neural Dynamics](https://registry.opendata.aws/allen-nd-open-data/) | AWS Open Data Program | 90 | 200 TB |
| [Sanger](https://www.sanger.ac.uk/project/ome-zarr/) | Sanger, UK | 10 | 1 TB |
| [SpatialData](https://github.com/scverse/spatialdata-notebooks/tree/main/datasets) | EMBL-HD | 10 | 25 GB |
| [SSBD](https://ssbd.riken.jp/ssbd-ome-ngff-samples) | SSBD | 12 | 196 GB |
| [webKnossos](https://zarr.webknossos.org) | scalableminds GmbH | 69 | 70 TB |

[dandi2]: https://identifiers.org/DANDI:000108
[dandi3]: https://github.com/dandisets/000108
11 changes: 8 additions & 3 deletions index.rst
@@ -6,6 +6,14 @@
Next-generation file formats (NGFF)
===================================

OME-NGFF is an imaging format being developed by the bioimaging community to
address issues of scalability and interoperability.
Please see the :doc:`about/index` section for an introduction.
The OME-NGFF specification is detailed under :doc:`specifications/index`.
Various image viewers and other software for working with NGFF data
are listed on the :doc:`tools/index` page.
Sample NGFF datasets provided by the community can be found under :doc:`data/index`.

.. toctree::
:maxdepth: 2

@@ -20,6 +28,3 @@ Next-generation file formats (NGFF)
.. raw:: html

<script type="text/javascript">
window.location.replace('latest/index.html');
</script>
61 changes: 5 additions & 56 deletions latest/index.bs
@@ -26,60 +26,6 @@ Status Text: will be provided between numbered versions. Data written with these
Status Text: (an "editor's draft") will not necessarily be supported.
</pre>

Introduction {#intro}
=====================

Bioimaging science is at a crossroads. Currently, the drive to acquire more,
larger, and more precise spatial measurements is unfortunately at odds with our
ability to structure and share those measurements with others. Now more than
ever, during a global pandemic, we believe fervently that global, collaborative
discovery, as opposed to the post-publication, "data-on-request" mode of
operation, is the path forward. Bioimaging data should be shareable via open
and commercial cloud resources without the need to download entire datasets.

At the moment, that is not the norm. The plethora of data formats produced by
imaging systems is ill-suited to remote sharing. Individual scientists
typically lack the infrastructure they need to host these data themselves. When
they acquire images from elsewhere, time-consuming translations and data
cleaning are needed to interpret findings. Those same costs are multiplied when
gathering data into online repositories, where curator time can be the limiting
factor before publication is possible. Without a common effort, each lab or
resource is left building the tools it needs and maintaining that
infrastructure, often without dedicated funding.

This document defines a specification for bioimaging data that makes it
possible to convert proprietary formats into a common, cloud-ready one. Such
next-generation file formats lay out data so that individual portions, or
"chunks", of large data are referenceable, eliminating the need to download
entire datasets.


Why "<dfn export="true"><abbr title="Next-generation file-format">NGFF</abbr></dfn>"? {#why-ngff}
-------------------------------------------------------------------------------------------------

A short description of what is needed for an imaging format is "a hierarchy
of n-dimensional (dense) arrays with metadata". This combination of features
is certainly provided by <dfn export="true"><abbr title="Hierarchical Data Format 5">HDF5</abbr></dfn>
from the <a href="https://www.hdfgroup.org">HDF Group</a>, which a number of
bioimaging formats do use. HDF5 and other large binary structures, however,
are ill-suited for storage in the cloud, where accessing individual chunks
of data by name, rather than seeking through a large file, is at the heart of
parallelization.

As a result, a number of formats have been developed more recently which provide
the basic data structure of an HDF5 file, but do so in a more cloud-friendly way.
In the [PyData](https://pydata.org/) community, the Zarr [[zarr]] format was developed
for easily storing collections of [NumPy](https://numpy.org/) arrays. In the
[ImageJ](https://imagej.net/) community, N5 [[n5]] was developed to work around
the limitations of HDF5 ("N5" was originally short for "Not-HDF5").
Both of these formats permit storing individual chunks of data either locally in
separate files or in cloud-based object stores as separate keys.

A [current effort](https://zarr-specs.readthedocs.io/en/core-protocol-v3.0-dev/protocol/core/v3.0.html)
is underway to unify the two similar specifications to provide a single binary
specification. The editor's draft will soon be entering a [request for comments (RFC)](https://github.com/zarr-developers/zarr-specs/issues/101) phase with the goal of having a first version early in 2021. As that
process comes to an end, this document will be updated.

OME-NGFF {#ome-ngff}
--------------------

@@ -380,7 +326,7 @@ Each "multiscales" dictionary MAY contain the field "coordinateTransformations",
The transformations MUST follow the same rules about allowed types, order, etc. as in "datasets:coordinateTransformations" and are applied after them.
They can for example be used to specify the `scale` for a dimension that is the same for all resolutions.

Each "multiscales" dictionary SHOULD contain the field "name". It SHOULD contain the field "version", which indicates the version of the multiscale metadata of this image (current version is [NGFFVERSION]).
Each "multiscales" dictionary SHOULD contain the field "name". It MUST contain the field "version", which indicates the version of the multiscale metadata of this image (current version is [NGFFVERSION]).

Each "multiscales" dictionary SHOULD contain the field "type", which gives the type of downscaling method used to generate the multiscale image pyramid.
It SHOULD contain the field "metadata", which contains a dictionary with additional information about the downscaling method.
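To make those rules concrete, here is a minimal sketch of a "multiscales" entry as a Python dictionary. The axis names, scale values, and downscaling metadata are invented for the illustration; the final check mirrors the normative text ("version" is now required, alongside "datasets" and "axes").

```python
multiscale = {
    "version": "0.4",            # MUST be present
    "name": "example",           # SHOULD be present
    "type": "gaussian",          # SHOULD: downscaling method used
    "metadata": {"sigma": 2.0},  # SHOULD: details of that method
    "axes": [
        {"name": "y", "type": "space", "unit": "micrometer"},
        {"name": "x", "type": "space", "unit": "micrometer"},
    ],
    "datasets": [
        {
            "path": "0",
            "coordinateTransformations": [
                {"type": "scale", "scale": [0.5, 0.5]}
            ],
        }
    ],
    # Applied after each dataset's own transformations, e.g. a
    # scale shared by all resolution levels:
    "coordinateTransformations": [
        {"type": "scale", "scale": [1.0, 1.0]}
    ],
}

# Mirrors the updated image.schema "required" list:
assert {"datasets", "axes", "version"} <= multiscale.keys()
```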
@@ -546,7 +492,7 @@ contain only alphanumeric characters, MUST be case-sensitive, and MUST NOT be a
other `name` in the `rows` list. Care SHOULD be taken to avoid collisions on
case-insensitive filesystems (e.g. avoid using both `Aa` and `aA`).

The `plate` dictionary SHOULD contain a `version` key whose value MUST be a string specifying the
The `plate` dictionary MUST contain a `version` key whose value MUST be a string specifying the
version of the plate specification.

The `plate` dictionary MUST contain a `wells` key whose value MUST be a list of JSON objects
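Analogously, a minimal `plate` dictionary consistent with the rules above might look as follows; the row, column, and well values are invented for the example, and the check reflects the version key becoming mandatory.

```python
plate = {
    "version": "0.4",  # now MUST be present (previously SHOULD)
    "columns": [{"name": "1"}],
    "rows": [{"name": "A"}],
    "wells": [
        # One JSON object per well in the plate:
        {"path": "A/1", "rowIndex": 0, "columnIndex": 0}
    ],
}

# Mirrors the updated plate.schema "required" list:
assert {"columns", "rows", "wells", "version"} <= plate.keys()
```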
@@ -646,6 +592,9 @@ Projects which support reading and/or writing OME-NGFF data include:
<dt><strong>[vizarr](https://github.com/hms-dbmi/vizarr/)</strong></dt>
<dd>A minimal, purely client-side program for viewing Zarr-based images with Viv & ImJoy.</dd>

<dt><strong>[ITKIOOMEZarrNGFF](https://github.com/InsightSoftwareConsortium/ITKIOOMEZarrNGFF/)</strong></dt>
<dd>ITK IO for images stored in OME-NGFF format.</dd>

</dl>

<img src="https://downloads.openmicroscopy.org/presentations/2020/Dundee/Workshops/NGFF/zarr_diagram/images/zarr-ome-diagram.png" alt="Diagram of related projects"></img>
2 changes: 1 addition & 1 deletion latest/schemas/image.schema
@@ -44,7 +44,7 @@
}
},
"required": [
"datasets", "axes"
"datasets", "axes", "version"
]
},
"minItems": 1,
2 changes: 1 addition & 1 deletion latest/schemas/plate.schema
@@ -133,7 +133,7 @@
}
},
"required": [
"columns", "rows", "wells"
"columns", "rows", "wells", "version"
]
}
}
41 changes: 39 additions & 2 deletions latest/tests/image_suite.json
@@ -90,7 +90,7 @@
"valid": true
},
{
"formerly": "valid/missing_version.json",
"formerly": "invalid/missing_version.json",
"description": "TBD",
"data": {
"@type": "ngff:Image",
@@ -126,7 +126,7 @@
}
]
},
"valid": true
"valid": false
},
{
"formerly": "valid/invalid_axis_units.json",
@@ -857,6 +857,43 @@
},
"valid": false
},
{
"formerly": "invalid/missing_version.json",
"description": "TBD",
"data": {
"multiscales": [
{
"axes": [
{
"name": "y",
"type": "space",
"unit": "micrometer"
},
{
"name": "x",
"type": "space",
"unit": "micrometer"
}
],
"datasets": [
{
"path": "0",
"coordinateTransformations": [
{
"scale": [
1,
1
],
"type": "scale"
}
]
}
]
}
]
},
"valid": false
},
{
"formerly": "invalid/invalid_axis_type.json",
"description": "TBD",
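Each entry in the test suite above pairs a metadata document with an expected `valid` flag. A runner for such a suite can be sketched as follows; the `is_valid` function here is a stand-in for full JSON-schema validation and checks only the rule this commit adds (every multiscales entry must carry "datasets", "axes", and "version").

```python
def is_valid(doc):
    """Stand-in validator: check only the required-keys rule
    that this commit adds to image.schema."""
    required = {"datasets", "axes", "version"}
    return all(required <= entry.keys()
               for entry in doc.get("multiscales", []))

def run_suite(suite):
    """Return, per test case, whether the validator agrees with
    the case's expected `valid` flag."""
    return [is_valid(case["data"]) == case["valid"] for case in suite]

suite = [
    {"data": {"multiscales": [{"datasets": [], "axes": [],
                               "version": "0.4"}]},
     "valid": True},
    {"data": {"multiscales": [{"datasets": [], "axes": []}]},
     "valid": False},  # missing "version", as in the new case above
]
print(run_suite(suite))  # → [True, True]
```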
