Commit

Refactor common parts of the spec #1203 (#1304)
* Refactor common parts of the spec #1203

* Added more docs for commons
m-mohr authored Aug 6, 2024
1 parent 590e392 commit 5076180
Showing 14 changed files with 839 additions and 821 deletions.
14 changes: 8 additions & 6 deletions CHANGELOG.md
@@ -33,7 +33,9 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
- All JSON Schema `$id` values no longer have `#` at the end.
- Two spatial bounding boxes in a Collection don't make sense and will be reported as invalid by the schema. ([#1243](https://github.com/radiantearth/stac-spec/issues/1243))
- Clarify in descriptions that start_datetime and end_datetime are inclusive bounds ([#1280](https://github.com/radiantearth/stac-spec/issues/1280))
- Moved the STAC structural relations in common metadata spec
- Moved the STAC structural relations into commons
- Moved general descriptions about Assets and Links into commons
- Moved common metadata from the item-spec into commons, but kept the JSON schemas in the item-spec for backward compatibility

### Deprecated

@@ -218,7 +220,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
### Added
- ItemCollection requires `stac_version` field, `stac_extensions` has also been added
- A `description` field has been added to Item assets (also Asset definitions extension)
- Field `mission` to [Common Metadata fields](item-spec/common-metadata.md)
- Field `mission` to [Common Metadata fields](commons/common-metadata.md)
- Extensions:
- [Version Indicators extension](https://github.com/stac-extensions/version/blob/main/README.md), new `version` and `deprecated` fields in STAC Items and Collections
- Data Cube extension can be used in Collections, added new field `description`
@@ -228,7 +230,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
- STAC API:
- Added the [Item and Collection API Version extension](https://github.com/radiantearth/stac-api-spec/tree/master/extensions/version/README.md) to support versioning in the API specification
- Run `npm run serve` or `npm run serve-ext` to quickly render development versions of the OpenAPI spec in the browser
- [Basics](item-spec/common-metadata.md#basics) added to Common Metadata definitions with new `description` field for
- [Basics](commons/common-metadata.md#basics) added to Common Metadata definitions with new `description` field for
Item properties
- New fields to the `link` object to facilitate [pagination support for POST requests](https://github.com/radiantearth/stac-api-spec/tree/master/api-spec.md#paging-extension)
- `data` role, as a suggestion for a common role for data files to be used in case data providers don't come up with their own names and semantics
@@ -240,7 +242,7 @@ Item properties
- Added field `roles` to Item assets (also Asset definitions extension), to be used similarly to Link `rel`
- Updated API yaml to clarify bbox filter should be implemented without brackets. Example: `bbox=160.6,-55.95,-170,-25.89`
- Collection `summaries` merge array fields now
- Several fields have been moved from extensions or item fields to the [Common Metadata fields](item-spec/common-metadata.md):
- Several fields have been moved from extensions or item fields to the [Common Metadata fields](commons/common-metadata.md):
- `eo:platform` / `sar:platform` => `platform`
- `eo:instrument` / `sar:instrument` => `instruments`, also changed from string to array of strings
- `eo:constellation` / `sar:constellation` => `constellation`
@@ -264,7 +266,7 @@ Item properties
- `search` extension renamed to `context` extension. JSON object renamed from `search:metadata` to `context`
- Removed "next" from the search metadata and query parameter, added POST body and headers to the links for paging support
- Query Extension - type restrictions on query predicates are more accurate, which may require additional implementation support
- Item `title` definition moved from core Item fields to [Common Metadata Basics](item-spec/common-metadata.md#basics)
- Item `title` definition moved from core Item fields to [Common Metadata Basics](commons/common-metadata.md#basics)
fields. No change is required for STAC Items.
- `putFeature` can return a `PreconditionFailed` to provide more explicit information when the resource has changed in the server
- [Sort extension](https://github.com/radiantearth/stac-api-spec/tree/master/extensions/sort) now uses "+" and "-" prefixes for GET requests to denote sort order.
@@ -281,7 +283,7 @@ fields. No change is required for STAC Items.
- `gsd` and `accuracy` from `eo:bands` in the [EO extension](https://github.com/stac-extensions/eo/blob/main/README.md)
- `sar:absolute_orbit` and `sar:center_wavelength` fields from the [SAR extension](https://github.com/stac-extensions/sar/blob/main/README.md)
- `data_type` and `unit` from the `sar:bands` object in the [SAR extension](https://github.com/stac-extensions/sar/blob/main/README.md)
- Datetime Range (`dtr`) extension. Use the [Common Metadata fields](item-spec/common-metadata.md) instead
- Datetime Range (`dtr`) extension. Use the [Common Metadata fields](commons/common-metadata.md) instead
- STAC API:
- `next` from the search metadata and query parameter
- In API, removed any mention of using media type `multipart/form-data` and `x-www-form-urlencoded`
3 changes: 3 additions & 0 deletions README.md
@@ -89,6 +89,9 @@ used together, but are designed so each piece is small, self-contained, and reus
In the context of STAC it is most likely a related group of STAC Items that is made available by a data provider.
It includes things like the spatial and temporal extent of the data, the license, keywords, etc.
It enables discovery at a higher level than individual Item objects, providing a simple way to describe sets of data.
- **[Commons](commons/)** describes parts of the specification that are shared across the specifications listed above.
This includes [assets](commons/assets.md), [links](commons/links.md)
and [common metadata](commons/common-metadata.md).
- **[Examples](examples/):** The *[examples/](examples/)* folder contains examples for all three specifications, linked together to form two
complete examples. Each spec and extension links in to highlight particular files that demonstrate key concepts.
- **[Extensions](extensions/README.md)** describe how STAC can use extensions that extend the functionality of the core spec or
27 changes: 14 additions & 13 deletions best-practices.md
@@ -195,8 +195,8 @@ instead of thinking through all the ways providers might have chosen to name it.
In general STAC aims to be oriented around **search**, centered on the core fields that users will want to search on to find
imagery. The core is space and time, but there are often other metadata fields that are useful. While the specification is
flexible enough that providers can fill it with tens or even hundreds of fields of metadata, that is not recommended. If
providers have lots of metadata then that can be linked to in the [Asset Object](item-spec/item-spec.md#asset-object)
(recommended) or in a [Link Object](item-spec/item-spec.md#link-object). There is a lot of metadata that is only of relevance
providers have lots of metadata then that can be linked to in the [Asset Object](commons/assets.md#asset-object)
(recommended) or in a [Link Object](commons/links.md#link-object). There is a lot of metadata that is only of relevance
to loading and processing data, and while STAC does not prohibit providers from putting those type of fields in their items,
it is not recommended. For very large catalogs (hundreds of millions of records),
every additional field that is indexed will cost substantial money, so data providers are advised to just put the fields to be searched in STAC and
@@ -209,7 +209,7 @@ STAC. And it can also be one of the most confusing, especially for data that cov
is straightforward - it is the capture or acquisition time. But often data is processed from a range of captures - drones usually
gather a set of images over an hour and put them into a single image, mosaics combine data from several months, and data cubes
represent slices of data over a range of time. For all these cases the recommended path is to use `start_datetime` and
`end_datetime` fields from [common metadata](item-spec/common-metadata.md#date-and-time-range). The specification does allow one to set the
`end_datetime` fields from [common metadata](commons/common-metadata.md#date-and-time-range). The specification does allow one to set the
`datetime` field to `null`, but it is strongly recommended to populate the single `datetime` field, as that is what many clients
will search on. If it is at all possible to pick a nominal or representative datetime then that should be used. But sometimes that
is not possible, like a data cube that covers a time range from 1900 to 2000. Setting the datetime as 1950 would lead to it not
@@ -258,7 +258,7 @@ not spatial. This use case is not currently supported by STAC, as we are focused
in nature. The [OGC API - Records](https://github.com/opengeospatial/ogcapi-records) is an emerging standard that likely
will be able to handle a wider range of data than STAC. It builds on [OGC API -
Features](https://github.com/opengeospatial/ogcapi-features) just like [STAC API](https://github.com/radiantearth/stac-api-spec/)
does. Using [Collection Assets](collection-spec/collection-spec.md#asset-object) may also provide an option for some
does. Using [Collection Assets](collection-spec/collection-spec.md#assets) may also provide an option for some
use cases.

### Representing Vector Layers in STAC
@@ -277,14 +277,14 @@ Both are compliant with OGC API - Features, adding richer search capabilities to

### Common Use Cases of Additional Fields for Assets

As [described in the Item spec](item-spec/item-spec.md#additional-fields-for-assets), it is possible to use fields typically
As [described in the Item spec](commons/assets.md#additional-fields), it is possible to use fields typically
found in Item properties at the asset level. This mechanism of overriding or providing Item Properties only in the Assets
makes discovery more difficult and should generally be avoided. However, there are some core and extension fields for which
providing them at the Asset level can prove to be very useful for using the data.

- `datetime`: Provide individual timestamp on an Item, in case the Item has a `start_datetime` and `end_datetime`,
but an Asset is for one specific time.
- `gsd` ([Common Metadata](item-spec/common-metadata.md#instrument)): Specify some assets that represent instruments
- `gsd` ([Common Metadata](commons/common-metadata.md#instrument)): Specify some assets that represent instruments
with different spatial resolution than the overall best resolution. Note this should not be used for different
spatial resolutions due to specific processing of assets - look into the [raster
extension](https://github.com/stac-extensions/raster) for that use case.
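
The asset-level override described in this hunk can be illustrated with a short Python sketch. It is not part of the commit. The helper name `effective_field` and the sample Item are hypothetical, and the fallback rule (an asset-level value wins, otherwise the Item property applies) is an assumption about how a client might read such fields.

```python
from typing import Any, Dict


def effective_field(item: Dict[str, Any], asset_key: str, field: str) -> Any:
    """Return the asset-level value of `field` if the asset defines it,
    otherwise fall back to the value in the Item's `properties`.

    Hypothetical helper for illustration; not defined by the STAC spec.
    """
    asset = item.get("assets", {}).get(asset_key, {})
    if field in asset:
        return asset[field]
    return item.get("properties", {}).get(field)


# Made-up Item: the overall best resolution is 10 m, but one asset comes
# from an instrument with a coarser spatial resolution.
item = {
    "type": "Feature",
    "properties": {"gsd": 10},
    "assets": {
        "visual": {"href": "./visual.tif"},
        "thermal": {"href": "./thermal.tif", "gsd": 100},
    },
}

visual_gsd = effective_field(item, "visual", "gsd")    # falls back to the Item
thermal_gsd = effective_field(item, "thermal", "gsd")  # asset-level override
```

Because a lookup like this has to inspect each asset, search on the overridden field is harder, which matches the hunk's advice to use asset-level fields sparingly.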
@@ -368,7 +368,7 @@ it. It is relatively easy to [register](https://www.iana.org/form/media-types) a

### Asset Roles

[Asset roles](item-spec/item-spec.md#asset-roles) are used to describe what each asset is used for. They are particular useful
[Asset roles](commons/assets.md#roles) are used to describe what each asset is used for. They are particular useful
when several assets have the same media type, such as when an Item has a multispectral analytic asset, a 3-band full resolution
visual asset, a down-sampled preview asset, and a cloud mask asset, all stored as Cloud Optimized GeoTIFF (COG) images. It is
recommended to use at least one role for every asset available, and using multiple roles often makes sense. For example you'd use
@@ -783,25 +783,25 @@ while a value of 15 to 40 would tell them that it's oblique imagery, or 0 to 60
a Collection with lots of different look angles.

- Fields that have only one or a handful of values are also great to summarize. Collections with a single satellite may
use a single [`gsd`](item-spec/common-metadata.md#instrument) field in the summary, and it's quite useful for users to know
use a single [`gsd`](commons/common-metadata.md#instrument) field in the summary, and it's quite useful for users to know
that all data is going to be the same resolution. Similarly it's useful to know the names of all the
[`platform` values](item-spec/common-metadata.md#instrument) that are used in the Collection.
[`platform` values](commons/common-metadata.md#instrument) that are used in the Collection.

- It is less useful to summarize fields that have numerous different discrete values that can't easily be represented
in a range. These will mostly be string values, when there aren't just a handful of options. For example if you had a
'location' field that gave 3 levels of administrative region (like 'San Francisco, California, United States') to help people
understand more intuitively where a shot was taken. If your Collection has millions of Items, or even hundreds, you don't want
to include all the different location string values in a summary.

- Fields that consist of arrays are more of a judgement call. For example [`instruments`](item-spec/common-metadata.md#instrument)
- Fields that consist of arrays are more of a judgement call. For example [`instruments`](commons/common-metadata.md#instrument)
is straightforward and recommended, as the elements of the array are a discrete set of options. On the other hand
[`proj:transform`](https://github.com/stac-extensions/projection/blob/main/README.md#projtransform)
makes no sense to summarize, as the union of all the values
in the array are meaningless, as each Item is describing its transform, so combining them would just be a bunch of random numbers.
So if the values contained in the array are independently meaningful (not interconnected) and there aren't hundreds of potential
values then it is likely a good candidate to summarize.

We do highly recommend including a [`bands`](./item-spec/common-metadata.md#bands)
We do highly recommend including a [`bands`](./commons/common-metadata.md#bands)
summary if your Items implement `bands`,
especially if it represents just one satellite or constellation. This should be a union of all the potential bands that you
have in assets. It is ok to only add the summary at the Collection level without putting `bands` at the
@@ -1009,8 +1009,9 @@ When crawling a STAC implementation, one can also make use of the [relation type
) (`rel` field) when following a link. If it is an `item` rel type then the file must be a STAC Item. If it is `child`, `parent` or
`root` then it must be a Catalog or a Collection, though the final determination between the two requires looking at the `type` field
in the Catalog or Collection JSON that it is linked to. Note that there is also a `type` field in STAC Link and Asset objects, but that
is for the Media Type, but there are not specific media types for Catalog and Collection. See the sections on [STAC media
types](catalog-spec/catalog-spec.md#media-types), and [Asset media types](item-spec/item-spec.md#asset-media-type) for more information.
is for the Media Type, but there are not specific media types for Catalog and Collection.
See the sections on [STAC media types](commons/links.md#stac-media-types),
and [Asset media types](commons/assets.md#media-types) for more information.
In versions of STAC prior to 1.0 the process was a bit more complicated, as there was no `type` field for catalogs and collections.
See [this issue comment](https://github.com/radiantearth/stac-spec/issues/889#issuecomment-684529444) for a heuristic that works
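
The crawling rule in the last best-practices.md hunk can be sketched in Python as follows. This is an illustration only, not part of the commit. The function names are hypothetical, and it assumes STAC 1.0 or later, where Items carry `"type": "Feature"` and Catalogs and Collections carry `"type": "Catalog"` or `"type": "Collection"`.

```python
from typing import Any, Dict, Optional, Set


def candidate_types(rel: str) -> Optional[Set[str]]:
    """Map a link relation type to the STAC `type` values it permits.

    Returns None for relation types that say nothing about the target.
    """
    if rel == "item":
        return {"Feature"}
    if rel in {"child", "parent", "root"}:
        return {"Catalog", "Collection"}
    return None


def resolve_type(rel: str, document: Dict[str, Any]) -> Optional[str]:
    """Decide what a linked document is, using `rel` first and the
    document's own `type` field for the final determination.

    Raises ValueError when the document contradicts the relation type.
    """
    allowed = candidate_types(rel)
    if allowed is None:
        return None  # `rel` alone does not determine the STAC type
    actual = document.get("type")
    if actual not in allowed:
        raise ValueError(f"rel={rel!r} does not permit type={actual!r}")
    return actual


child_kind = resolve_type("child", {"type": "Collection", "id": "example"})
item_kind = resolve_type("item", {"type": "Feature", "id": "scene-1"})
```

Note that this `type` field belongs to the fetched JSON document. The `type` field on Link and Asset objects is a media type and is not consulted here.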
2 changes: 1 addition & 1 deletion catalog-spec/README.md
@@ -34,7 +34,7 @@ It includes in depth explanation of the structures and fields.
**Schemas:** The schemas to validate the core Catalog definition are found in the *[json-schema/](json-schema/)* folder.
The primary one is *[catalog.json](json-schema/catalog.json)*.

## Catalog Evolution
## Catalog Evolution

The Catalog specification is maturing, but it is still relatively early days. The core of Catalog has been defined very
narrowly, to just describe a structure that can be followed by people or machines, so most additional functionality will