-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
minimal profile for backends #56
Conversation
## Collections | ||
A minimal backend shall have at least one raster collection, allowing to build raster data cubes. | ||
|
||
## Processes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have such a list: https://github.com/openEOPlatform/documentation/blob/main/federation/backends/processes.md
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know, but the problem is that ESA asked for something even more minimal, as the core profile for openEO platform is already quite demanding. (Minimal openEO backends would not be able to join the openEO platform federation, then they would indeed hace to support the core profile.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... somewhat makes sense. A couple of questions arise though:
- How was this list compiled? Looking at the "minimal" list is looks like it could be more minimal (e.g. remove normalized_difference? filters?).
- What's the difference between a minimal openEO backend and a backend that wants to join the federation? Why do we specifiy something for a minimal openEO backend here? Thus, I understand this as "minimal openEO Platform backend".
- What's the use of having two lists? One specified a minimal profile, the other one was until now a requirement to be implemented as part of the federation contract. That's contradicting.
- Should we have this in one place and one table?
- Do these lists actually make sense at all? Maybe more as a guideline? A "vector backend" would never be able to fullfill these...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imo it makes no sense to provide and maintain two different lists as platform if we don't do anything with this one here.
As it's not that easy to actually fulfil the platform core profile (as discussed our backend is missing some of them) I would be in favour of a simpler list. Having all processes listed here would of course not mean that you automatically become a platform backend (there are other points to consider that should be clearer defined somewhere).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A profile is indeed mostly a guideline. Basically, anyone wanting to implement a backend will need guidance on which processes to implement first, and then a minimal list like this can be helpful.
If we don't have a 'minimal' list for raster backends, providers can implement anything at all and call themselves openEO compliant.
If we only have the openEO platform list, then the threshold is quite high.
Hence the two lists?
As to the contents of the list, I'm fine with reducing it or adapting in any way.
I specified it here because didn't have another good location, but openEO.org is also a good option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't have a 'minimal' list for raster backends, providers can implement anything at all and call themselves openEO compliant. [...] but openEO.org is also a good option?
Indeed and that is intentional. You can choose what you need or want, openEO doesn't tell you what you have to implement. If you just implement load_collection and save_result you can be openEO compliant and I don't think why we should raise the bar. Guidance is okay, but it shouldn't be required to be openEO compliant. But that's unrelated to openEO Platform, where we can define a profile for becoming part of the federation or so. But we have two lists now and I still don't exactly understand why we need two of them.
If you actually want to define a "minimal" guideline for openEO itself, I'd say propose it in the PSC and/or community meetings. If this is meant for openEO Platform then it's a different story and should live here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, moved the discussion elsewhere so we can indeed have it in the PSC:
https://github.com/Open-EO/openeo-api/issues/490
Co-authored-by: Matthias Mohr <[email protected]>
|
||
https://api.openeo.org/#tag/Capabilities/operation/connect | ||
|
||
[Authentication](https://api.openeo.org/#section/Authentication) is also mandatory, but this profile does not define a specific method. We highlight that openID connect + PKCE is one of the only standardized and secure methods that are available. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PKCE is conceptually only a consideration between the client and identity provider, the back-end has little involvement (at most in specifying the grant_types in the default_clients for providers under /credentials/oidc
). To keep the scope of this minimal profile minimal, it is maybe better to not mention PKCE?
|
||
|
||
## File formats | ||
[GeoTiff](./fileformats.md#geotiff) is the only mandatory output format. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have GeoJSON as mandatory input/output format too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does any of the processes listed actually require this as output format? I don't think so so am wondering why we would list it here as output format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as output format probably not, but filter_spatial
requires it as input format I think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True! If we add input format here it should be included as the list filter_spatial
below as minimal process.
|
||
https://api.openeo.org/#tag/Capabilities | ||
|
||
https://api.openeo.org/#tag/Capabilities/operation/connect |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@m-mohr are these #tag/...
URL stable enough to point to a specific API subset or are there better ways?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As stable as redoc is... Should be good enough for now.
### Data Cubes | ||
|
||
- `apply`: Apply a process to each pixel | ||
- `apply_dimension`: Apply a process to pixels along a dimension |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't reduce_dimension
a better candidate for a minimal profile than apply_dimension
? (especially because there are no UDFs in the minimal profile)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, under Math and logic below there are currently no processes that take an array as input (like mean, min, max), so apply_dimension
nor reduce_dimension
can not be used anyway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, you need reduce_dimension to be able to easily get results for the only supported file format, GeoTiff, in the minimal profile. So reduce_dimension should be included with at least a small number of reducers for composites (min/max/mean?), I'm not sure about apply_dimension.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Batch jobs do allow writing to geotiff without reduce_dimension, it just writes a geotiff per date in that case.
But we can indeed switch to reduce_dimension, not sure why I went for apply_dimension...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, your implementation can do that and it's allowed, indeed. But that's not minimal for me. Minimal, for me, is providing a 3D cube to write a single GeoTiff.
- `not`: Inverting a boolean | ||
- `or`: Logical OR | ||
- `xor`: Logical XOR (exclusive or) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As noted above about apply_dimension
: there is probably need for some minimal array proceses too: array_element, mean, min, max?
## Collections | ||
A minimal backend shall have at least one raster collection, allowing to build raster data cubes. | ||
|
||
## Processes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imo it makes no sense to provide and maintain two different lists as platform if we don't do anything with this one here.
As it's not that easy to actually fulfil the platform core profile (as discussed our backend is missing some of them) I would be in favour of a simpler list. Having all processes listed here would of course not mean that you automatically become a platform backend (there are other points to consider that should be clearer defined somewhere).
- `gt`: Greater than comparison | ||
- `gte`: Greater than or equal to comparison | ||
- `if`: If-Then-Else conditional | ||
- `is_infinite`: Value is an infinite number - **experimental** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would not list experimental processes as required in a minimal_profile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, it is also not implemented yet by any backend afaik. I thought backends would implement it soon when we added it to the other profile as it's a relatively simple process, but that did not happen yet. So I think we can remove it for now (or start implementing it).
|
||
For backends with more than one collection: | ||
- `merge_cubes`: Merge two data cubes | ||
- `resample_cube_spatial`: Resample the spatial dimensions to match a target data cube |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this one is not that minimal. It's from this list the only one we couldn't implemented in our backend. This is certainly due to our approach but maybe also means it's a more complicated to support than others.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
resample_cube_spatial is now implicitly part of merge_cubes and mask as of v2.0.0 so if we have merge_cubes it seems we should also have resample_cube_spatial. But hearing that SH can't implement it(?), I fear that we may actually regret doing Open-EO/openeo-processes#405 - Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dthiex if you do support merge_cubes, but not resample_cube_spatial, is it possible that cubes on sentinelhub or somehow aligned already by default more easily? Or does merge_cubes result in an error in a lot of cases?
In any case, if you remove it from this list, then there's a lot of example process graphs that don't work on a minimal backend.
@@ -0,0 +1,175 @@ | |||
# Minimal profile for openEO backends | |||
This document describes the requirements for a backend that aims to provide minimal functionality for access and processing of gridded | |||
earth observation data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Earth Observation (EO)
|
||
|
||
## File formats | ||
[GeoTiff](./fileformats.md#geotiff) is the only mandatory output format. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does any of the processes listed actually require this as output format? I don't think so so am wondering why we would list it here as output format.
@jdries I assume we currently don't want to merge this and want to wait for the profiles defined within SAP09. Can we close this? I don't think the effort here was useless and once we have the profiles define we can us the description here and just need to updated it with the respective info. |
indeed closing this one, in fact, we agreed to open one in openeo.org |
No description provided.