Re-run batch job using triggers #384
Comments
Examples of tools that we use at VITO for this, in production:
In the past we also researched other web service orchestration mechanisms. In general, this is quite a broad topic, as it also links to other things. These are some of the features we additionally need in production systems:
This last point is especially important. We found that a web service like openEO cannot easily trigger other systems that belong to an organization. One reason is the complex authentication/authorization involved, but it also requires accessibility at the network level. That's why doing the batch job orchestration in an external system makes much more sense from an architectural point of view. The use case of getting notifications when new products are added is also something that should first be solved in catalog systems, like STAC and OpenSearch. Currently, it usually involves searching for products based on the ingestion timestamp.
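The ingestion-timestamp approach can be sketched roughly as below. This is only an illustration, not how any particular catalog works: the `catalog` list stands in for a catalog search result, and the `ingested` field, `new_products`, `poll_once` and the `rerun_job` callback are all hypothetical names.

```python
from datetime import datetime, timezone


def new_products(catalog, last_seen):
    """Return products ingested strictly after `last_seen`, oldest first."""
    fresh = [p for p in catalog if p["ingested"] > last_seen]
    return sorted(fresh, key=lambda p: p["ingested"])


def poll_once(catalog, last_seen, rerun_job):
    """Call `rerun_job` if new products exist and return the new checkpoint."""
    fresh = new_products(catalog, last_seen)
    if not fresh:
        return last_seen  # nothing new, keep the old checkpoint
    rerun_job([p["id"] for p in fresh])
    return fresh[-1]["ingested"]


# Hypothetical in-memory stand-in for a catalog search result.
catalog = [
    {"id": "product-a", "ingested": datetime(2021, 3, 1, tzinfo=timezone.utc)},
    {"id": "product-b", "ingested": datetime(2021, 3, 5, tzinfo=timezone.utc)},
]
triggered = []  # collects the product id lists passed to the callback
checkpoint = datetime(2021, 3, 2, tzinfo=timezone.utc)
checkpoint = poll_once(catalog, checkpoint, triggered.append)
```

An external orchestrator would call `poll_once` on a schedule and persist the returned checkpoint between runs, so each product triggers at most one re-run.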
Thanks, @jdries - a lot of good details in here.
Good point. We could easily have a process send e-mails after a successful run (i.e. a notification process running after save_result), but this doesn't work for errors. So we'd need something better, e.g. a trigger on status change. I think in 0.3 we had the WebSocket notifications in the API, but removed them for the sake of simplicity.
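To illustrate what a trigger on status change could look like, here is a minimal sketch. It is not part of the openEO API: `JobStatusNotifier` and its methods are hypothetical, and the status strings "running", "finished" and "error" are assumed to mirror the batch job status values.

```python
from collections import defaultdict


class JobStatusNotifier:
    """Hypothetical helper: calls handlers whenever a job's status changes."""

    def __init__(self):
        self._handlers = defaultdict(list)  # status -> list of callbacks
        self._status = {}                   # job id -> last known status

    def on(self, status, handler):
        """Register `handler(job_id, old_status, new_status)` for a status."""
        self._handlers[status].append(handler)

    def update(self, job_id, new_status):
        """Record a status; fire handlers only on an actual change."""
        old_status = self._status.get(job_id)
        if old_status == new_status:
            return False  # no transition, so nothing is triggered
        self._status[job_id] = new_status
        for handler in self._handlers[new_status]:
            handler(job_id, old_status, new_status)
        return True


notifier = JobStatusNotifier()
events = []
# The same mechanism covers success and failure, unlike a process
# that only runs after save_result.
notifier.on("finished", lambda job, old, new: events.append(("ok", job)))
notifier.on("error", lambda job, old, new: events.append(("failed", job)))
notifier.update("job-1", "running")
notifier.update("job-1", "error")
```

Because handlers are keyed by the new status, an error notification needs no special handling inside the process graph itself.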
Is this just metadata (i.e.
We can easily do that to run other jobs through the openEO API, but once we need to trigger something external, we are back to what we said above about dashboards etc.
Indeed, but that seems to be only feasible within the organization itself. That is not (yet?) open to the public, right? If it's meant to be only used internally anyway, it seems out of scope for openEO.
I'm not sure whether it is within the scope of these projects to do this, especially as HTTP as a "pull" protocol is not the obvious option for push notifications. I've not seen that standardized in any of the OGC specifications (e.g. OGC APIs, OpenSearch), but it seems like we should ask whether they have something like that in the pipeline. Maybe there's also interest in the STAC community, but all efforts are likely to take far beyond our project runtime if we don't step in ourselves (or find an existing solution).
OGC did this:
For the original use case (in the platform), I would rather solve this with an off-the-shelf component, such as Airflow, or anything else that's easy to set up as a multi-user server. Maybe even do a prototype just for the use case, and then expose it to a broader public if it actually works well.
Yes. It seems a REST interface has been worked on, but it may never have been finished (see https://52north.org/files/sensorweb/agile/2017/workshop/Rieke_REST_PubSub.pdf). I've also found opengeospatial/ogcapi-common#231, which seems related.
Here, I'm looking for an interoperable solution for the openEO API, of course. I understand, though, that we may not have that in the API anytime soon and may instead go with a proprietary approach in openEO Platform for now and then evolve it. That of course makes it harder to manage things through the openEO tooling / clients, but maybe that's not even required and we just expose it as part of the openEO Platform service web interface, which doesn't actually touch the openEO API itself.
There are use cases for which it could be useful to re-run a batch job based on some criteria the user has to define.
Some examples for criteria that could trigger re-running a job:
load_collection
requests) - see Platform UC6
What other conditions could be useful? I'm trying to get an idea of what could be useful so that I can design the API in a way that it's flexible enough for all use cases, but not necessarily supporting all use cases right at the beginning.
If someone knows a platform, service or other API that already has support for it, I'd be thankful for a hint. It's always better to learn from others than to reinvent the wheel.
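One way to keep such a design open-ended is a registry of condition types, so that only a few conditions are supported at first and others can be added later without changing the overall structure. The sketch below is purely illustrative and not a proposal for the actual API: `Trigger`, `checker`, `should_rerun`, the `new_data` type and the event fields are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict

# Registry of condition checkers, keyed by a hypothetical trigger type name.
CHECKERS: Dict[str, Callable[[dict, dict], bool]] = {}


def checker(name):
    """Decorator that registers a condition checker under `name`."""
    def register(func):
        CHECKERS[name] = func
        return func
    return register


@dataclass
class Trigger:
    """A user-defined re-run condition: a type plus free-form parameters."""
    type: str
    params: Dict[str, Any] = field(default_factory=dict)


@checker("new_data")
def _new_data(params, event):
    # Fires when an event reports new data for the configured collection.
    return (event.get("kind") == "new_data"
            and event.get("collection") == params.get("collection"))


def should_rerun(trigger, event):
    """Return True if `event` satisfies `trigger`; reject unknown types."""
    check = CHECKERS.get(trigger.type)
    if check is None:
        raise ValueError(f"unsupported trigger type: {trigger.type}")
    return check(trigger.params, event)


trigger = Trigger("new_data", {"collection": "EXAMPLE_COLLECTION"})
event = {"kind": "new_data", "collection": "EXAMPLE_COLLECTION"}
rerun = should_rerun(trigger, event)
```

A back-end that does not support a given type fails explicitly with an error instead of silently ignoring the trigger, which keeps partial support visible to the user.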