
Re-run batch job using triggers #384

Open
m-mohr opened this issue Apr 23, 2021 · 5 comments
m-mohr commented Apr 23, 2021

There are use cases for which it could be useful to re-run a batch job based on some criteria the user has to define.

Some examples of criteria that could trigger re-running a job:

  • New data is available (for the id and extents specified in load_collection requests) - see Platform UC6
  • Temporal criteria (e.g. daily, weekly, every second Wednesday at 20:00:00, ...) - see Cron Jobs

What other conditions could be useful? I'm trying to get an overview of what could be useful so that I can design the API to be flexible enough for all use cases, without necessarily supporting all of them right from the beginning.

If someone knows a platform, service or other API that already has support for it, I'd be thankful for a hint. It's always better to learn from others than to reinvent the wheel.
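To make the discussion concrete, here is a minimal sketch of what such trigger definitions could look like. All field names (`type`, `collection`, `cron`, the event shapes) are assumptions for discussion, not part of the openEO API:

```python
# Hypothetical trigger definitions for re-running a batch job.
# Nothing here is specified by the openEO API; field names are invented.

new_data_trigger = {
    "type": "new-data",
    # re-run when new items appear for the collection used in load_collection
    "collection": "SENTINEL2_L2A",
}

schedule_trigger = {
    "type": "schedule",
    "cron": "0 20 * * 3",  # every Wednesday at 20:00 (standard cron syntax)
}

def matches(trigger: dict, event: dict) -> bool:
    """Toy dispatch: does an incoming event satisfy a trigger?"""
    if trigger["type"] == "new-data":
        return (event.get("kind") == "item-added"
                and event.get("collection") == trigger["collection"])
    if trigger["type"] == "schedule":
        return event.get("kind") == "tick" and event.get("cron") == trigger["cron"]
    return False
```

A back-end would evaluate `matches` against incoming catalog events or scheduler ticks and re-queue the job on a hit; the point of the sketch is only to show that both example criteria fit one trigger abstraction.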

@m-mohr m-mohr added this to the 1.1.0 milestone Apr 23, 2021
@m-mohr m-mohr self-assigned this Apr 23, 2021

jdries commented Apr 26, 2021

Examples of tools that we use at VITO for this, in production:

  • Apache Airflow
  • Apache NiFi

In the past we also researched other web service orchestration mechanisms. In general, this is quite a broad topic, as it also links to other things. These are some of the features we additionally need in production systems:

  • (error) notifications to email, but also to other systems or online dashboards
  • full tracking of historic runs for the same job
  • triggering of subsequent steps when the job finishes

Especially this last point is important. We found out that a web service like openEO cannot easily trigger other systems that belong to an organization. One reason is the complex authentication/authorization involved, but it also requires accessibility on the networking level. That's why doing the batch job orchestration in an external system makes much more sense from an architectural point of view.

The use case of getting notifications when new products are added is also something that should first be solved in catalog systems like STAC and OpenSearch. Currently, it usually involves searching for products based on the ingestion timestamp.
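That ingestion-timestamp workaround can be sketched as a simple pull loop. `search_catalog` below is a stand-in for a real STAC `/search` request, and the `ingested` field name is an assumption (real catalogs typically expose properties such as `created`):

```python
from datetime import datetime, timezone

# Sketch of the "pull" workaround: detect new products by filtering a catalog
# search on ingestion timestamp. The catalog call is stubbed for illustration.

def find_new_items(search_catalog, collection: str, last_checked: datetime) -> list:
    """Return items ingested after `last_checked`."""
    items = search_catalog(collection)
    return [i for i in items if i["ingested"] > last_checked]

# Stubbed catalog response standing in for a STAC /search request.
def fake_search(collection):
    return [
        {"id": "a", "ingested": datetime(2021, 4, 20, tzinfo=timezone.utc)},
        {"id": "b", "ingested": datetime(2021, 4, 25, tzinfo=timezone.utc)},
    ]

new = find_new_items(fake_search, "SENTINEL2_L2A", datetime(2021, 4, 23, tzinfo=timezone.utc))
```

A scheduler would run this periodically and fire the job re-run whenever `new` is non-empty, which is exactly the polling pattern a catalog-side push notification would replace.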


m-mohr commented Apr 26, 2021

Thanks, @jdries - a lot of good details in here.

  • (error) notifications to email, but also to other systems or online dashboards

Good point. We could easily have a process send e-mails after a successful run (i.e. a notification process running after save_result), but this doesn't work for errors. So we'd need something better, e.g. a trigger on status changes. I think in 0.3 we had WebSocket notifications in the API, but removed them for simplicity.
While e-mail notifications seem easy, I guess for dashboards and other systems we'd need something that is already standardized and can be plugged easily into those systems. But is such a standard available? It's probably not enough to use a well-known protocol such as MQTT; you also need to get the content right. What do the mentioned systems and dashboards understand?
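As a strawman for the "get the content right" question, here is a hedged sketch of what a status-change message could contain. Both the topic scheme and the payload fields are invented for discussion, and the actual publishing step (e.g. via an MQTT client) is deliberately left out:

```python
import json

# Hypothetical job-status-change notification. Topic layout and payload
# fields are not standardized anywhere; they are assumptions for discussion.

def status_notification(job_id: str, old: str, new: str) -> tuple:
    """Build a (topic, payload) pair for a status change; the scheme is invented."""
    topic = f"openeo/jobs/{job_id}/status"
    payload = json.dumps({"job_id": job_id, "previous": old, "current": new})
    return topic, payload

topic, payload = status_notification("job-123", "running", "error")
# A subscriber (dashboard, mailer, ...) would filter on current == "error".
```

Whatever protocol carries it, agreeing on such a payload shape is the interoperability question: a dashboard can only plug in if the fields are predictable.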

  • full tracking of historic runs for the same job

Is this just metadata (i.e. GET /jobs/{id}) or does this include also the processed data (i.e. the results)?

  • triggering of subsequent steps when the job finishes

We can easily do that to run other jobs through the openEO API, but once we need to trigger something external, we are back to what we said above about dashboards etc.
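Within the openEO API itself, chaining could be as simple as polling one job and starting the next. A sketch with stubbed `get_status`/`start_job` callables (neither is an existing client function; the status values mirror the API's batch job states):

```python
import time

# Sketch of chaining batch jobs through the API: poll a job's status and
# start a follow-up job once it finishes. HTTP calls are passed in as stubs.

def chain_jobs(get_status, start_job, first_id: str, next_id: str,
               poll_seconds: float = 0.0, max_polls: int = 100) -> bool:
    """Start `next_id` once `first_id` reports 'finished'; give up on 'error'."""
    for _ in range(max_polls):
        status = get_status(first_id)
        if status == "finished":
            start_job(next_id)
            return True
        if status == "error":
            return False
        time.sleep(poll_seconds)
    return False
```

This works as long as both steps live behind the same openEO API; the hard part discussed above starts when `start_job` would have to reach an external system with its own auth and network boundary.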

> That's why doing the batch job orchestration in an external system makes much more sense from an architectural point of view.

Indeed, but that seems feasible only within the organization itself. It is not (yet?) open to the public, right? If it's meant to be used only internally anyway, it seems out of scope for openEO.

> The use case of getting notifications when new products are added is also something that should first be solved in catalog systems like STAC and OpenSearch. Currently, it usually involves searching for products based on the ingestion timestamp.

I'm not sure whether it is in the scope of these projects to do this, especially as HTTP, as a "pull" protocol, is not the obvious option for push notifications. I've not seen that standardized in any of the OGC specifications (e.g. OGC APIs, OpenSearch), but it seems like we should ask whether they have something like that in the pipeline. Maybe there's also interest in the STAC community, but all efforts are likely to take far longer than our project runtime if we don't step in ourselves (or find an existing solution).

m-mohr commented Apr 26, 2021


jdries commented Apr 26, 2021

OGC did this:
https://www.ogc.org/standards/pubsub
I had to work with it myself in a few testbeds. I believe it was based on SOAP, so quite outdated.

For the original use case (in the platform), I would rather solve this with an off-the-shelf component such as Airflow, or anything else that's easy to set up as a multi-user server. Maybe even do a prototype just for this use case, and then expose it to the broader public if it actually works well.
Another alternative would be that somebody creates a separate microservice that handles this and runs on the federation level, which could then also evolve into something that's officially part of openEO. I just don't see where we would get the person-months to do that (same problem as integrating it in the back-ends).


m-mohr commented Apr 26, 2021

> OGC did this:
> https://www.ogc.org/standards/pubsub
> I had to work with it myself in a few testbeds. I believe it was based on SOAP, so quite outdated.

Yes. It seems a REST interface has been worked on, but was maybe never finished? See https://52north.org/files/sensorweb/agile/2017/workshop/Rieke_REST_PubSub.pdf
I still have to figure out what exactly PubSub specifies (e.g. in addition to MQTT) and how much we'd have to add to make it work for our use case, though.

I've also found opengeospatial/ogcapi-common#231, which seems related.

> For the original use case (in the platform), I would rather solve this with an off-the-shelf component such as Airflow, or anything else that's easy to set up as a multi-user server. Maybe even do a prototype just for this use case, and then expose it to the broader public if it actually works well.
> Another alternative would be that somebody creates a separate microservice that handles this and runs on the federation level, which could then also evolve into something that's officially part of openEO. I just don't see where we would get the person-months to do that (same problem as integrating it in the back-ends).

Here, I'm looking for an interoperable solution for the openEO API, of course. But I understand that we may not have that in the API anytime soon and may rather go with a proprietary approach in openEO Platform for now and evolve from there. That of course makes it harder to manage things through the openEO tooling/clients, but maybe that's not even required and we just expose it as part of the openEO Platform web interface, which doesn't actually touch the openEO API itself.

@m-mohr m-mohr modified the milestones: 1.1.0, 1.2.0 May 5, 2021
@m-mohr m-mohr removed their assignment May 18, 2021
@m-mohr m-mohr modified the milestones: 1.2.0, 1.3.0 Nov 29, 2021
@m-mohr m-mohr removed this from the 1.3.0 milestone Sep 12, 2024