Document process differences between back-ends #12

Closed
m-mohr opened this issue Aug 25, 2021 · 7 comments

@m-mohr
Collaborator

m-mohr commented Aug 25, 2021

see https://docs.openeo.cloud/federation/#processes

Is that an aggregator-related task? @soxofaan

@soxofaan
Contributor

soxofaan commented Aug 25, 2021

I already added a discussion of the current implementation of process federation in f23ba9f (related to #11).

Are there more questions to address?

@m-mohr
Collaborator Author

m-mohr commented Aug 25, 2021

Thanks. As this is user documentation, I feel this is a bit too technical (i.e. remove the GET /processes mention? No one will see the request...). Processes are the intersection, I understand, but are the processes all compatible or are there differences in e.g. parameter availability, or are parameters also the intersection? This issue was meant to list common pitfalls users can run into due to back-end differences. I'm not sure what exactly that is and maybe it's already complete, but we should confirm that and can then close this.

@soxofaan
Contributor

> a bit too technical (i.e. remove the GET /processes mention?

Yes, makes sense. I thought it was still OK to go a bit deeper because that page is currently listed under the "advanced" menu item.

> but are the processes all compatible or are there differences in e.g. parameter availability
> ... common pitfalls users can run into due to back-end differences. I'm not sure what exactly that is

To be honest, me neither. I have no idea what the pitfalls or the differences currently are. I haven't really tried different use cases on multiple back-ends (except for some very basic examples).

Of course, it's not that hard to just do a comparison of the process descriptions, but I guess you want to go a bit further and list actual differences that are not described there.
Also, if there are differences, it's probably better to resolve them in some way than to document them here.
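
As a rough illustration of the "comparison of the process descriptions" mentioned above, here is a minimal Python sketch. It assumes each back-end's process list has already been fetched and parsed into a list of dicts with an `id` and a `parameters` array of objects that carry a `name`, which is the shape openEO process descriptions use. The function names and the sample data are made up for this example and say nothing about what any real back-end supports. It only catches differences that are declared in the descriptions, not undocumented behaviour.

```python
from typing import Any


def parameter_names(process: dict[str, Any]) -> set[str]:
    """Return the parameter names declared in one process description."""
    return {param["name"] for param in process.get("parameters", [])}


def diff_shared_processes(
    backend_a: list[dict[str, Any]], backend_b: list[dict[str, Any]]
) -> dict[str, dict[str, list[str]]]:
    """For processes listed by both back-ends, report parameters only one side declares."""
    by_id_a = {p["id"]: p for p in backend_a}
    by_id_b = {p["id"]: p for p in backend_b}
    report = {}
    # Only look at the intersection of process ids.
    for process_id in sorted(by_id_a.keys() & by_id_b.keys()):
        names_a = parameter_names(by_id_a[process_id])
        names_b = parameter_names(by_id_b[process_id])
        if names_a != names_b:
            report[process_id] = {
                "only_a": sorted(names_a - names_b),
                "only_b": sorted(names_b - names_a),
            }
    return report


# Invented sample data, NOT taken from any real back-end.
sample_a = [
    {"id": "load_collection", "parameters": [{"name": "id"}, {"name": "properties"}]},
    {"id": "reduce_dimension", "parameters": [{"name": "data"}]},
]
sample_b = [
    {"id": "load_collection", "parameters": [{"name": "id"}]},
    {"id": "reduce_dimension", "parameters": [{"name": "data"}]},
]
sample_report = diff_shared_processes(sample_a, sample_b)
```

With the invented sample above, `sample_report` flags `load_collection` because only the first list declares `properties`. Comparing parameter schemas (data types, allowed values) would need a deeper diff than this name-level check.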

@m-mohr
Collaborator Author

m-mohr commented Aug 26, 2021

I assume that the processes are correctly documented, i.e. that if something is not supported, it's not listed (e.g. parameters, data types etc). Otherwise it gets difficult to figure out obvious differences by comparing schemas programmatically.

One example that comes immediately to mind is the properties argument of load_collection: I'm pretty sure that what is supported there is not aligned across back-ends, so the differences could be documented. I guess there are more out there, also for EODC of course. Just looking at the open issues for the GeoPySpark driver I could see some, e.g. first/last in reduce_dimension and types of geometries in aggregate_spatial (if any of that is supported at EODC).

Sure, if you can fix the issues before the public launch then you can omit the documentation, but at least from EODC's side it seems time has been an issue recently. Also, having it documented could be a good guide for alignment in general.

By the way @soxofaan, I just assigned you here because I had hoped you could report some obvious things from the aggregator, but I only realized afterwards that the aggregator only provides the intersection of processes. I thought it would be the union and then users would need to figure out what runs where. Anyway, feel free to reassign if someone else would be better suited.
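
To make the intersection versus union distinction concrete, here is a small Python sketch. The back-end names and process sets are invented for illustration, and `summarize_federation` is a hypothetical helper, not part of the aggregator.

```python
def summarize_federation(per_backend: dict[str, set[str]]) -> dict[str, object]:
    """Compare the intersection and the union of process ids across back-ends."""
    id_sets = list(per_backend.values())
    common = set.intersection(*id_sets) if id_sets else set()
    everything = set.union(*id_sets) if id_sets else set()
    # For processes outside the intersection, record which back-ends list them.
    partial = {
        process_id: sorted(name for name, ids in per_backend.items() if process_id in ids)
        for process_id in sorted(everything - common)
    }
    return {
        "intersection": sorted(common),
        "union": sorted(everything),
        "partial": partial,
    }


# Invented sample data, NOT a statement about real back-end support.
summary = summarize_federation(
    {
        "backend_a": {"load_collection", "reduce_dimension", "aggregate_spatial"},
        "backend_b": {"load_collection", "reduce_dimension"},
    }
)
```

With an intersection, users only ever see the first list. With a union they would also need something like the `partial` mapping to know what runs where.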

m-mohr changed the title from "Document process differences between back-ends (or how to figure them out)" to "Document process differences between back-ends" on Aug 26, 2021
@soxofaan
Contributor

soxofaan commented Aug 26, 2021

> Also, having it documented could be a good guide for alignment in general.

I wonder if it's a good idea to work with manually written docs for that. If we fix some issue in any of the underlying back-ends, we will probably completely forget to update this document. A quick alternative could be to use the issue queue of, for example, the aggregator (https://github.com/Open-EO/openeo-aggregator/issues) with a specific label that we can filter on to give a dynamic overview of "federation issues".
Maybe it's even possible to embed that listing on a docs.openeo.cloud page.
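
A minimal sketch of how such a label-filtered listing could be queried, assuming the GitHub REST API "list repository issues" endpoint with its `labels` and `state` query parameters. The label name `federation` is hypothetical (no such label is confirmed in this thread), and the helper names are made up. The sketch only builds the URL and filters an already-parsed response, so it makes no network request itself.

```python
from typing import Any
from urllib.parse import urlencode

GITHUB_API = "https://api.github.com"


def labelled_issues_url(owner: str, repo: str, label: str, state: str = "open") -> str:
    """Build the REST URL that lists a repository's issues carrying a given label."""
    query = urlencode({"labels": label, "state": state})
    return f"{GITHUB_API}/repos/{owner}/{repo}/issues?{query}"


def drop_pull_requests(items: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """The issues endpoint also returns pull requests; keep plain issues only."""
    return [item for item in items if "pull_request" not in item]


# "federation" is a hypothetical label name used for illustration.
federation_url = labelled_issues_url("Open-EO", "openeo-aggregator", "federation")
```

The JSON returned for `federation_url` could then be passed through `drop_pull_requests` and rendered as a list on a docs page.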

@m-mohr
Collaborator Author

m-mohr commented Aug 26, 2021

Sure, fine with that. Although you can also forget them there (because they are usually fixed in eodc-driver/geopyspark-driver and not in the aggregator, I guess). I see the federation aspects doc as one that we need to revisit pretty often anyway, as we will also need to add emerging differences in the future.

And thanks for the last update in 23bc493, that reads much better.

@soxofaan
Contributor

FYI: it's unclear to me what the difference between #11 and #12 is, so I'd propose to close #12 and move to #11 (which has more eyes on it). Feel free to re-open if there is a good reason for that.
