Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Details on relation between callback input and output values for data cube operations #216

Closed
m-mohr opened this issue Jan 6, 2021 · 7 comments
Milestone

Comments

@m-mohr
Copy link
Member

m-mohr commented Jan 6, 2021

Originates from #215 (comment)

We should check whether we want to give more details in the return values regarding what values are "usually" acceptable. See the discussion for more details.

@m-mohr
Copy link
Member Author

m-mohr commented Apr 12, 2021

@soxofaan Do you think adding something like this to the reduce_dimension's reducer return value would be enough?

Note: In some cases it may not be possible that the data type of the return value is different from the data type of the values in the array data. For example, if data provides you with an array of numbers, you can return a number, but may not be able to return a string.

@m-mohr m-mohr added this to the 1.1.0 milestone Apr 12, 2021
@soxofaan
Copy link
Member

yes something like that is probably enough for now.

Remark about wording though:

In some cases it may not be possible ...
... but may not be able to ...

I think this leaves some question on the table for the user/reader: in which cases is it possible? in which cases not? Is this documented in more detail somewhere? If it doesn't work: is it the "fault" of the user or the backend? Who can/should fix it?

Proposal to make it a bit more concrete and indicate ownership:

Note: it is recommended that the return data type is the same as the date type of the values in the input array data. Not only is this conceptually the most logical usage for reduce_dimension, back-ends might not support data type changes in this context. For example, if data provides you with an array of numbers, you can return a number, but returning a string might be unsupported.

@m-mohr
Copy link
Member Author

m-mohr commented Apr 13, 2021

My intention was to make this as "concise" as possible and leave the full explanation somewhere else (i.e. Data Cube Introduction, links should also be added soon). My explanation felt already too long and yours is now even longer. Ideally, I'd prefer something as short as "Changing the data type may not be supported in all cases, see ..." to not overwhelm in processes, keep them concise and not explain the same thing in 10+ places. Also, I'm not even sure this should be in the processes. The thing is, should a supporting back-end need to remove this or should a non-supporting back-end need to add that to their processes?

@soxofaan
Copy link
Member

yeah good point about being more concise.

short version:

Note: it is recommended that the return data type is the same as the data type of the input array values, as back-ends might not support data type changes in this context.

the thing with

not be supported in all cases,

is that it is not clear what kind of "cases" your refers to: is it about raster cubes versus vector cubes versus raw arrays? Is it about different back-ends? Is it about other operations that are present in my process graph? That's why I would put an explicit reference to "back-end might not support that" in the note, but I suspect you want to keep it a bit more general?

@m-mohr
Copy link
Member Author

m-mohr commented Apr 13, 2021

Yes, the issue is we don't know in advance what may not be supported. It could be unsupported in all cases, but some back-ends may only not support it in certain cases (e.g. you can convert number and boolean, but not strings). I'll try to come up with a PR once we have a full description in the data cube guide.

@m-mohr
Copy link
Member Author

m-mohr commented Apr 13, 2021

ToDo:

  • Decide whether to add a concise warning (i.e. should a supporting back-end need to remove this or should a non-supporting back-end need to add that to their processes?)
  • Add links to the data cube introduction to the related processes

@m-mohr
Copy link
Member Author

m-mohr commented Apr 21, 2021

I've added more links to processes that work on data cubes. The data cube documentation also contains a "note" that describes the relation between callback input and output values.

For now I'll not add warnings to the processes itself until we see in practices that people regularly have issues with such limitations. Until then back-ends can add warnings like the one's proposed here to their process descriptions.

Previously, I also clarified on processes.openeo.org etc that these are just the general specifications and back-ends can derive from it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants