aggregate_temporal -> labels clarification #19
Comments
There are multiple potential changes in your text:
The changes will be pushed to the repository in the next hours. |
Thanks, one more question: can intervals overlap? |
Yes, I don't see a reason to restrict it, so I added "Intervals can overlap." to the parameter description. The changes are now available in the documentation: https://open-eo.github.io/openeo-api/v/0.4.0/processreference/#aggregate_temporal |
I agree with using the first timestamp of the interval as the default label, but you'd need to resolve conflicts in case many intervals start at the same value (which is possible when we allow overlap). Why do we need the labels anyway? As far as I know we haven't really defined the 'auxiliary data' about the axes and such, so no need to define labels at the moment? As for overlaps, I don't mind having them, but it means that P.S. (not saying we should do it, just a side-note): In my ideal interface, the aggregation operation might look like this: |
For me the labels or target_axis are relevant in the sense that they allow the backend to derive the new discrete time steps that are available after the aggregation, without actually having to run it. Overlaps: some compositing functions use overlapping time windows. |
It's hard to discuss this without a domain model of data cubes' meta-information... But I agree with your suggestion - given that the time steps are already available from the intervals, additional I also agree that dimension should be preserved across aggregation, but its axis should have different partitioning / sampling specification - we shouldn't automatically convert a regularly-sampled axis into interval-partitioned axis without changing some of its meta-attributes. |
We discussed that for 0.4 we stick with the current proposal, but @jdries may just remove the ability to specify string-only labels, as GeoPySpark can't handle them. Moving the discussion to 0.5 or 1.0. |
Let's revisit this issue: We now have rename_labels. I'm wondering whether we should just remove the labels parameter from aggregate_temporal completely and just handle it as in reduce: We just enumerate the labels by default based on the intervals (i.e. Note to myself: I would need to allow the temporal string types in the |
Note: aggregation assignments to groups are not necessarily unique any longer, see Open-EO/openeo.org@4397fde and #107 (comment). |
Agree to merge and close! |
Currently the definition of the labels parameter in aggregate_temporal is:
One of the most common cases, IMO, is that the labels are again valid timestamps. This way, the time dimension is mapped onto a time dimension, and not onto something entirely new.
Not sure if this should be a hard constraint, but for me it's at least a recommendation. An example could also be provided.
For me, it's also an option to use the first timestamp of each interval as the default label. That way the labels parameter is no longer mandatory, and the function becomes easier to use.
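To illustrate the default-label idea, here is a minimal sketch in Python. It assumes intervals are given as pairs of ISO 8601 date strings; the helper names `default_labels` and `duplicate_labels` are hypothetical and not part of the openEO API. It also shows why overlapping intervals can produce conflicting default labels when two intervals share the same start.

```python
from collections import Counter


def default_labels(intervals):
    """Use the start timestamp of each (start, end) interval as its label."""
    return [start for start, _end in intervals]


def duplicate_labels(labels):
    """Return labels that occur more than once, in first-seen order."""
    counts = Counter(labels)
    return [label for label, n in counts.items() if n > 1]


# Hypothetical example intervals (assumed format: ISO 8601 date strings).
intervals = [
    ["2015-01-01", "2015-02-01"],
    ["2015-01-15", "2015-02-15"],  # overlaps the first interval
    ["2015-01-01", "2015-04-01"],  # same start as the first interval
]

labels = default_labels(intervals)
conflicts = duplicate_labels(labels)

if conflicts:
    # With non-unique default labels, explicit labels would be needed.
    print("Default labels are not unique:", conflicts)
```

With default labels derived this way, the result stays on a time dimension. A caller would only need to pass explicit labels when the derived ones collide.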