-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor multi-model statistics code to facilitate ensemble stats and lazy evaluation #949
Conversation
Separate the ESMValCore internals, dealing with products, from the core function operating on cubes. This makes it easier to add a new preprocessor for ensemble statistics, and also to make a new core function (lazily) compute statistics across cubes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work @Peter9192 , all tests are passing, so it looks good to me! Separating how products are processed from cubes will make the code much easier to work with.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cheers @Peter9192 for this! I am still to read the changes to the documentation, the functionality seems to be nicely restructured, but am not quite happy with the docstring changes - they have become too technical and I don't think this is a good idea, given who we cater for 👍 EDIT I have now gone through the documentation changes and I like'em, job well done there - but still confused as to why multimodel
-> multi-model
🍺
Thanks @valeriupredoi I'll look into your suggestions! I appreciate the point about the audience for the docstrings. |
return statistics_products | ||
|
||
|
||
def multi_model_statistics(products: set, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this function is part of the public API. Changing it so it no longer accepts cubes is a breaking change that will probably break diagnostic code. Would it be possible to make it work again that if you put cubes in, you get cubes out? And of course the docstring needs to be good as @valeriupredoi already pointed out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shucks! I totally missed that - very good point @bouweandela - yes, @Peter9192 please see this and revise the approach 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I realized this as well but I'm struggling to find an appropriate solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Peter9192 Didn't we talk about making multi_model_statistics
cube only (not to mess with the public API) and simply pass the arguments to multi_cube_statistics
. Then use _multi_model_statistics
with products for internal use? Or am I confusing 2 PRs now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After a lot of contemplating I have accepted that the public API function will have to accept both products and cubes as possible inputs. I had rather hidden those products from the public altogether, but this becomes very ugly very quickly. So I have now restored the public multi_model statistics
as the single entry point that returns different functions based on its input type.
as entry point for both multi-cube and multi-product statistics. This is to maintain the alignment between the recipe API and the public preprocessor documentation.
All green! @valeriupredoi I think I got everything covered, could you please have another look? Thanks 🍻 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good, development tests pass too (not cached ones on Circle, actual branch and ESMValTool dev ones), cheers @Peter9192 🍺 @bouweandela did you want to have another look yerself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, more clean. Just a minor comment on the unit tests.
Now the test fails because of iris docs change or smt like that 😞 |
That should be fixed by #964 |
Description
This PR refactors the multimodel statistics code. Specifically, it separates the core functionality (computing statistics across cubes) from the intermediate layer that deals with the handling of ESMValCore products.
This facilitates re-use of the core function in e.g. an ensemble statistics function (see #673) and also it makes it easier to make the core functionality lazy (see #781 and #950).
Before you get started
Checklist
pre-commit
oryamllint
checksIf you make backwards incompatible changes to the recipe format:
To help with the number pull requests: