Add option to standardize to anomalies preprocessor function #300
Can I have a quick review (by @jvegasbsc or @valeriupredoi or ... ?) before proceeding with writing a unit test and adding it to the documentation?
Looks nice, but I think it would be better to have it as an option in the anomalies preprocessor.
Nice one! A couple of minor comments. Also, it is a bit confusing to me (as a non-scientist) why monthly, month and mon are all accepted. Maybe detail this in the function docstring; it is definitely needed in the documentation.
This is equally confusing to me; my function inherits this from the other functions. Maybe @jvegasbsc can explain this?
Yes, I agree. That would be an option as well. What do others think?
I also prefer extending existing preprocessors rather than creating new ones.
To support all common abbreviations used for monthly data.
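As an illustration of accepting several spellings for the same period, here is a minimal sketch. The helper name `normalize_period` and the alias table are hypothetical and not taken from this PR; only the spellings monthly, month, mon and full appear in this thread.

```python
# Hypothetical alias table: every accepted spelling maps to one canonical name.
_PERIOD_ALIASES = {
    "monthly": "monthly",
    "month": "monthly",
    "mon": "monthly",
    "full": "full",
}


def normalize_period(period):
    """Return the canonical period name for a user-supplied spelling."""
    key = period.strip().lower()
    try:
        return _PERIOD_ALIASES[key]
    except KeyError:
        raise ValueError(f"Unsupported period: {period!r}") from None
```

Resolving the alias once at the start of a function keeps the rest of the code free of repeated checks against every spelling.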
@jvegasbsc I just included it as an option in the existing anomaly function. Could you test it?
Co-Authored-By: Bouwe Andela <[email protected]>
I added a unit test for only one case, at the moment, as a proof of concept. I look forward to some feedback, maybe from @bouweandela or @jvegasbsc ? The strategy I took is to calculate the expected outcome by using I do think that trying to put more functionality into one function (as suggested by @mattiarighi and @jvegasbsc above) in general renders unit testing way more complex (since the options that need to be covered equal the |
I modified it to simplify it a bit and corrected a couple of flake8 issues. Anyway, my suggestion would be to create a new method for testing the standardized case. I think it would also be a good idea to generate a new data cube that makes it easier to see at a glance what the result should be.
Indeed, this is only feasible if the number of possible input values is small. Parametrizing with pytest helps a bit, but not when the list of options/possible values grows very large. Usually people try to test at the very least every code path. Have a look at the coverage report in test-reports/coverage_html/index.html and make sure there is no code that is not executed during a test (marked red).
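To make the parametrization idea concrete, here is a minimal sketch of one test that covers both branches of a boolean option. The `anomalies` function below is a hypothetical plain-Python stand-in, not the preprocessor from this PR, and the use of the sample standard deviation is an assumption.

```python
import math
import statistics

import pytest


def anomalies(values, standardize=False):
    """Hypothetical stand-in: subtract the mean, optionally divide by the std."""
    mean = statistics.fmean(values)
    result = [value - mean for value in values]
    if standardize:
        # Assumption: sample standard deviation (n - 1 in the denominator).
        std = statistics.stdev(values)
        result = [item / std for item in result]
    return result


# One case per code path: the plain branch and the standardized branch.
@pytest.mark.parametrize(
    "standardize, expected",
    [
        (False, [-2.0, 0.0, 2.0]),
        (True, [-1.0, 0.0, 1.0]),
    ],
)
def test_anomalies(standardize, expected):
    result = anomalies([2.0, 4.0, 6.0], standardize=standardize)
    assert len(result) == len(expected)
    for got, want in zip(result, expected):
        assert math.isclose(got, want, abs_tol=1e-12)
```

Small round input values keep the expected result readable at a glance, in line with the suggestion above to use a simpler data cube.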
Just a small update (also as a note to self). I got time again to work on this :) I did a first implementation solely for the case period='full'. Interestingly, the differences between |
there is no weighting since |
By default |
here try this with a say, temperature cube:
I get a mean delta of |
Looks good to me! Just a few minor suggestions on formatting
Co-Authored-By: Bouwe Andela <[email protected]>
Looks good to me. @mattiarighi Could you please test?
Can this please be merged, @mattiarighi?
I added a new preprocessor function, making use of two other existing preprocessor functions. This is the description I provided in the docstring:
As far as I could see, it is not possible to achieve this behaviour by combining existing preprocessor functions in the recipe, so I added this as a new preprocessor function.
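For readers unfamiliar with the term, a standardized anomaly is commonly computed as the anomaly divided by the standard deviation over the same reference period. Below is a minimal sketch under that assumption, using plain Python lists rather than Iris cubes. The helper name `standardized_monthly_anomalies` is hypothetical, and the choice of the sample standard deviation is an assumption, not something stated in this PR.

```python
from collections import defaultdict
import statistics


def standardized_monthly_anomalies(months, values):
    """Return (value - monthly mean) / monthly std for every sample.

    Hypothetical helper for illustration only. Each month needs at
    least two samples, because the sample std is undefined otherwise.
    """
    by_month = defaultdict(list)
    for month, value in zip(months, values):
        by_month[month].append(value)

    # Assumption: sample standard deviation (n - 1 in the denominator).
    stats = {
        month: (statistics.fmean(samples), statistics.stdev(samples))
        for month, samples in by_month.items()
    }
    return [
        (value - stats[month][0]) / stats[month][1]
        for month, value in zip(months, values)
    ]


# Three years of data for January (1) and February (2).
months = [1, 2, 1, 2, 1, 2]
values = [1.0, 10.0, 2.0, 20.0, 3.0, 30.0]
result = standardized_monthly_anomalies(months, values)
```

In this toy input both months end up on the same scale after standardization, even though the February values are ten times larger.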
Tasks
Closes #299