-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
replace xarray median and mean with numpy #419
replace xarray median and mean with numpy #419
Conversation
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #419 +/- ##
=======================================
Coverage 99.83% 99.83%
=======================================
Files 22 22
Lines 1219 1219
=======================================
Hits 1217 1217
Misses 2 2 ☔ View full report in Codecov by Sentry. |
Thanks for drawing attention to this @Kasra-Shirvanian. This PR prompted me to time the execution of numpy vs xarray ways of computing the mean and median on my machine. I found that you are right, numpy is faster, by a factor ranging from ~1.4 to ~5 (at least on the small dataset we use in this example). Some outputs from my tests (run with IPython on an M2 silicon Mac, reported time is average of 1000 executions). xarray mean: 0.094241 s
numpy mean: 0.066624 s
numpy is ~1.41 times faster
xarray median: 0.235287 s
numpy median: 0.094634 s
numpy is ~2.49 times faster For some context, the purpose of the examples is mainly educational, and part of their aim is to get our users familiar with That said, the case you've highlighted only concerns 1D data, and perhaps there is value in also using
You are right about that. All in all I'm happy to merge this PR. feel free to mark it as "Ready for review" despite the failing action. That failure is unrelated to this PR. |
Actually, just re-running the failing github action fixed the problem. There was a problematic URL that now seems to be happy again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is ready to merge from my point of view @Kasra-Shirvanian. Congrats on your first contribution to movement
🎉
c51ed75
This PR improves the performance of an example script by replacing xarray's mean() and median() functions with numpy's nanmean() and nanmedian(). The xarray functions apply reductions iteratively over subbranches, making them inefficient for simple statistical calculations. Additionally, the script was redundantly computing mean and median values twice (once for the plot and again for the legend).
Description
What is this PR
Why is this PR needed?
This change does not affect the main functionality of the toolbox but significantly improves the performance of the example script. It provides a better user experience for those trying out the examples or adapting them to their own datasets, leading to a better impression of the project.
What does this PR do?
Replaces xarray's mean() and median() with numpy.nanmean() and numpy.nanmedian().
Removed redundant calculations in the legend.
How has this PR been tested?
It has been tested on my local machine
Performance Improvement:
Before: Median computation took 13.41 seconds
After: Median computation now takes 0.00025 seconds
Checklist: