Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dask: Data collapse methods and functions #356

Merged
merged 43 commits into from
Apr 20, 2022

Conversation

davidhassell
Copy link
Collaborator

Another hefty PR! Hopefully self explanatory, but these notes might be useful:

  • Added Data.square and Data.sqrt because they were useful.
  • Made some changes to Data.stats and Data.mean_of_upper_decile before realising that they can't be tested yet. So these will be dealt with in separate PRs. Feel free to ignore these methods.
  • Needs dask v2022.03.0 which should be out later today :)
  • Tidied up (i.e. removed) some redundant MPI stuff.

@davidhassell davidhassell marked this pull request as draft March 18, 2022 11:02
@davidhassell
Copy link
Collaborator Author

Hang on - forgot a test ....

@davidhassell
Copy link
Collaborator Author

OK - good to go.

@davidhassell davidhassell marked this pull request as ready for review March 18, 2022 13:18
@davidhassell
Copy link
Collaborator Author

Another note:

  • I got rid of Data._map_blocks because a) it wasn't used very often, and b) to use it you really have to check what it does internally, and since it's only ~3 lines it made sense to me to in-line it.

@davidhassell davidhassell added the dask Relating to the use of Dask label Apr 1, 2022
Copy link
Member

@sadielbartholomew sadielbartholomew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm mid-way through reviewing and everything seems good so far. Thought I'd raise some typos in the meantime, and if you can address the conflict resolutions that would help me to ensure I am reviewing the up-to-date code. Thanks.

Copy link
Member

@sadielbartholomew sadielbartholomew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this is fantastic. The updated code is well-structured and even copes sensibly across all of the statistics (bar those two methods you say are out-of-scope for this PR) for the slightly evil case of a fully-masked array. Great commitment to thorough documentation, too.

I've posed a few questions on minor aspects and suggested fixes for inevitable typos, one of which is in the functional code so is quite crucial to update. Thanks.

davidhassell and others added 13 commits April 13, 2022 15:43
Co-authored-by: Sadie L. Bartholomew <[email protected]>
Co-authored-by: Sadie L. Bartholomew <[email protected]>
Co-authored-by: Sadie L. Bartholomew <[email protected]>
Co-authored-by: Sadie L. Bartholomew <[email protected]>
Co-authored-by: Sadie L. Bartholomew <[email protected]>
Co-authored-by: Sadie L. Bartholomew <[email protected]>
Co-authored-by: Sadie L. Bartholomew <[email protected]>
Co-authored-by: Sadie L. Bartholomew <[email protected]>
Co-authored-by: Sadie L. Bartholomew <[email protected]>
@davidhassell davidhassell merged commit 55cfa9e into NCAS-CMS:lama-to-dask Apr 20, 2022
@davidhassell davidhassell deleted the dask-collapse branch November 15, 2022 09:19
@davidhassell davidhassell added this to the 3.14.0 milestone Nov 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dask Relating to the use of Dask
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants