dask: Data.unique #391

Merged (5 commits) on Apr 27, 2022


davidhassell
Collaborator

No description provided.

Member

@sadielbartholomew sadielbartholomew left a comment


All good and ready to merge, with a nice extension to the unit test. If I were to be harsh, I might suggest first adding a small block to the unit test to cover the various data type possibilities ("fibUS" as set here), in particular the case of a string-type data array (an example I checked manually while reviewing), since at present we are really only checking integer arrays.

@davidhassell
Collaborator Author

Harsh is fine! Data type unit tests: ae13c20

Member

@sadielbartholomew sadielbartholomew left a comment


Bonus points for adding the new commit, which I have sanity-checked and which implements my suggestion in an elegant way. All good to merge.

@davidhassell davidhassell merged commit bcbd03a into NCAS-CMS:lama-to-dask Apr 27, 2022
@davidhassell
Collaborator Author

For future reference: the da.unique function was not used in this PR. We used da.reduction instead, because da.unique does not deal very well with missing data. I had a look at modifying da.unique to cope, but a) it was not trivial; b) cf.Data doesn't (yet) need the return_* keywords; and c) the dask folk might change it to be a reduction at some stage (dask/dask#2851). For these three reasons, it was preferable at this time to implement a missing-data-friendly "unique", without the return_* keywords, as a standard cf.data.Collapse method.
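
To illustrate the general idea of "unique" as a reduction, here is a minimal, library-free sketch. It is not the code from this PR and does not call dask: plain lists stand in for chunks, `None` stands in for a missing (masked) element, and the names `unique_chunk`, `unique_aggregate` and `unique_by_reduction` are hypothetical. It only shows the shape of the approach: a per-chunk step that finds distinct non-missing values, followed by an aggregate step that merges the partial results.

```python
from functools import reduce

# Stand-in for a masked (missing) element. This is an assumption of the
# sketch; cf.Data uses masked arrays rather than None.
MISSING = None


def unique_chunk(chunk):
    """Per-chunk step: distinct non-missing values in a single chunk."""
    return {value for value in chunk if value is not MISSING}


def unique_aggregate(partials):
    """Aggregate step: merge the per-chunk sets into one sorted list."""
    merged = reduce(set.union, partials, set())
    return sorted(merged)


def unique_by_reduction(chunks):
    """Apply the chunk step to every chunk, then aggregate the results."""
    return unique_aggregate(unique_chunk(chunk) for chunk in chunks)


# Three "chunks", one of which is entirely missing
chunks = [[3, 1, MISSING, 3], [2, MISSING, 1], [MISSING, MISSING]]
result = unique_by_reduction(chunks)
print(result)
```

Because the merge step is just a set union, partial results can be combined in any order or grouping, which is what makes the operation suitable for a tree reduction. A chunk that is entirely missing contributes an empty set and so has no effect on the result.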

@davidhassell davidhassell deleted the dask-unique branch November 15, 2022 09:27
@davidhassell davidhassell added this to the 3.14.0 milestone Nov 15, 2022
Labels
dask Relating to the use of Dask