Use dask instead of numpy masked arrays #58

Closed
pnuu opened this issue Nov 28, 2018 · 4 comments

Labels
enhancement PCW Pytroll Contributers Week

Comments

@pnuu
Member

pnuu commented Nov 28, 2018

Currently PySpectral uses NumPy masked arrays to handle the data arrays. For better integration with and support for SatPy, these should be converted to dask arrays.

@pnuu pnuu added enhancement PCW Pytroll Contributers Week labels Nov 28, 2018
@pnuu pnuu self-assigned this Nov 28, 2018
@djhoese
Member

djhoese commented Nov 28, 2018

As mentioned on Slack, a decision has to be made about this. Does pyspectral add a hard requirement on dask? Most numpy operations should work with dask, at least the arithmetic and ufunc ones. If pyspectral can be updated to assume that NaN is used as a fill value instead of depending on masks, then it should be compatible with both numpy and dask arrays in most cases.
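A minimal sketch of that idea, assuming nothing about pyspectral's actual code: `to_percent` is a hypothetical helper made up for illustration, and the dask import is guarded so that dask stays optional. Because the helper uses only arithmetic and ufuncs, and invalid values are NaN rather than masked, the same function should accept either array type.

```python
import numpy as np

try:
    import dask.array as da  # optional: the helper below never imports dask itself
except ImportError:
    da = None


def to_percent(refl):
    """Hypothetical helper (not pyspectral API): clip to [0, 1] and scale to percent."""
    # np.maximum / np.minimum are ufuncs, so they accept NumPy and dask arrays
    # alike, and NaN fill values propagate through them unchanged.
    return np.minimum(np.maximum(refl, 0.0), 1.0) * 100.0


# Masked-array input converted once to a NaN-filled float array.
masked = np.ma.array([0.2, 0.5, 1.3], mask=[False, True, False])
filled = masked.filled(np.nan)

result = to_percent(filled)
invalid = np.isnan(result)  # replaces the old `.mask` lookup

if da is not None:
    # Same function, lazy input; nothing is computed until .compute().
    dask_result = to_percent(da.from_array(filled, chunks=2)).compute()
```

With this pattern the library code itself never has to import dask; only callers that pass dask arrays need it installed.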

@pnuu
Member Author

pnuu commented Nov 28, 2018

Yes, I agree. I noticed this after starting to convert the unit tests to the daskified version, and found that there are soooo many use cases that don't work with my dask conversion. SatPy was happy, but I need to think more about whether this is feasible for other (non-SatPy) use(r)s.

@pnuu
Member Author

pnuu commented Nov 28, 2018

Running a full-disk SEVIRI scene with SatPy to create the day_microphysics composite takes 7.4 s with the current version and 4.8 s with the fully daskified version. I think we need to find a way where all the use cases can still work.

@pnuu
Member Author

pnuu commented Nov 29, 2018

OK, found a way to do it: only daskify the NIR reflectance part, and adjust the documentation so that it no longer refers to masked arrays. Anyone who has used the reflectance computations without SatPy needs to adjust the calling script to a) call .compute() to get the result, and b) use np.isnan(arr) instead of arr.mask to find the invalid values.
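The two adjustments for a calling script could look roughly like the sketch below. It assumes the result is a dask array with NaN marking invalid pixels; `materialize` is a hypothetical helper, and the input is a small stand-in array rather than a real pyspectral call.

```python
import numpy as np


def materialize(refl):
    """Hypothetical helper for a calling script: return (data, invalid_mask)."""
    # dask arrays expose .compute(); plain NumPy arrays are passed through.
    data = refl.compute() if hasattr(refl, "compute") else np.asarray(refl)
    return data, np.isnan(data)


# Stand-in for an NIR reflectance result; the real pyspectral call is not shown.
values = np.array([0.1, np.nan, 0.3])
try:
    import dask.array as da
    refl = da.from_array(values, chunks=2)
except ImportError:
    refl = values  # dask not installed: fall back to the eager array

data, invalid = materialize(refl)
valid = data[~invalid]               # previously: data[~data.mask]
legacy = np.ma.masked_invalid(data)  # for code that still expects a masked array
```

If downstream code cannot be changed right away, `np.ma.masked_invalid` rebuilds a masked array from the NaN-filled result, so the old `.mask` attribute is available again at the cost of an extra copy.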


3 participants