Fix tests for dask PCA #3162
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##             main    #3162   +/-   ##
=======================================
  Coverage   76.52%   76.52%
=======================================
  Files         109      109
  Lines       12483    12483
=======================================
  Hits         9553     9553
  Misses       2930     2930
```
Sorry, what is the provenance of this fix? Why isn't the fix something in AnnData? This seems somewhat complicated.
Yeah, I was concerned about the complexity. But AnnData doesn’t do anything wrong: it just uses chunks in both directions, which that one specific algorithm doesn’t support. In general I think we should test with 2D chunks, which is why I think AnnData’s test helpers work as intended. Any idea how to do this more simply?
@flying-sheep I mean why this came up all of a sudden in terms of provenance. And furthermore, instead of |
I’m not sure why it came up now. I saw changes to the chunking in scverse/anndata#1550 and assumed that introduced it, but maybe not. Regarding complexity and parameters, we could simplify things by doing

```python
def maybe_rechunk_1d(a: NDArray | DaskArray | ...) -> NDArray | DaskArray | ...:
    if isinstance(a, DaskArray):
        return a.rechunk((a.chunksize[0], -1))
    return a
```

and using that in all the functions that fail (after running things through
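For reference, a minimal runnable sketch of the `maybe_rechunk_1d` idea from the comment above. It assumes `DaskArray` refers to `dask.array.Array` and narrows the elided type union to NumPy and Dask arrays only; the demo array and its chunk sizes are made up for illustration and are not taken from the test suite.

```python
from __future__ import annotations

import dask.array as da
import numpy as np


def maybe_rechunk_1d(a: np.ndarray | da.Array) -> np.ndarray | da.Array:
    """Collapse column chunks so a Dask array is chunked along rows only.

    Non-Dask inputs are returned unchanged.
    """
    if isinstance(a, da.Array):
        # Keep the (largest) row chunk size; -1 means "one chunk spanning
        # the whole axis" for the column dimension.
        return a.rechunk((a.chunksize[0], -1))
    return a


# Illustrative data: a 6x4 matrix chunked in both directions (2D chunks).
x = np.arange(24).reshape(6, 4)
chunked_2d = da.from_array(x, chunks=(2, 2))
chunked_1d = maybe_rechunk_1d(chunked_2d)

print(chunked_2d.numblocks)  # blocks per axis before rechunking
print(chunked_1d.numblocks)  # blocks per axis after rechunking
```

Only the chunk layout changes; the computed values stay the same, so a helper like this can be applied right before an algorithm that only supports row-wise chunks.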
41b8750 to 2baf825
I simplified this quite a bit. I decided against blocking this on an upstream
Co-authored-by: Philipp A <[email protected]>
Fixes the current test failures.
I could also do a simpler version where I just rechunk in one way instead of adding parameters here.