Crash on read of backed-mode X with scipy==1.15.0rc1 #1802

Closed
2 of 3 tasks
johnkerl opened this issue Dec 15, 2024 · 2 comments · Fixed by #1806


johnkerl commented Dec 15, 2024

Please make sure these conditions are met

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of anndata.
  • (optional) I have confirmed this bug exists on the master branch of anndata.

Report

Code:

import anndata as ad
adata = ad.read_h5ad("pbmc-small.h5ad", "r")
print(adata.raw.X[:10])

Traceback:

Traceback (most recent call last):
  File "/tmp/x", line 3, in <module>
    print(adata.raw.X[:10])
          ~~~~~~~~~~~^^^^^
  File "/Users/kerl/.pyenv/versions/3.11.9/lib/python3.11/site-packages/anndata/_core/sparse_dataset.py", line 431, in __getitem__
    sub = self.to_memory()[row_sp_matrix_validated, col_sp_matrix_validated]
          ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kerl/.pyenv/versions/3.11.9/lib/python3.11/site-packages/scipy/sparse/_index.py", line 30, in __getitem__
    index, new_shape = self._validate_indices(key)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kerl/.pyenv/versions/3.11.9/lib/python3.11/site-packages/scipy/sparse/_index.py", line 274, in _validate_indices
    idx = self._asindices(idx, N)
          ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/kerl/.pyenv/versions/3.11.9/lib/python3.11/site-packages/scipy/sparse/_index.py", line 316, in _asindices
    max_indx = x.max()
               ^^^^^^^
  File "/Users/kerl/.pyenv/versions/3.11.9/lib/python3.11/site-packages/numpy/core/_methods.py", line 41, in _amax
    return umr_maximum(a, axis, None, out, keepdims, initial, where)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: '>=' not supported between instances of 'int' and 'NoneType'

Versions

>>> import anndata, session_info; session_info.show(html=False, dependencies=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/kerl/.pyenv/versions/3.11.9/lib/python3.11/site-packages/session_info/main.py", line 209, in show
    mod = sys.modules[mod_name]
          ~~~~~~~~~~~^^^^^^^^^^
KeyError: 'distributed'

This is with anndata==0.11.1 and scipy==1.15.0rc1. I know you haven't done a release with the latter as a dependency -- please consider this a heads-up.

johnkerl changed the title from "Crash on read of backed-mode X with scopy" to "Crash on read of backed-mode X with scipy==1.15.0rc1" on Dec 15, 2024

flying-sheep (Member) commented

Scanpy also runs into this problem.

The issue is in BaseCompressedSparseDataset.__getitem__:

  1. call it with slice(0, 200, None), slice(None, None, None) (dataset[0:200, :])
  2. validation converts that to row_sp_matrix_validated = (slice(0, 200, None), slice(None, None, None)); col_sp_matrix_validated = (200, 10)
  3. hit this line:
    sub = self.to_memory()[row_sp_matrix_validated, col_sp_matrix_validated]
  4. scipy raises a TypeError

The validator’s output doesn’t look right: it seems to convert both indices to 2D indices for some reason.
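One plausible reading of the TypeError, offered as an assumption rather than a confirmed diagnosis: once a tuple of slices is passed where scipy expects a single row index, taking its maximum ends up comparing the two slice objects, and Python compares slices by their (start, stop, step) fields. A minimal pure-Python sketch of that comparison, using the slices from step 2:

```python
# The two slices that end up inside row_sp_matrix_validated in step 2.
row = slice(0, 200, None)
col = slice(None, None, None)

# Ordering comparisons on slices compare (start, stop, step) field by field,
# so the first differing pair here is 0 vs. None.
try:
    row >= col
    message = None
except TypeError as exc:
    message = str(exc)

# Compare this with the last line of the traceback in the report.
print(message)
```

This only illustrates where an int/None comparison can come from; it does not exercise scipy or anndata code.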

flying-sheep (Member) commented

OK, seems like we’re using scipy’s internal validator, which changed in 1.15 to return a tuple of (idx, shape), i.e. a tuple[tuple[int, int], tuple[int, int]]
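A minimal sketch of how calling code could tolerate both return shapes, assuming the older validator returned `(row, col)` and the newer one returns `((row, col), shape)` as described in this comment. The helper name `split_validated` is hypothetical; it is not part of scipy or anndata, and the actual fix in #1806 may look different.

```python
def split_validated(result):
    """Return (row, col) from either assumed validator output shape.

    Old shape (assumed):            (row, col)
    New shape (per this comment):   ((row, col), shape)
    """
    first, second = result
    # In the new shape, the first element is itself the 2-tuple of indices
    # and the second is a tuple of plain ints describing the result shape.
    looks_like_new_shape = (
        isinstance(first, tuple)
        and len(first) == 2
        and isinstance(second, tuple)
        and all(isinstance(n, int) for n in second)
    )
    if looks_like_new_shape:
        return first
    # Otherwise treat the pair as (row, col) directly.
    return first, second


# The values from step 2 of the previous comment, in both shapes.
old_style = (slice(0, 200, None), slice(None, None, None))
new_style = ((slice(0, 200, None), slice(None, None, None)), (200, 10))

row_old, col_old = split_validated(old_style)
row_new, col_new = split_validated(new_style)
```

Relying on a private validator is what made the breakage possible in the first place, so a sketch like this is a stopgap at best; indexing through the public `__getitem__` avoids depending on the return shape at all.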
