
Refactored scanpy.tools._sparse_nanmean to eliminate unnecessary data… #3570


Open · wants to merge 21 commits into main from _sparse_nanmean_is_inefficient

Conversation

@Reovirus commented Apr 8, 2025

Fixes #1894. Reduced redundant data copying: the original matrix is now copied once instead of twice. One copy remains necessary to replace NaNs with zeros without modifying the original matrix. A minimal sketch of the approach follows the checklist below.

  • Closes #1894
  • Tests included
  • Release notes not necessary because
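
For illustration, here is a minimal sketch of the single-copy approach described above, assuming a CSR input; the function name is hypothetical and this is not the actual scanpy implementation:

```python
import numpy as np
from scipy import sparse


def sparse_nanmean_single_copy(X: sparse.csr_matrix, axis: int) -> np.ndarray:
    """NaN-ignoring mean along `axis`, duplicating only X.data."""
    mask = np.isnan(X.data)
    data = np.where(mask, 0.0, X.data)  # the one unavoidable copy
    # indices/indptr are shared with the input, so nothing else is copied
    clean = sparse.csr_matrix((data, X.indices, X.indptr), shape=X.shape)
    sums = np.asarray(clean.sum(axis=axis)).ravel()
    if axis == 0:
        # per-column NaN counts from the column indices of NaN entries
        n_nan = np.bincount(X.indices[mask], minlength=X.shape[1])
    else:
        # per-row NaN counts: expand indptr into a row index per entry
        rows = np.repeat(np.arange(X.shape[0]), np.diff(X.indptr))
        n_nan = np.bincount(rows[mask], minlength=X.shape[0])
    # implicit zeros stay in the denominator; NaN entries are excluded
    return sums / (X.shape[axis] - n_nan)
```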

@Reovirus (Author) commented Apr 8, 2025

@flying-sheep Could you assign a reviewer, please? The failing test looks like a bug in the test itself.

@Reovirus (Author) commented Apr 8, 2025

I could also rewrite the logic without any copying using numba, but it might be slower than the currently implemented methods.

@Reovirus (Author) commented Apr 9, 2025

I know the "No milestone" check is failing, but I don't have permission to set a milestone; I probably need a maintainer's help with that.

@Zethson (Member) commented Apr 11, 2025

@Reovirus I restarted the CI as there should hopefully be no flaky tests at the moment.

Don't worry about the milestone, please.

@Reovirus (Author)

The failure is in scipy.sparse.csr_matrix.count_nonzero, which only gained its axis argument in scipy 1.15.0. I'll rewrite it with numba.
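
For context, the per-axis counts can also be computed in a version-independent way straight from the CSR buffers, without numba. A hypothetical shim (the name and approach are illustrative, not this PR's code):

```python
import numpy as np
from scipy import sparse


def count_nonzero_along(X: sparse.csr_matrix, axis: int) -> np.ndarray:
    """Per-axis nonzero counts that also work on SciPy < 1.15."""
    nz = X.data != 0  # skips explicit zeros; NaN compares unequal to 0
    if axis == 0:
        return np.bincount(X.indices[nz], minlength=X.shape[1])
    rows = np.repeat(np.arange(X.shape[0]), np.diff(X.indptr))
    return np.bincount(rows[nz], minlength=X.shape[0])
```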

@Reovirus (Author)

@Zethson could you please restart the CI? I'm hitting a very strange HTTP bug that isn't from my code.

@Reovirus (Author) commented Apr 11, 2025

CI passed; could somebody review this when they have time, please?

@Reovirus force-pushed the _sparse_nanmean_is_inefficient branch from 68e1e9c to e089438 on April 11, 2025 14:55
@flying-sheep (Member) commented Apr 11, 2025

I made the benchmarks run, let’s see how well this works!

Could you please add a release note fragment?

hatch run towncrier:create 3570.performance.md

@flying-sheep added this to the 1.11.2 milestone Apr 11, 2025
@Reovirus (Author)

Yes, I'll add the note, thanks!

scverse-benchmark bot commented Apr 11, 2025

Benchmark changes

| Change | Before [813608e] | After [7e23b19] | Ratio | Benchmark (Parameter) |
|--------|------------------|-----------------|-------|-----------------------|
| -      | 10.8±0.2ms       | 8.80±0.4ms      | 0.81  | preprocessing_counts.FastSuite.time_log1p('pbmc3k', 'counts-off-axis') |
| -      | 48.4±3ms         | 43.0±0.9ms      | 0.89  | preprocessing_counts.time_filter_genes('pbmc3k', 'counts-off-axis') |
| +      | 261M             | 308M            | 1.18  | tools.peakmem_score_genes |

Comparison: https://github.com/scverse/scanpy/compare/813608e3af7e25214f4370dabb31b13a5e8159aa..7e23b19d9a0e89d1820c6fb1854c5408a814fed4
Last changed: Wed, 30 Apr 2025 16:08:04 +0000

More details: https://github.com/scverse/scanpy/pull/3570/checks?check_run_id=41426505892

@flying-sheep (Member)

According to the benchmarks, this is actually slower than before.

The benchmarks are a bit flaky, so this could be wrong, but a factor of 2.5 like this one is usually real.

@Reovirus (Author)

Understood. I'll try to change it and speed it up.

@Reovirus (Author) commented Apr 11, 2025

I've replaced np.add.at with np.bincount (it was the slowest part; see the sketch after this comment).

The results are in the pictures below: old is the old function, new is the new implementation, and new_parallel is the new one with njit in one case. The matrix was (5000, 5000), and I used 100 random matrices (~20% NaNs and ~20% zeros per matrix) for testing. The x-axis labels read {input_format: csc|csr}{axis: 0|1}{version: old|new|new_with_numba}; the y-axis is the time spent in _sparse_nanmean, in seconds, on my low-resource Ubuntu VM.

[Figures: test_2000_2000_100_boxplot, test_2000_2000_100_violin]

Update: I see the benchmark results are added automatically.
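
To illustrate the swap with a minimal, self-contained example (synthetic data, not the PR's benchmark): np.add.at performs an unbuffered scatter-add, while np.bincount with weights computes the same grouped sums and is typically much faster for integer indices:

```python
import numpy as np

rng = np.random.default_rng(0)
idx = rng.integers(0, 5_000, size=1_000_000)  # target bins, e.g. row indices
vals = rng.random(1_000_000)                  # values to accumulate

out_slow = np.zeros(5_000)
np.add.at(out_slow, idx, vals)  # unbuffered scatter-add: correct but slow

# same grouped sums via bincount, usually far faster
out_fast = np.bincount(idx, weights=vals, minlength=5_000)

assert np.allclose(out_slow, out_fast)
```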

@Reovirus (Author) commented Apr 11, 2025

@flying-sheep We have a small win in score_genes time (23.4±2ms | 21.8±0.8ms | 0.93 | tools.time_score_genes; the p-value is too high to display because the stderr is high, and the implementation is less effective on data with few zeros) and a small loss in memory. Should I optimize memory usage? The loss is probably caused by mask = np.isnan(data); instead of that I could recompute the mask inside the NaN-replacement pass and save one array, as in the sketch below.
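
One way to save that array, sketched as a hypothetical numba helper (not code from this PR): replace NaNs and tally them per column in a single pass, so the boolean mask is never materialized:

```python
import numba
import numpy as np


@numba.njit(cache=True)
def zero_nans_count_columns(data, indices, n_cols):
    """Zero out NaNs in-place and return per-column NaN counts."""
    n_nan = np.zeros(n_cols, dtype=np.int64)
    for k in range(data.shape[0]):
        if np.isnan(data[k]):
            n_nan[indices[k]] += 1
            data[k] = 0.0  # mutates the buffer; pass a copy to keep X intact
    return n_nan
```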

@Intron7 (Member) commented Apr 14, 2025

I think these should be parallel numba kernels with mask arguments for the major and minor axes. I have a working implementation in rapids-singlecell that doesn't need to copy at all. Starting from _get_mean_var is, I think, the best way forward.
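
To make the suggested direction concrete, here is a rough sketch of such a kernel for the major axis of a CSR matrix. It is an assumption about the shape of that approach, not the rapids-singlecell code; it reads the CSR buffers directly, with no copy of X.data:

```python
import numba
import numpy as np


@numba.njit(parallel=True, cache=True)
def nanmean_major_axis(data, indptr, n_rows, n_cols):
    """Row means of a CSR matrix, skipping NaNs, zero-copy."""
    out = np.zeros(n_rows, dtype=np.float64)
    for i in numba.prange(n_rows):
        s = 0.0
        n_nan = 0
        for k in range(indptr[i], indptr[i + 1]):
            v = data[k]
            if np.isnan(v):
                n_nan += 1
            else:
                s += v
        # implicit zeros contribute to the denominator, NaNs do not
        out[i] = s / (n_cols - n_nan)
    return out


# usage: nanmean_major_axis(X.data, X.indptr, *X.shape) for a CSR matrix X
```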

@Intron7 (Member) left a comment

Refactor the kernels to be zero-copy.

@Reovirus (Author) commented Apr 30, 2025

@Intron7 I've rewritten the logic. But my local benchmarking gives a different result (I just measure peak memory by ), and I'm trying to figure out why. Also, I have no idea what to make of some metrics like preprocessing_counts.FastSuite.time_log1p('pbmc3k', 'counts-off-axis'); the change flips sign randomly. Is that normal?


Successfully merging this pull request may close these issues.

_sparse_nanmean is inefficient
4 participants