
remove unnecessary copy of large data #995

Merged 1 commit into ME-ICA:main from fix/mem_usage on Nov 6, 2023

Conversation

@bpinsard (Contributor) commented Nov 1, 2023

Changes proposed in this pull request:

  • remove a copy of the data that seems unnecessary, which cuts memory usage by ~30%
  • the data is not modified afterward; it is copied through subsetting in the following lines (line 362)

The investigation of memory usage originates from nipreps/fmriprep#3125.
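
To illustrate the change, here is a minimal sketch with hypothetical names (not the actual tedana/decay.py code): the copy can be dropped because the boolean-mask subsetting that follows already allocates a new array, so the caller's data is never modified in place.

```python
import numpy as np

def fit_decay(data, mask):
    # Before this PR (hypothetical): data = data.copy()  # duplicated the full array in memory
    # The copy is unnecessary: boolean-mask indexing returns a new array,
    # so the caller's `data` is never modified in place.
    data_masked = data[mask, :]
    return data_masked.mean(axis=-1)
```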


codecov bot commented Nov 2, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (b139386) 89.54% compared to head (a0c9aae) 89.54%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #995      +/-   ##
==========================================
- Coverage   89.54%   89.54%   -0.01%     
==========================================
  Files          26       26              
  Lines        3396     3395       -1     
  Branches      619      619              
==========================================
- Hits         3041     3040       -1     
  Misses        207      207              
  Partials      148      148              
Files Coverage Δ
tedana/decay.py 93.79% <ø> (-0.05%) ⬇️

☔ View full report in Codecov by Sentry.

@effigies (Contributor) commented Nov 6, 2023

@tsalo It looks like you added this .copy() in #468, but there doesn't seem to be any discussion that indicates it's necessary. My reading is that all downstream uses should be reading, not writing this, so I think this PR is a sensible optimization, but I'm definitely not well-versed in Tedana code.

@tsalo (Member) left a comment

Good catch. Thanks @bpinsard.

@effigies you're right, data isn't modified in place in that function, so there's no need to copy it. Not sure why I added the copy.

@tsalo requested a review from @handwerkerd on November 6, 2023
@handwerkerd (Member) left a comment

Thank you for noticing and correcting this!

This feels like a case where the edited data was once being passed out of the function, but that's not happening now, and, going through past versions, I don't see where that might have been happening.

@bpinsard (Contributor, Author) commented Nov 6, 2023

Thanks for your feedback! It was an easy catch when I quickly mem-profiled the code; the use of float64 is definitely making the workflow memory intensive. I guess there could be further optimization.
Is float64 used only for scipy.optimize.curve_fit? If so, maybe converting to float64 could be deferred to when it's called.
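
As a hedged sketch of that deferral idea (assumed monoexponential model and hypothetical names, not the actual tedana call site), the bulk arrays could stay in float32 and only the small per-voxel slice be promoted right before the double-precision fit:

```python
import numpy as np
from scipy.optimize import curve_fit

def monoexp(tes, s0, r2s):
    # Simple monoexponential decay model: S(TE) = S0 * exp(-TE * R2*)
    return s0 * np.exp(-tes * r2s)

def fit_voxel(tes, voxel_ts):
    # Promote only this voxel's time series to float64 just before the fit;
    # the full data array upstream can remain float32.
    xdata = np.asarray(tes, dtype=np.float64)
    ydata = np.asarray(voxel_ts, dtype=np.float64)
    popt, _ = curve_fit(monoexp, xdata, ydata, p0=(float(ydata.max()), 20.0))
    return popt
```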

@handwerkerd (Member) commented

Barring essential needs, I'd lean towards keeping things as float64, because the code is doing fits and divisions in multiple places, and rounding away the final bit of information has been shown to non-trivially reduce precision. Having float64 means our data are reliably stable at a precision above float32. There is a lowmem option that I don't particularly like, but maybe we can change that to run with float32.
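
A toy illustration of the precision concern (not tedana code; the exact drift depends on the data and the reduction order):

```python
import numpy as np

x = np.full(1_000_000, 0.1)

# Accumulating in float64 stays essentially at the exact value ...
print(np.sum(x, dtype=np.float64))
# ... while the same reduction carried out in float32 typically drifts
# in the last few significant digits.
print(np.sum(x.astype(np.float32), dtype=np.float32))
```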

FWIW, there's another standing memory issue ( #856 ). It's possible this fix addresses that issue, but we also identified another place where we can slightly reduce memory needs (see final comment). I'm not sure how much more memory that would save, but probably still worth implementing.

Also, this PR has two approvals and can be merged. @bpinsard, as the person who created the PR, you are welcome to hit squash and merge.

@bpinsard (Contributor, Author) commented Nov 6, 2023

> Barring essential needs, I'd lean towards keeping things as float64, because the code is doing fits and divisions in multiple places, and rounding away the final bit of information has been shown to non-trivially reduce precision. Having float64 means our data are reliably stable at a precision above float32. There is a lowmem option that I don't particularly like, but maybe we can change that to run with float32.

I think #856 mainly originates from too low an estimate of memory requirements passed to nipype in fmriprep. I hit such errors even with 45G of RAM. Maybe if someone familiar with the tedana internals knows how many times the data get transformed/duplicated, we could try to figure out a better heuristic. That said, the current PR might bring the estimate in line with what is actually required.
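
A hypothetical sketch of such a heuristic (the float64 promotion factor and copy count are assumptions, not measured tedana values; gzipped NIfTI inputs would also make the on-disk size an underestimate):

```python
import os

def estimate_tedana_mem_gb(echo_files, n_copies=3, mem_dtype_bytes=8, disk_dtype_bytes=2):
    # Rough upper bound: on-disk int16 echo data promoted to float64 in memory,
    # held in a few simultaneous internal copies during the workflow.
    disk_bytes = sum(os.path.getsize(f) for f in echo_files)
    in_mem_bytes = disk_bytes * (mem_dtype_bytes / disk_dtype_bytes) * n_copies
    return in_mem_bytes / 1024**3
```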

I don't think I have permissions to merge PRs on that repo.

@handwerkerd merged commit 2cb05ab into ME-ICA:main on Nov 6, 2023
@bpinsard deleted the fix/mem_usage branch on November 6, 2023
@handwerkerd (Member) commented

@bpinsard I'm working on a tedana abstract for OHBM 2024. People who contributed in the last year are welcome as co-authors. If you want to be a co-author, let me know so that I can make sure I have the appropriate info to include.

@tsalo (Member) commented Nov 22, 2023

@all-contributors please add @bpinsard for code.

all-contributors bot (Contributor) commented

@tsalo

I've put up a pull request to add @bpinsard! 🎉

@handwerkerd (Member) commented

@all-contributors please add @bpinsard for code.

all-contributors bot (Contributor) commented

@handwerkerd

@bpinsard already contributed before to code
