[python] Fix enum regression in PR #3647 #3668

nguyenv · 2025-02-04T19:28:15Z

Issue and/or context:

Changes:

When evolving the schema, we need to re-open the array so that on subsequent writes it sees the latest version. Previously this was done in SOMAArray::write. Now that we are replacing this with ManagedQuery::write, the reopening step needs to be moved into this method. There should be no performance degradation as are simply moving the same step from SOMAArray::write to ManagedQuery::write.

Notes for Reviewer:

My original thought was to combine chunks in the Python side to avoid the repeated reopening steps for multiple chunks in the first place, but it sounds like the overhead of creating the copy is worse than reopening.

johnkerl · 2025-02-04T20:01:47Z

I know this PR is in draft so I won't review it -- just a comment while I'm here -- doing the combine_arrays may be a bit of a performance regression, as it's a data-copy. In my experience probably not a large regression since nominally obs is much smaller than X, and var typically even smaller still, so, a little slowdown on obs writes is (I think) unlikely to radically impact end-to-end experiment-write times. Food for thought, though. Especially if we combine your short-term combine_chunks approach here with a longer-term approach that will later allow us to go back and remove the combine_chunks workaround.

bkmartinjr · 2025-02-04T20:18:04Z

sorry for piling on...I think combining the arrays (ie., not being zero copy) will definitely be noticed. IIRC, we added the chunk/batch iteration originally to avoid this overhead when doing large writes, e.g., for the Census builder.

…ite`

codecov · 2025-02-04T21:15:39Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 65.66%. Comparing base (ce0dc82) to head (40f486b).
Report is 2 commits behind head on main.

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #3668       +/-   ##
===========================================
- Coverage   86.27%   65.66%   -20.62%     
===========================================
  Files          55      151       +96     
  Lines        6390    19265    +12875     
  Branches        0     1153     +1153     
===========================================
+ Hits         5513    12650     +7137     
- Misses        877     6220     +5343     
- Partials        0      395      +395

Flag	Coverage Δ
libtiledbsoma	`55.40% <100.00%> (?)`
python	`86.32% <ø> (+0.04%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components	Coverage Δ
python_api	`86.32% <ø> (+0.04%)`	⬆️
libtiledbsoma	`48.25% <100.00%> (∅)`

[c++] Move re-open array from SOMAArray::write to `ManagedQuery::wr…

40f486b

…ite`

nguyenv force-pushed the viviannguyen/sc-62887/python-c-enum-regression-in-pr-3647 branch from 557d143 to 40f486b Compare February 4, 2025 21:11

nguyenv marked this pull request as ready for review February 4, 2025 21:16

nguyenv requested review from johnkerl and bkmartinjr February 4, 2025 21:56

bkmartinjr approved these changes Feb 4, 2025

View reviewed changes

nguyenv merged commit 4013fa2 into main Feb 4, 2025
16 checks passed

nguyenv deleted the viviannguyen/sc-62887/python-c-enum-regression-in-pr-3647 branch February 4, 2025 22:17

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[python] Fix enum regression in PR #3647 #3668

[python] Fix enum regression in PR #3647 #3668

nguyenv commented Feb 4, 2025 •

edited

Loading

johnkerl commented Feb 4, 2025

bkmartinjr commented Feb 4, 2025

codecov bot commented Feb 4, 2025 •

edited

Loading

[python] Fix enum regression in PR #3647 #3668

[python] Fix enum regression in PR #3647 #3668

Conversation

nguyenv commented Feb 4, 2025 • edited Loading

johnkerl commented Feb 4, 2025

bkmartinjr commented Feb 4, 2025

codecov bot commented Feb 4, 2025 • edited Loading

Codecov Report

nguyenv commented Feb 4, 2025 •

edited

Loading

codecov bot commented Feb 4, 2025 •

edited

Loading