Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

batch_load contextmanager and optimize data visibility interactions #1742

Merged
merged 11 commits into from
Oct 26, 2022

Conversation

kecnry
Copy link
Member

@kecnry kecnry commented Oct 14, 2022

Blocked by

Description

This pull request

  • implements a new helper.batch_load contextmanager (currently only used in Imviz)
  • rewrites/optimizes the data menu visibility logic.
  • moves the responsibility of data-linking in Imviz from load_data to the plugin - so that new data that are added will respect the user set linking options and not revert to "pixels". (NOTE: do we need a fallback for the case where the plugin isn't loaded?)
  • deprecates imviz.load_data(do_link=False) with a deprecation warning suggesting using the contextmanager instead.
  • exposes a more friendly API for toggling data-visibility at the app-level via app.set_data_visibility(viewer_reference, data_label, visible=True, replace=False) and updates tests that used the old vue_data_item_visibility (see skip viewer.add_data if already in viewer #1724 (comment))

These were all quite tied together (writing deferred loading into viewer required a change in the API for the data menu), but could possibly be split into separate PRs if anyone thinks that would be useful. From a user-perspective, the change here is in how to defer linking and load multiple images into Imviz and the improved performance of toggling layers in the data menu.

Example use:

imviz = Imviz()

with imviz.batch_load():
    for f in files:
        imviz.load_data(f)
# all data linking and adding to viewers is done AFTER exiting the batch_load block

The performance gains are primarily found with an increasing number of images. For example, loading 5 layers of the Carina data set into Imviz in a loop took ~10s previously. Repeating those same 5 layers took another ~20s. With this PR, those same 5 layers take ~5-7s and does not increase significantly with repeated calls (although the visualization if the viewer is displayed will start to scale with the number of layers). The performance gains by setting WCS linking in advance of loading are even more significant. Note that most of the speed ups were not actually cause by delaying the linking (in the pixel-linking case, at least) but rather by the changes to the logic in the data menu (and by #1724).

Open questions:

  • would we prefer this as one or two (or three) PRs?
  • should Mosviz's app.auto_link logic be migrated to use this internally?
  • If not, should this be moved from the general helper to the Imviz helper?
  • do we want to add support for imviz.load_data(files, ...) (where files is a list of filenames or objects) that makes use of this internally?

Change log entry

  • Is a change log needed? If yes, is it added to CHANGES.rst? If you want to avoid merge conflicts,
    list the proposed change log here for review and add to CHANGES.rst before merge. If no, maintainer
    should add a no-changelog-entry-needed label.

Checklist for package maintainer(s)

This checklist is meant to remind the package maintainer(s) who will review this pull request of some common things to look for. This list is not exhaustive.

  • Are two approvals required? Branch protection rule does not check for the second approval. If a second approval is not necessary, please apply the trivial label.
  • Do the proposed changes actually accomplish desired goals? Also manually run the affected example notebooks, if necessary.
  • Do the proposed changes follow the STScI Style Guides?
  • Are tests added/updated as required? If so, do they follow the STScI Style Guides?
  • Are docs added/updated as required? If so, do they follow the STScI Style Guides?
  • Did the CI pass? If not, are the failures related?
  • Is a milestone set?
  • After merge, any internal documentations need updating (e.g., JIRA, Innerspace)?

@kecnry kecnry added api API change imviz labels Oct 14, 2022
@kecnry kecnry added this to the 3.1 milestone Oct 14, 2022
@github-actions github-actions bot added the embed Regarding issues with front-end embedding label Oct 14, 2022
jdaviz/app.py Outdated
Comment on lines 1421 to 1419
# Sets the plot axes labels to be the units of the most recently
# active data.
if len(viewer_data_labels) > 0 and self._jdaviz_helper._in_batch_load == 0:
active_data = self.data_collection[viewer_data_labels[0]]
if (hasattr(active_data, "_preferred_translation")
and active_data._preferred_translation is not None):
self._set_plot_axes_labels(viewer_id)

self._update_selected_data_items(viewer_id, selected_items)
if self.config == 'imviz':
viewer.on_limits_change() # Trigger compass redraw
Copy link
Member Author

@kecnry kecnry Oct 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to modify this logic slightly to get it to work in the new method. Is the assumption of using the first data layer label valid here? It seems this logic was originally implemented in #191.

Apparently not because it seems to break the axes labeling in Specviz... I'll investigate and fix that before marking this for review.

@kecnry kecnry force-pushed the batch-load branch 4 times, most recently from 4f451ff to f33de66 Compare October 17, 2022 17:01
Comment on lines 66 to 72
components = [str(comp) for comp in msg.data.main_components]
if "ra" in components or "Lon" in components:
# linking currently removes any markers, so we want to skip
# linking immediately after new markers are added (see imviz.helper.link_image_data).
# Eventually we'll probably want to support linking WITH markers, at which point this
# if-statement should be removed.
return
Copy link
Member Author

@kecnry kecnry Oct 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a temporary workaround to mimic the previous behavior where markers handle linking themselves within add_markers and are removed when future calls are made to linking. Follow-up effort to remove this filed as #1749.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pllim - if you have suggestions for a better way to detect if msg.data (which should be the new data collection entry object) is a markers entry as opposed to an image, let me know!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably ok for now, especially given we'll revisit this in the future.

Comment on lines -1008 to +1009
clear_other_data=False):
visible=True, clear_other_data=False):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we instead have visible at the end of kwargs? This is in app, so shouldn't be considered as API-breaking, but if anywhere else relies on the positional order of clear_other_data, this could cause problems.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you stated, it is safer to put visible at the end. Is there a reason why it cannot be after clear_other_data?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just think that it intuitively makes sense to be in this order (visible seems more directly tied to the other inputs and action of the method, whereas clear_other_data is more loosely connected and probably less-often to be used). But if API breaking concerns warrants switching the order, I can do that.

@codecov
Copy link

codecov bot commented Oct 19, 2022

Codecov Report

Base: 87.79% // Head: 87.81% // Increases project coverage by +0.02% 🎉

Coverage data is based on head (f7c729c) compared to base (7e61c63).
Patch coverage: 92.72% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1742      +/-   ##
==========================================
+ Coverage   87.79%   87.81%   +0.02%     
==========================================
  Files          95       95              
  Lines       10149    10167      +18     
==========================================
+ Hits         8910     8928      +18     
  Misses       1239     1239              
Impacted Files Coverage Δ
jdaviz/core/template_mixin.py 92.20% <ø> (-0.03%) ⬇️
jdaviz/configs/imviz/helper.py 96.33% <83.33%> (-0.36%) ⬇️
jdaviz/app.py 93.78% <90.90%> (+0.16%) ⬆️
...nfigs/imviz/plugins/links_control/links_control.py 100.00% <100.00%> (ø)
jdaviz/core/events.py 94.02% <100.00%> (+0.09%) ⬆️
jdaviz/core/helpers.py 69.61% <100.00%> (+1.72%) ⬆️
...imviz/plugins/aper_phot_simple/aper_phot_simple.py 91.78% <0.00%> (-0.28%) ⬇️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@kecnry kecnry marked this pull request as ready for review October 19, 2022 21:11
Comment on lines +906 to +905
if data_label in self.data_collection.labels:
warnings.warn(f"Overwriting existing data entry with label '{data_label}'")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

related: #1709

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this even possible since this is returned by data_label = self.return_unique_name(data_label) above?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is needed to mimic the previous behavior in the test_case_that_breaks_return_label test introduced in #1672

Copy link
Contributor

@pllim pllim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moves the responsibility of data-linking in Imviz from load_data to the plugin - so that new data that are added will respect the user set linking options and not revert to "pixels". (NOTE: do we need a fallback for the case where the plugin isn't loaded?)

Is this statement still true? If so, it is a concern because someone might have a image viewer config without that plugin loaded, especially given now Brett Morris has made the config so flexible.

Comment on lines +906 to +905
if data_label in self.data_collection.labels:
warnings.warn(f"Overwriting existing data entry with label '{data_label}'")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this even possible since this is returned by data_label = self.return_unique_name(data_label) above?

Comment on lines -1008 to +1009
clear_other_data=False):
visible=True, clear_other_data=False):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you stated, it is safer to put visible at the end. Is there a reason why it cannot be after clear_other_data?

f'`load_data` method or similar.')
raise ValueError(
f"No data item found with label '{data_label}'. Label must be one "
"of:\n\t" + "\n\t".join(dc_labels))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"of:\n\t" + "\n\t".join(dc_labels))
f"of:\n\t{'\n\t'.join(dc_labels)}")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reverted this since it wasn't introduced here (only moved) and your suggestion failed code style checks.

@pllim
Copy link
Contributor

pllim commented Oct 20, 2022

would we prefer this as one or two (or three) PRs?

Since they all seem related, I think one PR is fine.

should Mosviz's app.auto_link logic be migrated to use this internally?

Is there any performance improvement to do this? Even so, can be a follow-up PR?

If not, should this be moved from the general helper to the Imviz helper?

I don't think so. What if we change our mind and want other viz to use it in the future? We don't want to be moving code back and forth.

do we want to add support for imviz.load_data(files, ...) (where files is a list of filenames or objects) that makes use of this internally?

Can be deferred as future PR. Loading list of files as comma-separated string is already supported (did you try batch_load with this use case?).

filelist = data.split(',')

@kecnry
Copy link
Member Author

kecnry commented Oct 21, 2022

moves the responsibility of data-linking in Imviz from load_data to the plugin - so that new data that are added will respect the user set linking options and not revert to "pixels". (NOTE: do we need a fallback for the case where the plugin isn't loaded?)

Is this statement still true? If so, it is a concern because someone might have a image viewer config without that plugin loaded, especially given now Brett Morris has made the config so flexible.

That was kind of my concern as well.... although the new flexible config would arguably work better with this setup since the old logic was in the Imviz helper (not the viewers) and so would need to be reproduced manually by the user (or the helper would need to be subclassed). Here, loading the plugin is now sufficient to apply image linking in any viewer. But if we want to support the use-case of the Imviz helper without the linking plugin, then I can add some logic to check that and just fallback on pixel linking in the helper.

@pllim
Copy link
Contributor

pllim commented Oct 24, 2022

If this introduces the warnings though, then we might want to wait until we can pin the glue-jupyter update?

Maybe safer to wait for upstream. Even if this doesn't "introduce" per-se, looks like it is making the warning more common?

@kecnry
Copy link
Member Author

kecnry commented Oct 24, 2022

ok, I'll move to draft for now and we can make a decision once we get an ETA on the glue-jupyter release timing. Thanks for the reviews everyone!

@kecnry kecnry marked this pull request as draft October 24, 2022 18:16
@pllim
Copy link
Contributor

pllim commented Oct 24, 2022

Trying this out in the imviz_load_3d_slices.ipynb concept notebook + dev glue-jupyter, I have some comments:

  • You will need BUG: Fix baby shark loading in Imviz concept notebook for image slices #1772 for Baby Shark to load properly. Baby Shark does seem more responsive!
  • Your patch fixed the second and third use-case in that notebook. In current main, the slices are not being set to visible on load, which is annoying.
  • However, your patch un-fixed the first use-case. Currently on main, all the slices are set to visible but with your notebook, it only shows the first slice, not all of them. See screenshot below. Is this the behavior change you were talking about? Kinda annoying to have to manually press the "+" several times to see all the slices again.

Screenshot 2022-10-24 143331

@kecnry
Copy link
Member Author

kecnry commented Oct 24, 2022

@pllim - yes those are the changed behaviors I noticed as well. I agree that the expected behavior probably would be to always have all the slices enabled, by default, although I am honestly not sure why the changes here would revert that behavior in the last case (although I didn't dig too deeply into the code logic since I wasn't sure the expected behavior or if that was an officially supported use-case).

@pllim
Copy link
Contributor

pllim commented Oct 24, 2022

In Imviz, I don't think a user would ever want to load something and not have it blink-able. 🤔

although I am honestly not sure why the changes here would revert that behavior in the last case

It is also curious that in main, it now behaves the opposite. So that use case somehow is using reverse logic from the other two. 🤯

@kecnry kecnry force-pushed the batch-load branch 2 times, most recently from cd74915 to 3ce9a81 Compare October 25, 2022 13:13
@kecnry
Copy link
Member Author

kecnry commented Oct 25, 2022

@pllim - I think I have fixed the cases in the concept notebook (by delaying the loading into viewer for any new label that results from a call to load_data, rather than expecting just a single new label). Please let me know if it behaves as you'd expect now.

@pllim
Copy link
Contributor

pllim commented Oct 25, 2022

@kecnry , I can confirm that your updated PR fixed all 3 use cases of loading slices into Imviz in the concept notebook.

For fun, I tried to load more frames from the GIF into Imviz (66 slices!) by changing % 10 to % 2 in the code. I still get "IOPub message rate exceeded" even with the "performance fix" in glue-jupyter. Blinking is real laggy, and I broke your labeling (screen show below). Fixing this is out of scope; I just thought you might be interested to know...

Screenshot 2022-10-25 123059

Copy link
Contributor

@pllim pllim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't grok all the diff but seems to work for Imviz. Baby Shark is happy.

kecnry and others added 11 commits October 26, 2022 13:55
* plugin listens for new data added to DataCollection, if within batch_load, linking is skipped.  
* plugin also listens to exiting batch load and forces data to re-link to the plugin settings.
* add_markers has its own linking logic and calls to linking will currently erase any existing markers.  Until that logic is changed/fixed, we want to skip linking immediately after adding markers to avoid them being removed as soon as they're added.
* to get performance updates from minimizing unnecessary messages/events
@kecnry
Copy link
Member Author

kecnry commented Oct 26, 2022

RTD failure is just because the build started before the pip release of glue-jupyter was available (but I can re-force push to force a rebuild if anyone thinks necessary - it was passing before bumping glue-jupyter though)

@kecnry kecnry marked this pull request as ready for review October 26, 2022 18:23
@pllim
Copy link
Contributor

pllim commented Oct 26, 2022

I restarted RTD build. FYI 🤞

@kecnry kecnry merged commit 56de4f8 into spacetelescope:main Oct 26, 2022
@kecnry kecnry deleted the batch-load branch October 26, 2022 18:32

[data] = [x for x in self.data_collection if x.label == data_label]
color_cycler = ColorCycler()
viewer.add_data(data, percentile=95, color=color_cycler())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To do/fix soon: this implementation always produce the first color in the cycler, for all data. I'll take a look!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api API change embed Regarding issues with front-end embedding imviz Ready for final review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants