-
Notifications
You must be signed in to change notification settings - Fork 291
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Provide an update_display message #209
Comments
Alternative - top level key: {
"content": {
"data": {
"text/plain": "[== ]"
},
"display_id": "0000-1111-2222-3333"
}
} This makes it so the |
Another suggestion from @minrk: we could have another entry in content for all transient/live data: {
"content": {
"data": {
"text/plain": "[== ]"
},
"transient": {
"display_id": "0000-1111-2222-3333"
}
}
} |
The jupyter js widgets currently use transient data to get bootstrapped on the page by setting the model id in a mimebundle {
"application/vnd.jupyter.widget": model_id
} The transient section seems like a good use case for those as well. |
I like not persisting the display-id, so the top-level key or transient area makes sense to me. These don't really make sense as persisted info (much like it doesn't make sense to persist widget model IDs in the document). Related details from side conversation:
Further questions:
|
Now that I'm thinking about model IDs for jupyter widgets, it would be handy to have the transient/live dict/map available in general on all the messages that have mime bundles. |
Well that one leaves me a bit stumped. I'm not sure what the desired behavior is. |
I think it depends a bit on how we define the semantics of update. Do we allow publishing deltas, or is it always replacing the whole thing? If it's always replacing the whole thing, then it seems like the simplest modification would be a regular display_data message with the following modifications to handling: if display_id is defined:
if display_id found on the page:
update match in-place
else:
regular display, record display_id for future lookup
else:
regular display |
Ah, I like that. I'm definitely thinking of replacing the whole thing. Doing it on the same I feel like deltas belong in comms or another message type. |
Yup, there's always comms for the complex case. At the protocol level, we could already have this, because each display_data already has an id: the msg_id. But in practice this is difficult to locate, as it's not typically available in the kernel's APIs. |
While this repo only defines protocols, it might be worth considering what IPython APIs could look like for updating an output, since you would have to create and track an ID somehow. Sending raw messages is always an option, but presumably we want to cover at least some use cases without that. |
The reason of course that I can't necessarily def set_parent(self, parent):
"""Set the parent header for associating output with its triggering input"""
self.parent_header = parent
self.displayhook.set_parent(parent)
self.display_pub.set_parent(parent)
if hasattr(self, '_data_pub'):
self.data_pub.set_parent(parent)
try:
sys.stdout.set_parent(parent)
except AttributeError:
pass
try:
sys.stderr.set_parent(parent)
except AttributeError:
pass |
Thinking on the IPython API for this: bundle = {
'text/html': '<b>whoa</b>',
}
IPython.display.display(bundle, raw=True, transient={'id': '1234'}) |
I would imagine we would also want a higher level API, which I envisage something like this: from IPython.display import display, HTML, Updatable
d = Updatable(HTML('<b>whoa</b>'))
display(d)
# ...
d.replace(HTML('<i>awesome</i>')) |
❤️ Ah, I'm liking that. Could have a traitlets based version too: updatable.data = HTML('<i>awesome</i>') |
I thought about that interface too, but if I'm doing something primarily for the side effect I prefer it to be a function/method call rather than attribute/item assignment. |
I prefer the function/method call as well. |
I have a prototype implementation working: and came up with a few questions in the meantime. The main one is: Should That determines where the display_id information might be stored/looked up during the display-formatter process. I'm currently inclined to say no, that objects can't define and store their display_id such that a naked Another question is to decide between two semantic behaviors:
The key difference for me is what should happen if someone thinks they are updating something, but the original display has been cleared. If the answer is that nothing should be displayed, then explicit update seems like the way to go. If it should get displayed, then it seems like display-with-id is the right choice. For the first case, I would think the implementation would be: display(obj, display_id='xxx')
display(obj_again, display_id='xxx') where the protocol change is adding the which for me is "Put me on the page, but don't make duplicates". The second might look like: display(obj, display_id='xxx')
update_display(obj_again, display_id='xxx') and the protocol change is the same as above, also adding an which is more "1. put me on the page, no matter what, 2. update me if I'm on the page, otherwise do nothing" I can honestly imagine both being surprising, depending on user expectations, but I have a slight leaning toward the first one. |
What usecases are we envisioning for updating output? So far, I see in the discussion above:
I could also imagine simple live charting for something like matplotlib. Also, it seems that the discussion is about updating any display area on a page. Another thought that satisfies the progress bar case is to scope updates to messages with the same parent header, so that the display id namespace is scoped to a single execution request. We do run into issues with how parent headers are treated in threaded computations, but if we solve that, it feels a bit cleaner. You could use an output widget for the more advanced usecase of updating outputs on other areas of the page. |
Beyond live charting, there's also the simpler precursor: updating a single plot. |
For both the progress bars and updating a single plot, I lean towards the display/update-only interaction. Especially if the output has been cleared - I don't want to keep seeing something pop back up. And especially not popping up in my current cell output when the original one was a few cells away. In what situations does your situation one, the 'display without duplicates' make sense? |
I don't think so. If you need to do this, you just (1) keep the reference to the displayed object (2) update that reference and (3) re-display, right?
👍 from me. To be clear, what happens if you make multiple Is it OK to reuse the |
In jupyter/jupyter#212 (comment), I mentioned @mariusvniekerk and I would comment on options for a better Spark progress bar in the spylon library. We discussed the message and APIs proposed here today, and both agree they look like a nice way to render richer progress information, regardless of the final implementation chosen. Great stuff. 👍
From a user's standpoint, the answer might depend on the context. For example, say I kick off a long running Spark job using a library capable of showing progress during execution. Assume my kernel remains busy throughout. If I clear the notebook output during execution, I'd like the progress bar to reappear on the next update. My Spark job is still running and my kernel is still busy, and so I'd like my notebook display to reflect that fact. Now imagine I'm using a library that updates plots of streaming data using a background thread in the kernel. If I clear my cell outputs in this scenario, I do not want my plots to reappear again. I'm using clear as a convenient way of stopping all plot updates instead of having to both clear and run an additional cell to tell the library to stop making updates. I think combining @minrk's two API options above can satisfy the two use cases. In the progress bar scenario, a library would use:
repeatedly to create, update, or recreate the progress bar view whether the user clears the notebook or not. In the plotting scenario, a library would use:
to create the initial display area for the plot, and thereafter only update that display for as long as it exists. |
On a related note, having
The third step requires use of comm messages and reasonably complex libraries wrapping them today (e.g., ipywidgets, declarativewidgets). I imagine use of |
Excellent user/developer story! It looks like the third step doesn't require comm messages - only bookkeeping around display IDs. |
I don't think it will work to scope updates to a single parent even for progress bars, because we are already seeing progress bars that are updated in the background in places like dask. If we do scope it to parents in the protocol, that really means that we need to track parents in the kernel so they can be swapped out to update, which seems more complicated than tracking display-id updates in the frontend.
The So the current proposal seems to be:
A progress bar API would require using methods in order to track display_id: class ProgressBar(object):
def display(self):
self._display_id = new_id()
display(self, display_id=self._display_id)
def update(self):
update_display(self, display_id=self._display_id) Another question:
|
I'd like #2 to be "equivalent to an update" to handle the progress bar use case I mentioned above. Without this behavior, the library would need to detect that the display was cleared, generate a new display ID, and then go back to updating it. Libraries that don't want the automatic recreation behavior can use |
@parente sounds good to me. |
How are you sending display data with a transient dict from the IPython side? |
PRs for IPython implementation: ipython/ipykernel#204 ipython/ipython#10048 |
Detail that hasn't been brought up yet: |
With the implementation in the PR for nteract, it would be super easy to do this for |
I think it's the same for nbclassic. It's probably on the IPython side where it would be hardest to assign a display_id to a result. |
@RgBrk do you think that this protocol could be implemented on the top of comms? |
Anything can be implemented on top of the comms with the right receiver on the frontend side. This provides a path for one way updates that are persisted to the notebook that any kernel author can implement in a fairly simple fashion. |
Mentioned in chat with Min:
|
#225 is ready for review |
I recently implemented a Progress Monitor for Spark Jobs but had to work around 2 issues:
|
Yeah, the Spark status is definitely a complex nested type. Would this be a decent manifest of what the layout/mapping is? // array of jobs
[
// job
{
id: jobID,
stages: [
{ id: stageID, progress: { capacity: 5, value: 5 } } }
]
}
] As for why it's a replacement, all the current mimetypes in the notebook are pure strings when sent across and there's not a current diffing strategy. For custom mimetypes I could imagine us handling deltas. My personal opinion for this one is that we can have either a mimetype for progress or spark info that would get rendered appropriately as the JSON data changes (allowing something like React or Phosphor handle progressive renders that keep state for users) or have a mimetype that is more amenable to diffing and rendering on a virtual DOM. I'll post some more thoughts on this in a little bit. |
Borrowing Databrick's screenshot. They have a really nice way to report progress of jobs and stages. About the renderer approach, is it possible for a renderer maintain to maintain the folding states on update_display? Can renderers persist their folding states into the notebook? |
Yeah. I can make a prototype right now.
No. |
@rgbkrk Yes, the layout you describe is correct, although there are more info that goes in each buckets. Also there can be a lot of jobs, stages, tasks so I've been thinking about an minified overall progress bar that aggregate all the steps with an option for the user to expand individual jobs. Probably fancier than what was originally thought, I know :-) Along the same lines, execution progress bar is a generic problem. I think it would be beneficial to have a generic execution progress bar that is more natively integrated to the cell (e.g. small), with a simple update API and perhaps a cancel button. |
Let's open up a new issue for exploring the progress bars since this issue (updating displays) is done. I made a bit of progress on a virtual DOM renderer that could maintain state, need to keep iterating. |
Closing this issue since we've checked all the boxes across ipykernel, ipython, jupyter/notebook, jupyter_client, and nteract. |
The purpose of this message would be to provide an update to a display_data field, without need of full duplex communications.
@ellisonbg suggested that this would make it so that progress bars can be implemented without needing full widgets/comms support. That's exactly the kind of feedback loop I want to be able to provide to users as well.
For me, I want this for informing about any background threads or processes including spark jobs, database queries, and external processes (see ipython/ipython#9969).
Potential approach
On
display_data
, providetransient
field to identify a display_id:Followed by an
update_display_data
:In the notebook, this should also result in storing the last result of the update.
The text was updated successfully, but these errors were encountered: