
Report kernel metrics #31

Closed
jtpio opened this issue Apr 7, 2020 · 12 comments

Comments

@jtpio
Member

jtpio commented Apr 7, 2020

This was briefly mentioned in #22 (comment). Opening a new issue for better tracking.

It would indeed be really useful to track CPU and memory usage per kernel. The frontend could then query this data and display a more granular view of the resources being used.

This sounds like it should be doable when the kernels are local to the notebook server. But it might be slightly more complicated in the case of remote kernels.
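
For local kernels, a minimal sketch of what per-kernel collection could look like, assuming the notebook server can map each kernel to an OS process id (how that pid is obtained from the kernel manager is version- and deployment-specific, and the helper name here is illustrative):

```python
import psutil


def kernel_usage(pid):
    """Return CPU and memory usage for a local kernel process and its children."""
    try:
        proc = psutil.Process(pid)
    except psutil.NoSuchProcess:
        return None
    # Include subprocesses spawned by the kernel (e.g. multiprocessing workers).
    procs = [proc] + proc.children(recursive=True)
    rss = 0
    cpu = 0.0
    for p in procs:
        try:
            rss += p.memory_info().rss
            # Note: the first cpu_percent(interval=None) call returns 0.0; poll periodically.
            cpu += p.cpu_percent(interval=None)
        except psutil.NoSuchProcess:
            continue
    return {"pid": pid, "memory_rss_bytes": rss, "cpu_percent": cpu}
```

The frontend could then request one such entry per running kernel id and render it next to the corresponding notebook.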

@jtpio jtpio changed the title from "Report metrics per kernel" to "Report kernel metrics" on Apr 7, 2020
@jtpio
Member Author

jtpio commented Apr 8, 2020

cc @tommassino who has a branch with such changes: https://github.com/Tommassino/nbresuse/tree/kernel-metrics

Maybe we could check if this could be merged into nbresuse?

@krinsman
Contributor

I agree, this would be a really great feature to have in NBResuse.

For now I think it should be sufficient to state explicitly that the feature is only intended to track usage for "local" kernels. That's the most common use case, even when using JupyterHub where the server is remote but the kernels are local to the server. As long as we're upfront about the limitation, I don't imagine it should be too much of an issue.

If we get this set up for local kernels, then afterwards maybe we can try to get feedback from the people working on Enterprise Gateway about what they would think regarding support for remote kernels and/or how they think it could/would best be implemented. One step at a time though.

@jkleint

jkleint commented May 9, 2020

[Cross-posting from #13 at the request of @jtpio]

I'd suggest per-kernel metrics be the default. I came upon nbresuse hoping for more granular information than what top can already tell me. When per-notebook metrics are available, I feel total server usage is not a very useful default, for these reasons:

  1. Total process memory usage is already easily found in top or ps or any system monitoring utility, which is arguably the natural first "go to" place for memory information.
  2. Internal Jupyter- and notebook-specific resource usage cannot be found with general tools, and Jupyter is the first and obvious place to report such info.
  3. If your goal is to monitor bumping up against system limits:
    • This is dependent on the usage of other processes / users
    • Usage of Jupyter, other processes, and system limits are effectively reported by familiar tools
    • You would still want to know which notebook is hitting the limits and/or which notebook would give the biggest savings if shut down.
  4. It is confusing and perhaps misleading for all notebooks to report the same memory usage: this is a giveaway that nbresuse is either wrong or multiple-counting, and decreases confidence in it. Total usage would be better reported in a general server location like the File | Open page, rather than repeated in notebook-specific locations.
  5. It is difficult to monitor the change in usage for individual cells or pieces of code.

Ideally users could configure the metrics they would like to see, but I think per-notebook metrics make the most sense as the default.

@manics

manics commented May 11, 2020

top or ps is the default for experienced users, but Jupyter is often used by beginners or people less familiar with the command line. It may not be obvious that a notebook has failed due to running out of memory, since it will often hang or crash instead of printing an out-of-memory error.
Example use case: jupyterhub/binderhub#1097

Is it possible to show both metrics by default, or do you think that's too confusing?

@jkleint

jkleint commented May 15, 2020

Clicking to toggle between notebook and server usage seems like a win/win. I'd be happy with server usage being the default as long as individual notebooks remember my setting so I don't have to change it every time. In fact we might want to throw total system usage into the rotation too.

It seems like the bias is toward isolated single-user cloud environments, where all of the resources belong to you. Even if that is the common case, just keep in mind that being able to finger a single notebook as "the" reason you've run out of memory is in fact a very special case.

Even in that case, you can mysteriously run out of memory long before you hit the system total, because other processes use memory, too. Imagine a user on an 8 GB system with 1 GB used by OS processes. The user's notebook mysteriously crashes at 7 GB used, even though it says they have 8 GB available.

It seems like what you want is a better memory limit value: instead of the limit being total system memory, report the limit as notebook/server usage plus free memory. That way it dynamically accounts for the usage of other processes, and is what you actually have to worry about hitting. In our example, if the user was using 5 GB, with 2 GB free, we would show 5 / 7 GB used. This would probably get rid of a lot of the need to manually set limits as well.
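
To make that concrete, a rough sketch of the dynamic limit described above, assuming psutil is available on the server (the function name is illustrative):

```python
import psutil


def effective_memory_limit(current_usage_bytes):
    """Dynamic limit = what the notebook/server already uses + what is still free.

    For the example above: 5 GB used by the notebook, 2 GB still available on
    an 8 GB host (1 GB taken by other processes) -> report 5 / 7 GB, not 5 / 8 GB.
    """
    return current_usage_bytes + psutil.virtual_memory().available
```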

@nishikantparmariam

I would also find this feature useful - to have per kernel or per notebook resource usage.

@mlucool
Contributor

mlucool commented May 23, 2022

We have been working on this in https://github.com/Quansight/jupyterlab-kernel-usage - take a look and let us know what you think.

@jtpio
Member Author

jtpio commented Jun 1, 2022

Nice, thanks @mlucool for sharing.

Do you see jupyterlab-kernel-usage living in its own (separate) extension? Or maybe it could be integrated here as part of the jupyter-resource-usage extension?

cc @echarles

@mlucool
Contributor

mlucool commented Jun 1, 2022

In the short term, we think it makes sense to live in its own extension as we are still perfecting the experience. Down the line, it could make sense to integrate it back into this one or possibly into core.

@dclong

dclong commented Jun 3, 2022

@mlucool Does jupyterlab-kernel-usage work with Python kernels only, or does it support kernels of other languages as well? When I gave it a try (with JupyterHub instead of JupyterLab), kernels other than Python couldn't be started.

@echarles
Member

echarles commented Jun 4, 2022

@dclong jupyterlab-kernel-usage only works with IPython kernels (https://github.com/ipython/ipykernel).

There is the idea to normalize the resource usage request/reply via a JEP (Jupyter Enhancement Proposal) so that any other kernel could also implement the defined protocol, but that will be a long road.

The JupyterLab extension will work with ipykernel in any deployment (JupyterHub, ...).
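
For reference, a rough sketch of polling ipykernel's usage_request message over the control channel with jupyter_client; the reply's content keys vary across ipykernel versions, so treat the exact fields as an assumption:

```python
from jupyter_client.manager import start_new_kernel

# Start a local ipykernel and ask it for its resource usage over the control channel.
km, kc = start_new_kernel(kernel_name="python3")
try:
    msg = kc.session.msg("usage_request", {})  # control-channel message handled by ipykernel
    kc.control_channel.send(msg)
    reply = kc.control_channel.get_msg(timeout=5)
    # The reply content carries kernel CPU/memory and host-level stats;
    # key names differ between ipykernel versions.
    print(reply["content"])
finally:
    kc.stop_channels()
    km.shutdown_kernel()
```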

We are planning a new release next week. Please open any issue or feature request on https://github.com/Quansight/jupyterlab-kernel-usage/issues

@jtpio
Member Author

jtpio commented Apr 26, 2023

Closing as #163 has now been merged.

@jtpio jtpio closed this as completed Apr 26, 2023