Avoid server restarting when disconnecting from browser #422
Comments
Hi, I wanted to ping this again. Tagging @oshadura
@fengpinghu could you please take a look at this issue? Honestly, it should not be an issue: I checked with @jthiltges and @clundst, and restarting flux should not trigger a server shutdown either. The Zero to JupyterHub developers suggest checking the "culling" setup: https://z2jh.jupyter.org/en/latest/jupyterhub/customizing/user-management.html
Indeed, we have the default culling setup enabled for the Coffea-Casa instances running at UC AF. This means that if a notebook kernel remains idle for approximately one hour, the server will be stopped. While the in-memory session will be lost, notebooks are automatically saved at regular intervals (checkpoints). You should be able to recover your last saved state when you log in again.
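For readers unfamiliar with idle culling, here is a minimal Python sketch of the decision described above. It is only an illustration, not the actual culler code used by JupyterHub or by the UC AF deployment; the function name `should_cull` and its parameters are hypothetical, and the one-hour default simply mirrors the approximate timeout mentioned in this thread.

```python
from datetime import datetime, timedelta, timezone


def should_cull(last_activity, now, timeout=timedelta(hours=1)):
    """Return True if the server has been idle for at least `timeout`.

    Hypothetical helper for illustration only: a real culler would read
    the last-activity timestamp from the hub rather than take it as an
    argument.
    """
    return (now - last_activity) >= timeout


# Fixed timestamps so the example is reproducible.
now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
idle_since = now - timedelta(hours=2)

# Idle for two hours, checked against the assumed one-hour default.
print(should_cull(idle_since, now))
# Same idle period, checked against a longer, 24-hour timeout.
print(should_cull(idle_since, now, timeout=timedelta(hours=24)))
```

The point of the sketch is that the decision depends on recorded activity rather than on whether a browser tab is open, so a longer timeout directly lengthens how long an apparently idle server survives.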
Thanks for following up @fengpinghu! We also have several people using terminals instead of notebooks; would it be possible to have such a recovery mechanism for terminals too? (I'll let @marcus-vgr comment on whether this is the use case he's thinking of.)
Thanks @sebastien-rettie for the clarification.
Hi @fengpinghu, sorry, I am a bit confused. We connect to UChicago via https://coffea-dev.af.uchicago.edu and run a "simple" Python script in the terminal. However, it can take many hours (perhaps about a day) to finish when we have to run over many datasets / variations. From what I understood, if the server disconnects after some idle time, the Python script is killed. How can we make sure the job continues running even though we are no longer connected? Would we need to set up the job via e.g. HTCondor? If yes, would we need some extra configuration to ensure the Dask cluster is used properly?
Hi @marcus-vgr, from your description, it sounds like you'd like to keep the Dask cluster alive and run a Python script from the terminal that interacts with it. In that case, the best option would be to extend the notebook server timeout so that the server keeps running even if it appears idle. While we don’t want it to run indefinitely, would extending it to 24 hours work for your use case?
Hi @fengpinghu, yes, I think this would work perfectly! Is this something configured globally by you folks, or can we users modify it for our personal use cases?
Hi @marcus-vgr, great. It's already configured globally. You don't need to do anything. Let us know if you encounter any issues. Thanks!
To what extent are the coffea-casa images dependent on one staying “connected” via the web browser? E.g. if I start a process/command in the terminal from a web browser, how long can I stay away/disconnected from the web browser before the process is killed/shut down? I’ve had both cases on the UChicago cluster: cases where I close my browser, reopen https://coffea-dev.af.uchicago.edu/hub/login and the process is still running, but also cases where I log back in and it asks me to log in/restart the server. So I’m wondering at what point this actually happens?
Chatting with @oshadura, the duration should be 2 weeks, but a few reports seem to indicate this is not the case currently (at least on the UChicago cluster).