Running Curator under SLURM Cluster #531
Comments
NeMo-Curator should work on SLURM clusters in both single-node and multi-node setups. It can be run with NeMo-Run, which wraps some bash scripts; those scripts can also be used to set up the cluster manually. cc: @ryantwolf, if you want to add anything.
Thanks @ayushdg, I just needed confirmation on that.
Yeah. Independently, I'm also probably going to build a better integration that does more than just wrap CLI scripts. I'll let you know when I open a PR for it.
@ryantwolf Just a quick question: is there a tutorial in NeMo-Curator on running Curator on a SLURM cluster with NeMo-Run?
We have this example: https://github.com/NVIDIA/NeMo-Curator/tree/main/examples/nemo_run
Thanks for that! It is much simpler than what I was expecting. One thing, though: I won't be using Docker containers right now. I will in the future, once I have a better handle on NeMo and NeMo-Curator. The script there references the Docker container, but since I am going to do a local install of NeMo-Curator on everything (via the pip method), I assume I should remove all of the lines that reference the location of the Docker container.
Hello all,
I have a quick question. I just want to make sure that my workflow and my installation path are correct.
I want to run the entire NeMo framework/ecosystem on a SLURM cluster. However, the README states that Curator should be run with the Framework Launcher.
Because I would like to get a head start on using NeMo 2.0, I am leaning toward working with NeMo-Run.
Before I start setting everything up, I wanted to verify that Curator can still run on a SLURM cluster and with NeMo-Run.