SLURM 10 nodes good, 16 nodes error #178
Comments
This sounds to me like something that's specific to your cluster. Did you try any node counts other than 10 and 16? I'm most curious about 15...
15 nodes: similar error.
11 nodes: works fine.
13 nodes: cancelled due to the wall time I set (5 or 15 minutes, I'm not sure which), but I think that should be enough for a simple task like this.
I checked with the manager of our HPC: a regular user can only use 10 nodes at the same time. I will close the issue.
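For anyone hitting the same thing, here is a minimal sketch of a pre-submission check, assuming the 10-node per-user cap and the 24 CPUs per node mentioned in this thread. The names `MAX_NODES`, `CPUS_PER_NODE`, `total_cpus`, and `within_limit` are made up for illustration and are not part of Slurm or of this project; the actual limit on another cluster has to come from its administrators.

```shell
#!/bin/sh
# Assumed values taken from this thread, not queried from Slurm:
# MAX_NODES is the per-user cap reported by the HPC manager,
# CPUS_PER_NODE is the node size from the original post.
MAX_NODES=10
CPUS_PER_NODE=24

# Print the total CPU count for a given number of nodes.
total_cpus() {
    echo $(( $1 * CPUS_PER_NODE ))
}

# Succeed (exit status 0) if the node count fits under the cap.
within_limit() {
    [ "$1" -le "$MAX_NODES" ]
}

# Compare the two requests discussed in the issue.
for nodes in 10 16; do
    if within_limit "$nodes"; then
        echo "$nodes nodes ($(total_cpus "$nodes") CPUs): within the cap"
    else
        echo "$nodes nodes ($(total_cpus "$nodes") CPUs): exceeds the cap of $MAX_NODES nodes"
    fi
done
```

With these assumed values, the 10-node request (240 CPUs) passes the check and the 16-node request (384 CPUs) does not.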
I'm using an HPC cluster with Slurm.
On this cluster, every node has 24 CPUs, and I'm permitted to use 16 nodes simultaneously.
To test my code, I wrote a .sh file:
and a "1.2 th2testp.jl" file:
Then I get an error:
But when I change the job to use 10 nodes with 240 CPUs, the error disappears and I get the right answer.
What causes this?