{ai}[foss/2022b] PyTorch v1.13.1 w/ CUDA 12.0.0 #18806
base: develop
…hes: PyTorch-1.13.1_add-cuda12-compat.patch, PyTorch-1.13.1_disable-test-sharding.patch, PyTorch-1.13.1_fix-flaky-jit-test.patch, PyTorch-1.13.1_fix-fsdp-fp16-test.patch, PyTorch-1.13.1_fix-fsdp-tp-integration-test.patch, PyTorch-1.13.1_fix-gcc-12-missing-includes.patch, PyTorch-1.13.1_fix-gcc-12-warning-in-fbgemm.patch, PyTorch-1.13.1_fix-kineto-crash-on-exit.patch, PyTorch-1.13.1_fix-numpy-deprecations.patch, PyTorch-1.13.1_fix-protobuf-dependency.patch, PyTorch-1.13.1_fix-pytest-args.patch, PyTorch-1.13.1_fix-python-3.11-compat.patch, PyTorch-1.13.1_fix-test-ops-conf.patch, PyTorch-1.13.1_fix-warning-in-test-cpp-api.patch, PyTorch-1.13.1_fix-wrong-check-in-fsdp-tests.patch, PyTorch-1.13.1_increase-tolerance-test_jit.patch, PyTorch-1.13.1_increase-tolerance-test_ops.patch, PyTorch-1.13.1_increase-tolerance-test_optim.patch, PyTorch-1.13.1_install-vsx-vec-headers.patch, PyTorch-1.13.1_no-cuda-stubs-rpath.patch, PyTorch-1.13.1_remove-flaky-test-in-testnn.patch, PyTorch-1.13.1_skip-failing-grad-test.patch, PyTorch-1.13.1_skip-failing-singular-grad-test.patch, PyTorch-1.13.1_skip-test-requiring-online-access.patch, PyTorch-1.13.1_skip-tests-without-fbgemm.patch
Same for me.
I made an attempt at a CUDA 11.7 version for 2022b: #18853
Test report by @SebastianAchilles
…asyconfigs into 20230918171336_new_pr_PyTorch1131
@SebastianAchilles After the merge of develop, one added patch is no longer included in the PR, so you need to update your local repo(s).
Test report by @SebastianAchilles
Is it possible that your cluster also doesn't support CUDA 12? Otherwise I don't understand why it would skip those tests. Check the log for a message along the lines of "it didn't find any CUDA devices".
The NVIDIA driver on this machine is new and
That is why I assume that it should support CUDA 12.
In the EasyBuild log file I only found a few
Then I'm out of ideas, sorry :/
(created using eb --new-pr)
Our cluster doesn't support CUDA 12 yet (drivers too old), so I can't test this.
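For anyone unsure whether their drivers are new enough for CUDA 12, here is a minimal sketch of a version check. It assumes the minimum Linux driver for CUDA 12.0 is 525.60.13 (the value I recall from NVIDIA's CUDA release notes; please verify it for your platform) and that nvidia-smi is on the PATH. The helper names are hypothetical and not part of EasyBuild or PyTorch.

```python
import shutil
import subprocess

# Assumption: minimum Linux driver for CUDA 12.0 as listed in NVIDIA's
# CUDA release notes. Verify this value for your own platform.
MIN_DRIVER_CUDA_12 = "525.60.13"


def parse_version(text):
    """Turn a dotted version string such as '525.60.13' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_supports(driver_version, minimum=MIN_DRIVER_CUDA_12):
    """Return True if driver_version is at least the given minimum."""
    return parse_version(driver_version) >= parse_version(minimum)


def query_driver_version():
    """Return the driver version reported by nvidia-smi, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    lines = result.stdout.strip().splitlines()
    if result.returncode != 0 or not lines:
        return None
    # With several GPUs, nvidia-smi prints one line per device; use the first.
    return lines[0].strip()


if __name__ == "__main__":
    version = query_driver_version()
    if version is None:
        print("Could not query the NVIDIA driver (nvidia-smi missing or failed)")
    else:
        print(f"Driver {version}, new enough for CUDA 12.0: {driver_supports(version)}")
```

A driver that passes this check is necessary but not sufficient: the GPU tests can still be skipped if no CUDA device is visible to the build job.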