On Kaggle : libcusparse.so.12: undefined symbol: __nvJitLinkComplete_12_4, version libnvJitLink.so.12 #134929
Comments
This sounds to me like a great topic to ask at https://discuss.pytorch.org, though we should also extend collect_env to print information about |
Well, it was working when ComfyUI was not using Torch 2.4, but the error started after they moved to it. I think Kaggle still ships Torch 2.3 by default. Do you know how I can fix this issue? I tried many commands and none worked :/ Many people are waiting for me to fix this, if I can. Thank you so much |
Facing the same issue. It seems like the issue only occurs when using a notebook with CUDA 12.4. Both of these are workarounds though :/ |
How do we downgrade the notebook's CUDA version? |
@sarihl is there a document somewhere that I can read through that describes the installation process? I can run torch-2.4 from a Jupyter notebook just fine |
Do you run it inside Kaggle? |
Can you share a link to the notebook? I've tried running it and it seems to work fine for me: https://www.kaggle.com/code/malfet/check-torch-version |
Here is the notebook. To be able to see the error you need to connect via ngrok, install SwarmUI, and have it install the ComfyUI backend. It is actually easy and straightforward |
Side note, but @malfet, be aware that using Kaggle to run Diffusion WebUIs is against their ToS, so you might want to be careful with your Kaggle account when trying that. FurkanGozukara surely knows this too, since he commented on a thread there where a Kaggle staff member explains it: https://www.kaggle.com/discussions/product-feedback/440296 |
@willlllllio thank you for the warning. I wasn't aware of that. At this point, it does not seem like a PyTorch issue, but may be a bug with SwarmUI or whatever creates a custom environment that forces libtorch to link against the wrong nvjitlink. So I think |
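One way to check the "wrong nvjitlink" theory is to see which copies of the library sit on `LD_LIBRARY_PATH`, since the dynamic loader normally searches those directories before the default system locations. The following is only a minimal sketch: `find_nvjitlink_candidates` is a hypothetical helper name, not part of PyTorch or collect_env, and it looks only at `LD_LIBRARY_PATH`, not at rpaths or the loader cache.

```python
import os
from pathlib import Path


def find_nvjitlink_candidates(ld_library_path=None):
    """List libnvJitLink.so* files in LD_LIBRARY_PATH directories, in search order.

    Hypothetical diagnostic helper; it does not load any library.
    """
    if ld_library_path is None:
        ld_library_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for entry in ld_library_path.split(":"):
        if not entry:
            continue  # skip empty segments such as a trailing ':'
        directory = Path(entry)
        if directory.is_dir():
            hits.extend(sorted(directory.glob("libnvJitLink.so*")))
    return hits


if __name__ == "__main__":
    for path in find_nvjitlink_candidates():
        print(path)
```

If the first path printed belongs to a system CUDA install older than the one the torch wheel was built for, that would be consistent with the undefined-symbol error in the title.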
@albanD showed me a reproducer, we need to add |
I am having a similar issue, but using Paperspace VM, not Kaggle... |
FWIW I ran into this issue on a machine with system runtime CUDA 12.2 (as reported by My current workaround for now is to just downgrade to a version of torch for a previous version of CUDA:
|
This issue keeps persisting and looks serious. There is a related thread here: #111469 |
bumping priority for activity and the fact that we have a good idea how to fix |
I still run into this every day. I'm in a venv, so my solution was to add This is also the same issue as #111469, which is closed by the author because they were unblocked, but the issue didn't get resolved in the general case. My workaround is that I switch to the venv, then paste in:
LD_LIBRARY_PATH=$(python -c "import site; print(site.getsitepackages()[0] + '/nvidia/nvjitlink/lib')"):$LD_LIBRARY_PATH
I would put in the PR to fix it myself, but I'm not quite sure how the import resolves object files or where the code that searches for it is. |
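The one-liner in the comment above can also be written as a small script, which makes the path logic easier to inspect. This is a minimal sketch under the same assumption as the comment, namely that the pip-installed nvjitlink library lives under `nvidia/nvjitlink/lib` in the first site-packages directory; the function names are hypothetical.

```python
import os
import site
from pathlib import Path


def nvjitlink_lib_dir(site_packages=None):
    """Return the nvidia/nvjitlink/lib directory under site-packages.

    Assumes the layout used in the workaround above; pass site_packages
    explicitly to override the default lookup.
    """
    base = Path(site_packages) if site_packages else Path(site.getsitepackages()[0])
    return base / "nvidia" / "nvjitlink" / "lib"


def prepend_to_ld_library_path(lib_dir, current=None):
    """Build an LD_LIBRARY_PATH value with lib_dir first and no duplicates."""
    if current is None:
        current = os.environ.get("LD_LIBRARY_PATH", "")
    parts = [p for p in current.split(":") if p and p != str(lib_dir)]
    return ":".join([str(lib_dir), *parts])


if __name__ == "__main__":
    # Print an export line to run in the shell before starting Python.
    print(f"export LD_LIBRARY_PATH={prepend_to_ld_library_path(nvjitlink_lib_dir())}")
```

The script prints an `export` line rather than setting `os.environ`, because the variable generally has to be in place before the Python process starts for the loader to honor it.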
Does anyone want to try the latest nightly? It includes #141063 (which will be in the 2.6 release) and, to the best of my understanding, fixes the problem, though I have not tried reproducing it on Kaggle |
Hi @FurkanGozukara, can you please confirm this is fixed using the following install command? It should install the torch 2.6 release candidate:
|
I checked that the current nightly version fixes the issue.
Check on Nightly Version --> worked
Yes. It worked on
Install nightly version of pytorch
import pytorch --> worked
import torch
torch.__version__
Workaround for older version
Check NVCC version
Install pytorch that matches nvcc CUDA version
Set path to
|
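The first two steps of the older-version workaround above (check the nvcc version, then pick a matching PyTorch build) can be sketched roughly as follows. This is an illustration only: `cuda_tag_from_nvcc` and `wheel_index_url` are hypothetical helper names, and it assumes `nvcc --version` prints a "release X.Y" line. PyTorch does not publish wheels for every CUDA minor version, so the derived tag may need to be replaced by the nearest one that actually exists.

```python
import re


def cuda_tag_from_nvcc(nvcc_output):
    """Turn `nvcc --version` text into a wheel tag such as 'cu124'.

    Returns None when no 'release X.Y' line is found.
    """
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    major, minor = match.groups()
    return f"cu{major}{minor}"


def wheel_index_url(tag):
    """Build the PyTorch wheel index URL for a tag (the tag may not be published)."""
    return f"https://download.pytorch.org/whl/{tag}"


if __name__ == "__main__":
    # Sample text standing in for real `nvcc --version` output.
    sample = "Cuda compilation tools, release 12.4, V12.4.131"
    tag = cuda_tag_from_nvcc(sample)
    print(tag, wheel_index_url(tag))
```

The resulting URL is the kind of value passed to `pip install torch --index-url ...`.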
@bilzard thank you so much, I am glad this is getting fixed |
Could someone please confirm that it works with the latest release, 2.6:
And we can close this issue. |
@atalman I checked the latest release
import torch
torch.__version__
|
Closing this issue. Resolved in nightly and release 2.6 |
🐛 Describe the bug
I have tried everything but had no luck
Waiting for your input on what else to try
I tried torch 2.4.0 and 2.5-dev with cu118, cu121, and cu124; all give the same error
The code below reproduces it; I got the same error when using ComfyUI via SwarmUI
It gives the error below
Versions
cc @ezyang @gchanan @zou3519 @kadeng @msaroufim @seemethere @malfet @osalpekar @atalman @alexsamardzic @nikitaved @pearu @cpuhrsch @amjames @bhosmer @jcaip @ptrblck @eqy