caffe2_nvrtc.dll not found error despite correct requirements/Installation #76

Open
JustCodeIt4Head opened this issue Feb 15, 2025 · 7 comments


@JustCodeIt4Head

Your question

I'm currently experimenting a bit with generative AI and wanted to try ComfyUI-Zluda.
Despite my best efforts to troubleshoot the issue, I can't seem to get it up and running.
GPU: ASUS 7900XTX OC
CPU: 7950x3D

Dependency double-check:
Latest AMD GPU Driver:
Check

Python 3.10.11 (not MS Store, direct Install):
Image

pip cache purge(before install.bat, after install.bat):
Check

Visual C++ Runtime:
Image

HIP SDK 5.7.1:
Check (removed + installed it again)

System Variable HIP_PATH: (Also tried it with the value from the documentation with double backslash, same result)
Image

Bin folder in the Path System Variable:
Image

I also tried it with the environment variable "HIP_VISIBLE_DEVICES=1", since I have a 7950X3D and the iGPU is active.
I've also tried HIP 6.2.4 with patchzluda2, upgraded from the linked repo: same error (I removed 6.2.4 and the environment variable afterwards).

I don't have any more ideas for troubleshooting this issue since, as far as I understand, I installed everything correctly and meet all the prerequisites.
I'm running sdnext with ZLUDA without issues on the same machine, and I have a ComfyUI DirectML version in another folder that also runs without issues.
If someone has a hint about where I might have gone wrong or what might be causing this, that would be great.

Logs

Other

No response

@patientx
Owner

patientx commented Feb 15, 2025

This won't interfere/benefit sdnext with zluda. This is using the most basic approach, just unzips into the main folder, runs that file with the comfy as its argument. So, if everything installed correctly there shouldn't be a caffe dll error. Please check if the target directory has the dll files correctly switched with dll's from zluda as shown on readme.

Edit: Let's test whether ZLUDA is set up correctly with Comfy. Update your comfyui-zluda (just run comfyui.bat) and exit. Then open the comfyui-zluda folder on the command line. The easiest way is to open the folder in Windows Explorer, click on the address bar where it says "This PC > ...", type cmd and press Enter. This opens a command prompt in that folder. Now type:

venv\scripts\activate (press enter)
zluda\zluda.exe -- python zludatest.py (press enter)

Observe if it says zluda installed successfully or what kind of error it produces.

@JustCodeIt4Head
Author

JustCodeIt4Head commented Feb 15, 2025

I checked that as well.
I fixed it by disabling the whole device controller for my iGPU in the BIOS, in addition to forcing the primary video device there.
After that I reinstalled ROCm 5.7.1 and everything worked as expected.
It seems that ROCm / PyTorch ignores HIP_VISIBLE_DEVICES, according to GitHub issues ROCm/pytorch#1895 and pytorch/pytorch#140318; maybe this causes an anomaly during installation?
Interestingly enough, I had sdnext with ZLUDA running on my iGPU the whole time as well. Performance is now way better than with DirectML (which apparently also ran on my iGPU).
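
One way to see which GPU PyTorch actually picked up (for example, the iGPU rather than the discrete card) is to list the devices it exposes. This is only a minimal sketch, assuming it is run inside the activated venv and, for ZLUDA, launched through zluda.exe in the same way as zludatest.py. The helper name list_visible_devices is hypothetical; torch.cuda.is_available, torch.cuda.device_count and torch.cuda.get_device_name are standard PyTorch calls.

```python
def list_visible_devices(torch_module):
    """Return [(index, name), ...] for every GPU device torch exposes.

    torch_module is passed in explicitly so the helper can be exercised
    without importing torch at module load time.
    """
    cuda = torch_module.cuda
    if not cuda.is_available():
        return []
    return [(i, cuda.get_device_name(i)) for i in range(cuda.device_count())]


if __name__ == "__main__":
    try:
        import torch
    except (ImportError, OSError) as exc:
        # OSError covers DLL load failures such as WinError 126.
        print(f"torch could not be imported: {exc}")
    else:
        devices = list_visible_devices(torch)
        if not devices:
            print("torch reports no usable GPU device")
        for index, name in devices:
            print(f"device {index}: {name}")
```

If the name printed for device 0 is the integrated GPU, that would match the behaviour described above.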

@patientx
Owner

:) Inform them about this as well. Maybe check whether they have an alternative solution, because their user base is much bigger.

@Z3r0shin

Well, I have the same problem, but I don't have any kind of iGPU (my 5600X does not have one) and my only card is a 6800 XT dGPU.
No matter whether I reinstall ROCm 5.7.1, install/reinstall Python 3.10.11/3.11, etc., the same as @JustCodeIt4Head, it still outputs

 Traceback (most recent call last):
  File "D:\ComfyUI\main.py", line 134, in <module>
    import comfy.utils
  File "D:\ComfyUI\comfy\utils.py", line 20, in <module>
    import torch
  File "D:\ComfyUI\venv\Lib\site-packages\torch\__init__.py", line 141, in <module>
    raise err
OSError: [WinError 126] The specified module could not be found. Error loading "D:\ComfyUI\venv\Lib\site-packages\torch\lib\caffe2_nvrtc.dll" or one of its dependencies.

@patientx
Owner

Check every step again and again. Something always gets missed.

@JustCodeIt4Head
Author

Since you have no iGPU (which seems to cause some problems on X3D chips), just to double-check:
-> Have you set both variables in the "System" part of the environment variables?
-> Did you do a full install of ROCm?
-> Did you reboot after the ROCm installation?
-> Did you perform a pip cache purge before installation?
-> What Python version does the CMD command "python --version" return?
-> Did you install Python directly via the installer or via the Windows Store?
-> What PyTorch version is shown in the output of "pip list"?
-> Are there any files and folders present in the venv folder that gets created?
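
Several of the checks above can be collected in one go from inside the venv. The following is a rough sketch, not an official diagnostic. The function name collect_checks is made up for illustration, and treating a "WindowsApps" component in the interpreter path as a sign of a Windows Store install is a heuristic assumption.

```python
import sys
from importlib import metadata


def collect_checks(executable=None):
    """Gather a few of the checklist answers as a dict."""
    executable = sys.executable if executable is None else executable
    try:
        # Reads the installed package metadata without importing torch,
        # so it still works when the torch DLLs fail to load.
        torch_version = metadata.version("torch")
    except metadata.PackageNotFoundError:
        torch_version = None
    return {
        "python_version": ".".join(str(n) for n in sys.version_info[:3]),
        # Heuristic (assumption): Store installs live under a WindowsApps folder.
        "store_python": "windowsapps" in (executable or "").lower(),
        # True when the interpreter runs inside a virtual environment.
        "in_venv": sys.prefix != sys.base_prefix,
        "torch_version": torch_version,
    }


if __name__ == "__main__":
    for key, value in collect_checks().items():
        print(f"{key}: {value}")
```

Running it with the venv activated and pasting the output into the thread answers the Python and PyTorch version questions in one step.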

@reeda

reeda commented Feb 28, 2025

I had this issue and was able to find the solution by looking at vladmandic/sdnext#2940.
ROCm had not been added to the Path when I installed it. After adding it to the Path, I was able to load with no issues.
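
A quick way to verify this condition is to check whether the bin folder under HIP_PATH appears in the Path variable. The sketch below assumes the layout described earlier in this thread (a HIP_PATH system variable plus its bin folder on Path). The function name hip_bin_on_path is hypothetical, and ntpath is used so that comparisons follow Windows rules (case-insensitive, trailing backslashes ignored).

```python
import ntpath
import os


def _norm(path):
    # Normalise the way Windows compares paths: lower-case, no trailing separator.
    return ntpath.normcase(ntpath.normpath(path.strip().strip('"')))


def hip_bin_on_path(env=None):
    """Return (bin_dir, on_path); bin_dir is None when HIP_PATH is unset."""
    env = os.environ if env is None else env
    hip_path = env.get("HIP_PATH", "").strip()
    if not hip_path:
        return None, False
    bin_dir = ntpath.join(hip_path, "bin")
    entries = [e for e in env.get("PATH", "").split(";") if e.strip()]
    return bin_dir, _norm(bin_dir) in {_norm(e) for e in entries}


if __name__ == "__main__":
    bin_dir, on_path = hip_bin_on_path()
    if bin_dir is None:
        print("HIP_PATH is not set")
    elif on_path:
        print(f"{bin_dir} is on Path")
    else:
        print(f"{bin_dir} is missing from Path")
```

Note that a command prompt opened before the variables were changed still sees the old values, so run the check from a fresh prompt.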
