
[bug]: vram does not seem to be freed when switching models in v5.6.0rc2 #7556

Open
The-Istar opened this issue Jan 14, 2025 · 2 comments
Labels
bug Something isn't working

Comments

@The-Istar

Is there an existing issue for this problem?

  • I have searched the existing issues

Operating system

Linux

GPU vendor

Nvidia (CUDA)

GPU model

RTX 3090

GPU VRAM

24GB

Version number

v5.6.0rc2

Browser

The one from the Launcher. The system also has Firefox 134.0.

Python dependencies

{
"accelerate": "1.0.1",
"compel": "2.0.2",
"cuda": "12.1",
"diffusers": "0.31.0",
"numpy": "1.26.4",
"opencv": "4.9.0.80",
"onnx": "1.16.1",
"pillow": "11.1.0",
"python": "3.11.11",
"torch": "2.4.1+cu121",
"torchvision": "0.19.1",
"transformers": "4.46.3",
"xformers": null
}

What happened

When generating an image with one model and then switching to another, the VRAM used by the previous model does not seem to be freed. After switching models a few times, the VRAM fills up and an OOM error pops up.
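
As a rough way to check whether the allocator is really holding on to the old model's memory, here is a minimal standalone sketch (not InvokeAI code; the device index and the large tensor are assumptions, the tensor being just a hypothetical stand-in for model weights):

import gc
import torch

DEVICE = torch.device("cuda:1")  # assumption: the device configured in invokeai.yaml

def report(tag: str) -> None:
    # memory_allocated = live tensors, memory_reserved = what the caching allocator still holds
    alloc = torch.cuda.memory_allocated(DEVICE) / 2**20
    reserved = torch.cuda.memory_reserved(DEVICE) / 2**20
    print(f"{tag}: allocated={alloc:.0f} MiB, reserved={reserved:.0f} MiB")

# ~2 GiB stand-in for a loaded model
weights = torch.empty(1024, 1024, 1024, dtype=torch.float16, device=DEVICE)
report("after load")

# Drop the reference and ask the caching allocator to release its cached pages
del weights
gc.collect()
torch.cuda.empty_cache()
report("after free")

If the "reserved" number for the GPU in question stays at several GiB after a model switch in the real process, something is still holding a reference to the previous model.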

What you expected to happen

VRAM to be freed to allow loading of the new model

How to reproduce the problem

Switch models.

Additional context

In case it is relevant, the invokeai.yaml file:

# Internal metadata - do not edit:
schema_version: 4.0.2

# Put user settings here - see https://invoke-ai.github.io/InvokeAI/configuration/:
enable_partial_loading: true
device: cuda:1

Discord username

No response

The-Istar added the bug label on Jan 14, 2025
@The-Istar
Author

Just tested in RC4 and it still happens there as well.

@The-Istar
Author

After doing some more testing, it seems this line is the issue:
device: cuda:1

I wanted to run my generations on my second GPU, since it has the most VRAM available and is also slightly faster, but then the issue occurs. If I run generations without this line, and thus on my primary GPU, everything is fine and I can switch models without issues.
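
Purely as an illustration of where a non-default device index can trip things up (this is speculation, not InvokeAI's actual cache code): the torch.cuda memory queries are per device and default to the current device, which stays cuda:0 unless it is explicitly changed, so any free-VRAM accounting done without passing the configured device would be looking at the wrong GPU:

import torch

configured = torch.device("cuda:1")  # the device set in invokeai.yaml

# mem_get_info() defaults to the current device (cuda:0 unless
# torch.cuda.set_device() has been called), so these two can disagree:
free_default, total_default = torch.cuda.mem_get_info()    # cuda:0
free_cfg, total_cfg = torch.cuda.mem_get_info(configured)  # cuda:1, where the models actually live

print(f"cuda:0 free: {free_default / 2**30:.1f} / {total_default / 2**30:.1f} GiB")
print(f"cuda:1 free: {free_cfg / 2**30:.1f} / {total_cfg / 2**30:.1f} GiB")

# The caching-allocator statistics behave the same way:
print(torch.cuda.memory_allocated())            # current device only
print(torch.cuda.memory_allocated(configured))  # the configured GPU

If something like that is what is going on, an untested workaround might be to launch with CUDA_VISIBLE_DEVICES=1, which makes the second GPU show up as cuda:0 inside the process so the device setting can be left at its default.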
