Flex model almost working on 4GB vram #595
Replies: 2 comments 1 reply
-
Honestly, I don't quite understand why the T5 model is so heavy. The Flux model just has a very inefficient architecture in terms of storage.
-
This seems to mean the VAE is missing, which makes sense given how you're loading it. You should add these arguments to your command. Maybe then it could work with Q2 quantization at low resolution. Q4 works with 8GB, so hopefully it fits.
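A rough back-of-the-envelope sketch of why Q2 might squeeze into 4GB where Q4 needs 8GB. The parameter counts (~12B for the Flux transformer, ~4.7B for the T5-XXL text encoder) and the effective bits-per-weight figures for the GGUF quant types are assumptions, not numbers from this thread:

```python
def quantized_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight footprint: (params * bits / 8) bytes, in GB."""
    return params_billions * bits_per_weight / 8

# Assumed parameter counts (not stated in this thread):
FLUX_PARAMS_B = 12.0   # Flux.1 diffusion transformer, roughly 12B parameters
T5XXL_PARAMS_B = 4.7   # T5-XXL text encoder, roughly 4.7B parameters

# Approximate effective bits per weight for common GGUF quant types:
for name, bits in [("Q2_K", 2.6), ("Q4_K", 4.5), ("F16", 16.0)]:
    flux = quantized_size_gb(FLUX_PARAMS_B, bits)
    t5 = quantized_size_gb(T5XXL_PARAMS_B, bits)
    print(f"{name}: diffusion model ~{flux:.1f} GB, T5-XXL ~{t5:.1f} GB")
```

By this estimate the transformer alone is already close to 4GB at Q2 (and T5-XXL adds another GB or so), which is consistent with the thread's "almost works" on a 4GB card.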
-
Oops, I tried to load it with -m. The model is too big and slow for my setup, but this line almost works.