exllama vs autogptq #39

surcyf123 · 2023-09-11T09:38:02Z

We need to switch to exllama; everything I'm reading says exllama is better. At least for production we will need to switch, since speed is everything at the inference volume we expect. Note: try vLLM too.

surcyf123 · 2023-09-11T09:51:11Z

https://oobabooga.github.io/blog/posts/perplexities/

"(updated) bitsandbytes load_in_4bit vs GPTQ + desc_act: load_in_4bit wins in 3 out of 4 tests"

surcyf123 · 2023-09-11T10:04:04Z

Oh, that's 2 months old.

surcyf123 added the low priority label Sep 11, 2023

surcyf123 assigned gee842 Sep 11, 2023

surcyf123 added high priority and removed low priority labels Sep 13, 2023
