
[Feature]: Support hqq model on vllm-gptq #2

Open
Minami-su opened this issue Apr 15, 2024 · 0 comments

🚀 The feature, motivation and pitch

https://mobiusml.github.io/hqq_blog/

HQQ is a fast and accurate model quantizer that skips the need for calibration data. It's super simple to implement (just a few lines of code for the optimizer). It can crunch through quantizing the Llama2-70B model in only 4 minutes! 🚀
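For context, here is a minimal sketch of how a model is quantized with the hqq library, based on the hqq project's README around the time of this issue; the exact names (`HQQModelForCausalLM`, `BaseQuantizeConfig`, `quantize_model`) are assumptions if the API has since changed:

```python
# Illustrative hqq usage (API per the hqq README at the time; treat
# the exact class and method names as assumptions, not guarantees).
from hqq.engine.hf import HQQModelForCausalLM, AutoTokenizer
from hqq.core.quantize import BaseQuantizeConfig

model_id = "meta-llama/Llama-2-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = HQQModelForCausalLM.from_pretrained(model_id)

# 4-bit weights with group size 64; HQQ needs no calibration data,
# so quantization runs directly on the loaded weights.
quant_config = BaseQuantizeConfig(nbits=4, group_size=64)
model.quantize_model(quant_config=quant_config)
```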

I hope it becomes possible to run HQQ-quantized models on vllm-gptq.

Alternatives

No response

Additional context

No response
