AWQ

A Bhat edited this page Apr 14, 2024 · 1 revision

About

Activation-aware Weight Quantization (AWQ)
AWQ is a hardware-friendly approach to low-bit, weight-only quantization of large language models (LLMs).
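
To make "activation-aware" and "weight-only" concrete, here is a minimal NumPy sketch, not the reference AWQ implementation. It assumes round-to-nearest, group-wise 4-bit quantization and a fixed scaling exponent `alpha`; the function names `quantize_dequantize` and `awq_style_quantize`, the group size, and the default `alpha=0.5` are illustrative choices that do not come from this page. The idea shown is that input channels with larger average activation magnitude are scaled up before quantization so they lose less precision, and the scale is undone afterwards. Only the weights are quantized; the activations are used purely as calibration statistics.

```python
import numpy as np


def quantize_dequantize(w, n_bits=4, group_size=8):
    """Round-to-nearest, asymmetric, group-wise "fake" quantization.

    w has shape (out_features, in_features); in_features must be
    divisible by group_size. Returns the dequantized float weights.
    """
    out_f, in_f = w.shape
    groups = w.reshape(out_f, in_f // group_size, group_size)
    w_min = groups.min(axis=-1, keepdims=True)
    w_max = groups.max(axis=-1, keepdims=True)
    q_max = 2 ** n_bits - 1
    # One step size and one offset (w_min) per group.
    step = np.maximum(w_max - w_min, 1e-8) / q_max
    q = np.clip(np.round((groups - w_min) / step), 0, q_max)
    return (q * step + w_min).reshape(out_f, in_f)


def awq_style_quantize(w, x, alpha=0.5, n_bits=4, group_size=8):
    """Illustrative activation-aware scaling around weight-only quantization.

    x holds calibration activations with shape (n_samples, in_features).
    alpha is fixed here (an assumption for this sketch); in practice it
    would be tuned on calibration data.
    """
    # Average activation magnitude per input channel.
    act_mag = np.abs(x).mean(axis=0)
    s = np.maximum(act_mag, 1e-8) ** alpha
    # Scale weight columns up, quantize, then undo the scale.
    # Dividing the weights by s is equivalent to dividing the
    # activations by s, since x @ (Wq / s).T == (x / s) @ Wq.T.
    return quantize_dequantize(w * s, n_bits, group_size) / s
```

With `alpha=0` every scale equals 1 and the sketch reduces to plain round-to-nearest quantization, which makes a convenient baseline when comparing output error `x @ w.T` against `x @ w_q.T` on calibration data.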
