A Bhat edited this page Apr 14, 2024 · 1 revision

About

  • Activation-aware Weight Quantization (AWQ)
  • A hardware-friendly approach to low-bit, weight-only quantization of LLMs
