Add quantized PyTorch models in model builder (#600)
### Description

This PR adds support for building the final optimized and quantized ONNX models from already-quantized PyTorch models.

### Motivation and Context

The supported quantization methods for already-quantized PyTorch models are [GPTQ](https://github.com/AutoGPTQ/AutoGPTQ) and [AWQ](https://github.com/casper-hansen/AutoAWQ). Currently, only INT4 precision is supported.
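
For readers unfamiliar with what INT4 precision implies for weight storage, here is a minimal, generic sketch of group-wise 4-bit quantization with two codes packed per byte. It is an illustration only: the function names (`quantize_group`, `pack_int4`, `unpack_int4`, `dequantize_group`) are hypothetical, and the layout shown is an assumption, not the packing format used by GPTQ, AWQ, or the model builder.

```python
from typing import List, Tuple


def quantize_group(weights: List[float]) -> Tuple[List[int], float, float]:
    """Quantize one group of weights to 4-bit codes in [0, 15].

    Returns (codes, scale, minimum). This min/scale scheme is an
    illustrative assumption, not the GPTQ or AWQ algorithm.
    """
    lo, hi = min(weights), max(weights)
    # 4 bits give 16 levels, so the range is split into 15 steps.
    scale = (hi - lo) / 15 if hi > lo else 1.0
    codes = [max(0, min(15, round((w - lo) / scale))) for w in weights]
    return codes, scale, lo


def dequantize_group(codes: List[int], scale: float, lo: float) -> List[float]:
    """Map 4-bit codes back to approximate floating-point weights."""
    return [lo + c * scale for c in codes]


def pack_int4(codes: List[int]) -> bytes:
    """Pack pairs of 4-bit codes into bytes (first code in the low nibble)."""
    if len(codes) % 2 != 0:
        raise ValueError("expected an even number of 4-bit codes")
    return bytes(codes[i] | (codes[i + 1] << 4) for i in range(0, len(codes), 2))


def unpack_int4(packed: bytes) -> List[int]:
    """Inverse of pack_int4: split each byte into its low and high nibble."""
    codes: List[int] = []
    for byte in packed:
        codes.append(byte & 0x0F)
        codes.append(byte >> 4)
    return codes


if __name__ == "__main__":
    group = [-1.0, -0.5, 0.0, 0.5, 1.0, 0.25, -0.75, 0.8]
    codes, scale, lo = quantize_group(group)
    packed = pack_int4(codes)
    restored = dequantize_group(unpack_int4(packed), scale, lo)
    print(len(group), "weights stored in", len(packed), "bytes")
    print("max abs error:", max(abs(a - b) for a, b in zip(group, restored)))
```

Each group keeps its own scale and minimum, so the rounding error per weight is bounded by half of that group's step size.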