Trainable Quantizers Infrastructure

The Trainable Quantizers Infrastructure is a module that contains quantization abstractions and quantizers for hardware-oriented model optimization tools such as the Model Compression Toolkit (MCT).

It provides the required abstraction for trainable quantization methods such as quantization-aware training.

It builds on the Inferable Quantizers Infrastructure provided by the MCT Quantizers package, which provides the abstraction required for emulating inference-time quantization.

High level description

For each layer, we use a "Quantization Wrapper" to wrap the layer's weight quantizers, and an "Activation Quantization Holder" to hold the activation quantizers. Both components are provided by the MCT Quantizers package. The quantizers, and all the quantization information for each layer, are selected by initializing the weights_quantizer and activation_quantizer API.

Note that the quantization wrapper, the holder, and the quantizers are implemented separately for each framework.
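The division of labor between the wrapper and the holder can be illustrated with a small framework-free sketch. Every name below (`ScaleLayer`, `QuantizationWrapper`, `ActivationQuantizationHolder`, `make_uniform_quantizer`) is a hypothetical stand-in written for this illustration. It is not the MCT Quantizers API, and plain Python lists stand in for framework tensors.

```python
from typing import Callable, Dict, List

Tensor = List[float]
Quantizer = Callable[[Tensor], Tensor]


def make_uniform_quantizer(num_bits: int, threshold: float) -> Quantizer:
    """Return a symmetric uniform quantize-dequantize function (illustrative only)."""
    levels = 2 ** (num_bits - 1)
    scale = threshold / levels

    def quantize(values: Tensor) -> Tensor:
        # Round to the integer grid, clip to the signed range, map back to floats.
        return [max(-levels, min(levels - 1, round(v / scale))) * scale for v in values]

    return quantize


class ScaleLayer:
    """Toy layer: multiplies each input element by a weight."""

    def __init__(self, weights: Tensor):
        self.weights = weights

    def __call__(self, inputs: Tensor) -> Tensor:
        return [w * x for w, x in zip(self.weights, inputs)]


class QuantizationWrapper:
    """Hypothetical wrapper: quantizes the named weights, then runs the layer."""

    def __init__(self, layer, weights_quantizers: Dict[str, Quantizer]):
        self.layer = layer
        self.weights_quantizers = weights_quantizers

    def __call__(self, inputs: Tensor) -> Tensor:
        originals = {}
        for name, quantizer in self.weights_quantizers.items():
            originals[name] = getattr(self.layer, name)
            setattr(self.layer, name, quantizer(originals[name]))
        try:
            return self.layer(inputs)
        finally:
            # Restore the float weights so they remain available for training.
            for name, value in originals.items():
                setattr(self.layer, name, value)


class ActivationQuantizationHolder:
    """Hypothetical holder: applies one activation quantizer to a layer output."""

    def __init__(self, activation_quantizer: Quantizer):
        self.activation_quantizer = activation_quantizer

    def __call__(self, activations: Tensor) -> Tensor:
        return self.activation_quantizer(activations)


layer = ScaleLayer([0.3, -0.8, 0.6])
wrapped = QuantizationWrapper(layer, {"weights": make_uniform_quantizer(2, 1.0)})
holder = ActivationQuantizationHolder(make_uniform_quantizer(2, 1.0))
result = holder(wrapped([1.0, 1.0, 2.0]))
```

In this sketch the wrapper owns the weight quantizers, keyed by weight attribute name, while the holder is a separate object placed after the layer. This mirrors the split described above, where weight and activation quantization are configured independently.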

Quantizers

The quantizers in this module are built upon the "Inferable Quantizer" abstraction (from MCT Quantizers), and define the "Trainable Quantizer" framework, which contains learnable quantization parameters that can be optimized during training.
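As a rough illustration of how a trainable quantizer can extend an inferable one, the sketch below adds a learnable threshold to a fixed symmetric quantizer. The class names, the `train_step` and `to_inferable` methods, and the use of a straight-through estimator for the rounding step are all assumptions made for this example. They are not taken from MCT or MCT Quantizers, and the sketch uses plain Python rather than TensorFlow or PyTorch.

```python
from typing import List


class InferableSymmetricQuantizer:
    """Hypothetical fixed-parameter quantizer emulating inference-time quantization."""

    def __init__(self, num_bits: int, threshold: float):
        self.num_bits = num_bits
        self.threshold = threshold

    def __call__(self, values: List[float]) -> List[float]:
        levels = 2 ** (self.num_bits - 1)
        scale = self.threshold / levels
        return [max(-levels, min(levels - 1, round(x / scale))) * scale for x in values]


class TrainableSymmetricQuantizer(InferableSymmetricQuantizer):
    """Hypothetical trainable variant: the threshold is a learnable parameter."""

    def threshold_gradient(self, values: List[float]) -> float:
        """Gradient of the mean squared quantization error w.r.t. the threshold.

        Rounding is treated as the identity in the backward pass
        (straight-through estimator).
        """
        levels = 2 ** (self.num_bits - 1)
        scale = self.threshold / levels
        grad = 0.0
        for x in values:
            q_unclipped = round(x / scale)
            q = max(-levels, min(levels - 1, q_unclipped))
            if q == q_unclipped:
                dxq_dt = (q - x / scale) / levels  # value inside the range
            else:
                dxq_dt = q / levels                # value was clipped
            grad += 2.0 * (q * scale - x) * dxq_dt
        return grad / len(values)

    def train_step(self, values: List[float], lr: float) -> None:
        # One plain gradient-descent update of the learnable threshold.
        self.threshold -= lr * self.threshold_gradient(values)

    def to_inferable(self) -> InferableSymmetricQuantizer:
        # Freeze the learned threshold into a fixed-parameter quantizer.
        return InferableSymmetricQuantizer(self.num_bits, self.threshold)


def mse(a: List[float], b: List[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


weights = [0.1, -0.4, 0.9, 2.0]
quantizer = TrainableSymmetricQuantizer(num_bits=8, threshold=1.0)
loss_before = mse(weights, quantizer(weights))
for _ in range(20):
    quantizer.train_step(weights, lr=0.1)
loss_after = mse(weights, quantizer(weights))
frozen = quantizer.to_inferable()
```

Here the initial threshold of 1.0 clips the value 2.0, so the updates push the threshold upward and the quantization error shrinks. After training, `to_inferable` yields a quantizer with the learned threshold fixed, which is the kind of object an inference-time emulation would use.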

Details and Examples

More details and "how to" examples for TensorFlow can be found in:

Trainable quantizers for TensorFlow

And for PyTorch:

Trainable quantizers for PyTorch
