
Commit

fix comments
elad-c committed Mar 21, 2024
1 parent 85b0689 commit f30f846
Showing 3 changed files with 20 additions and 9 deletions.
12 changes: 9 additions & 3 deletions FAQ.md
@@ -1,6 +1,12 @@
# FAQ

## 1. Quantized model size is the same as the original model size
**Table of Contents:**
1. [Why does the size of the quantized model remain the same as the original model size?](#1-why-does-the-size-of-the-quantized-model-remain-the-same-as-the-original-model-size)
2. [Why does loading a quantized exported model from a file fail?](#2-why-does-loading-a-quantized-exported-model-from-a-file-fail)
3. [Why am I getting a torch.fx error?](#3-why-am-i-getting-a-torchfx-error)


## 1. Why does the size of the quantized model remain the same as the original model size?

MCT performs a process known as *fake quantization*, wherein the model's weights and activations are still represented in a floating-point
format but are quantized to represent a maximum of 2^N unique values (for N-bit cases).
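As an illustration of why the file size does not shrink, here is a minimal NumPy sketch of a uniform symmetric quantize-dequantize round trip (this is illustrative only, not MCT's actual quantizer):

```python
import numpy as np

def fake_quantize(x, n_bits=8):
    """Quantize-dequantize: the output is still float32 (same storage size),
    but it takes at most 2**n_bits distinct values."""
    # Hypothetical uniform symmetric quantizer, for illustration only
    threshold = np.max(np.abs(x))
    levels = 2 ** (n_bits - 1)
    scale = threshold / levels
    q = np.clip(np.round(x / scale), -levels, levels - 1)
    return (q * scale).astype(np.float32)

x = np.random.randn(1000).astype(np.float32)
xq = fake_quantize(x, n_bits=4)
print(xq.dtype)                      # float32 -- same dtype and size as the input
print(len(np.unique(xq)) <= 2 ** 4)  # True -- at most 16 distinct values
```

The tensor still occupies 32 bits per element on disk, which is why the saved model is the same size as the float model even though its values are restricted to a small discrete grid.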
@@ -12,7 +18,7 @@
Note that the IMX500 converter accepts the "fake quantization" model and support
For more information and an implementation example, check out the [INT8 TFLite export tutorial](https://github.com/sony/model_optimization/blob/main/tutorials/notebooks/keras/export/example_keras_export.ipynb)


## 2. Loading exported models
## 2. Why does loading a quantized exported model from a file fail?

The models MCT exports contain QuantizationWrappers and Quantizer objects that define and quantize the model at inference.
These objects are custom layers and layer wrappers created by MCT (defined in an external repository: [MCTQ](https://github.com/sony/mct_quantizers)),
@@ -35,7 +41,7 @@
PyTorch models can be exported as onnx models. An example of loading a saved onnx
Inference on the target platform (e.g. the IMX500) is not affected by this latency.


## 3. Error parsing model with torch.fx
## 3. Why am I getting a torch.fx error?

When quantizing a PyTorch model, MCT's initial step involves converting the model into a graph representation using `torch.fx`.
However, `torch.fx` comes with certain common limitations, with the primary one being its requirement for the computational graph to remain static.
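A minimal sketch of the kind of model that trips up `torch.fx` is one with data-dependent control flow (the class name below is illustrative, and a reasonably recent PyTorch is assumed):

```python
import torch
import torch.fx

class DynamicNet(torch.nn.Module):  # hypothetical module, for illustration only
    def forward(self, x):
        # Data-dependent control flow: which branch runs depends on the values
        # in x, so the computational graph is not static
        if x.sum() > 0:
            return x * 2
        return x - 1

try:
    torch.fx.symbolic_trace(DynamicNet())
    traced = True
except Exception:
    # torch.fx raises a TraceError here: a traced Proxy cannot be used
    # as the condition of an `if`
    traced = False
print(traced)  # False
```

Rewriting such branches so the graph is static (e.g. moving the condition out of `forward`, or expressing it with tensor ops like `torch.where`) typically makes the model traceable.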
9 changes: 3 additions & 6 deletions README.md
Expand Up @@ -17,8 +17,7 @@ MCT is developed by researchers and engineers working at Sony Semiconductor Isra
- [Getting Started](#getting-started)
- [Supported features](#supported-features)
- [Results](#results)
- [Quantization Troubleshooting](#quantization-trouble-shooting)
- [FAQ](#faq)
- [Troubleshooting](#trouble-shooting)
- [Contributions](#contributions)
- [License](#license)

@@ -162,13 +161,11 @@
Results for applying pruning to reduce the parameters of the following models by
| DenseNet121 [3] | 74.44 | 71.71 |


## Quantization Trouble Shooting
## Trouble Shooting

If the accuracy of the quantized model is too large for your application, check out the [Quantization Troubleshooting](https://github.com/sony/model_optimization/tree/main/quantization_troubleshooting.md)
If the accuracy degradation of the quantized model is too large for your application, check out the [Quantization Troubleshooting](https://github.com/sony/model_optimization/tree/main/quantization_troubleshooting.md)
for common pitfalls and some tools to improve quantization accuracy.

## FAQ

Check out the [FAQ](https://github.com/sony/model_optimization/tree/main/FAQ.md) for common issues.


8 changes: 8 additions & 0 deletions quantization_troubleshooting.md
@@ -15,6 +15,14 @@
Some steps may be applicable to your model, while others may not.
Throughout this document we refer the user to notebooks using the Keras framework. There are similar notebooks for PyTorch.
All notebooks are available [here](https://github.com/sony/model_optimization/tree/main/tutorials/notebooks).

**Table of Contents:**
* [Representative Dataset](#representative-dataset)
* [Quantization Process](#quantization-process)
* [Model Structure Quantization Issues](#model-structure-quantization-issues)
* [Advanced Quantization Methods](#advanced-quantization-methods)
* [Debugging Tools](#debugging-tools)


___
## Representative Dataset
The representative dataset is used by the MCT to derive the threshold values of activation tensors in the model.
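For context, MCT expects the representative dataset as a callable that yields lists of input batches. A minimal random-data sketch follows; the iteration count and input shape are placeholders (in practice you would yield real, preprocessed samples so the derived activation thresholds match your data distribution):

```python
import numpy as np

NUM_ITERATIONS = 20             # placeholder: number of calibration batches
BATCH_SHAPE = (1, 224, 224, 3)  # placeholder: must match your model's input shape

def representative_dataset_gen():
    """Yields lists of input batches (one array per model input); MCT iterates
    over these to collect activation statistics for threshold derivation."""
    for _ in range(NUM_ITERATIONS):
        # Random data keeps this sketch self-contained; use real preprocessed
        # samples from your training set in practice.
        yield [np.random.rand(*BATCH_SHAPE).astype(np.float32)]

batches = list(representative_dataset_gen())
print(len(batches))         # 20
print(batches[0][0].shape)  # (1, 224, 224, 3)
```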
