[OpenVINO backend] OpenVINOQuantizer #15

Merged

daniil-lyakhov

Summary

  • OpenVINOQuantizer is introduced
  • aot_openvino_compiler.py is updated with the quantization pipeline (timm and torchvision backends only)
  • openvino_executor_runner.cpp is updated to take several inputs and produce several outputs in a row. This makes it possible to validate models converted to the edge (inspired by the Qualcomm example). The code style is also refactored with clang-format.
  • aot_openvino_compiler.py is updated with a validation pipeline (timm and torchvision backends only)
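For readers unfamiliar with what such a validation pipeline computes, here is a minimal sketch of a top-1 accuracy check over a batch of model outputs. It is an illustration only, not the code added in this PR: the `top1_accuracy` helper and the toy data are hypothetical, and it assumes each output is a flat list of per-class scores.

```python
from typing import Sequence


def top1_accuracy(outputs: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the fraction of samples whose highest-scoring class equals the label."""
    if len(outputs) != len(labels):
        raise ValueError("outputs and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    correct = 0
    for scores, label in zip(outputs, labels):
        # Index of the largest score is the predicted class.
        predicted = max(range(len(scores)), key=scores.__getitem__)
        correct += int(predicted == label)
    return correct / len(labels)


# Toy data (hypothetical): three samples, three classes.
toy_outputs = [[0.1, 0.7, 0.2], [0.9, 0.05, 0.05], [0.2, 0.3, 0.5]]
toy_labels = [1, 0, 1]
toy_accuracy = top1_accuracy(toy_outputs, toy_labels)
print(f"top-1 accuracy: {toy_accuracy:.3f}")
```

The same function can be applied to the FP32 and INT8 outputs separately to get the two accuracy figures that a quantization report compares.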

Test plan

| Model | FP32 acc | INT8 acc | FP32 avg latency | INT8 avg latency | Batch size |
|---|---|---|---|---|---|
| Resnet50d | 0.80534 | 0.8038 | 538 | 178.167 | 125 |
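The Resnet50d figures above can be reduced to an accuracy drop and a latency speedup. The snippet below is just that arithmetic, with the values copied from the test plan; the latency unit is not stated there, so the ratio is the only unit-independent number.

```python
# Values copied from the test plan for Resnet50d.
fp32_acc, int8_acc = 0.80534, 0.8038
fp32_latency, int8_latency = 538.0, 178.167  # unit not stated in the test plan

accuracy_drop = fp32_acc - int8_acc          # absolute drop in accuracy
speedup = fp32_latency / int8_latency        # how many times faster INT8 runs

print(f"accuracy drop: {accuracy_drop:.5f}")
print(f"speedup: {speedup:.2f}x")
```

This works out to an absolute accuracy drop of about 0.0015 and roughly a 3x reduction in average latency.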

preset=preset, model_type=model_type, **kwargs
)

def set_ignored_scope(
Owner

Is this function exposed to the user so that they can tune the quantization process? If so, it would be good to mention it in the documentation.

Author

Agreed! But I'm not sure which documentation to update: we don't have a document that describes the OpenVINOQuantizer yet. Should we perhaps create one?
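If such a document is written, a short conceptual example of what an ignored scope does might help. The sketch below is not the OpenVINOQuantizer implementation, and the parameters of `set_ignored_scope` are not shown in this excerpt. It assumes that an ignored scope can be given as exact node names plus regular-expression patterns; `select_quantizable` and the node names are hypothetical.

```python
import re
from typing import Iterable, List, Sequence


def select_quantizable(
    node_names: Iterable[str],
    ignored_names: Sequence[str] = (),
    ignored_patterns: Sequence[str] = (),
) -> List[str]:
    """Return the nodes that remain eligible for quantization.

    Hypothetical helper: a node is skipped if its name is listed exactly
    or if it fully matches one of the regular-expression patterns.
    """
    exact = set(ignored_names)
    compiled = [re.compile(pattern) for pattern in ignored_patterns]
    kept = []
    for name in node_names:
        if name in exact:
            continue
        if any(rx.fullmatch(name) for rx in compiled):
            continue
        kept.append(name)
    return kept


# Hypothetical graph: keep the classifier head and all of layer1 in floating point.
nodes = ["conv1", "layer1_conv", "layer1_relu", "fc"]
kept = select_quantizable(nodes, ignored_names=["fc"], ignored_patterns=[r"layer1_.*"])
print(kept)
```

Excluding accuracy-sensitive nodes this way trades some of the latency gain for a smaller accuracy drop, which is the kind of tuning the question above refers to.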
