Post training quantization support in TRTorch #44

Merged
merged 21 commits into pytorch:master from the ptq branch
Apr 25, 2020

Conversation

narendasan (Collaborator) commented on Apr 23, 2020

This PR adds post training quantization (PTQ) support to TRTorch using the TensorRT INT8 calibrator. It also adds a new example application that demonstrates PTQ by quantizing a VGG16 network trained on CIFAR10 (training recipe provided).

API Changes

  • make_int8_calibrator: Generates a class that uses a provided dataloader and a TensorRT calibrator algorithm to run INT8 calibration for quantization (see the sketch after this list)
  • extra_info.max_batch_size: Sets builder.max_batch_size
  • extra_info.strict_type -> extra_info.strict_types
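
For illustration, here is a rough C++ sketch of how these pieces could fit together, assuming a libtorch-style dataloader (the example app ships its own CIFAR10 dataset) and TensorRT's IInt8EntropyCalibrator2. Apart from make_int8_calibrator, extra_info.max_batch_size, and extra_info.strict_types, the names used here (include paths, the datasets::CIFAR10 class, and the op_precision / ptq_calibrator fields) are assumptions rather than something confirmed by this PR:

    #include "torch/torch.h"
    #include "NvInfer.h"
    #include "trtorch/trtorch.h"   // assumed include path
    #include "trtorch/ptq.h"       // assumed include path
    #include "cifar10.h"           // hypothetical header for the example app's CIFAR10 dataset

    torch::jit::Module compile_vgg16_int8(torch::jit::Module mod) {
      // Build a calibration dataloader over the CIFAR10 test set
      auto calib_dataset = datasets::CIFAR10("/path/to/cifar10", datasets::CIFAR10::Mode::kTest)
                               .map(torch::data::transforms::Stack<>());
      auto calib_dataloader = torch::data::make_data_loader(
          std::move(calib_dataset),
          torch::data::DataLoaderOptions().batch_size(32).workers(2));

      // make_int8_calibrator wraps the dataloader in a TensorRT calibrator;
      // the template parameter selects the calibration algorithm, and the
      // cache file lets subsequent builds skip recalibration
      auto calibrator = trtorch::ptq::make_int8_calibrator<nvinfer1::IInt8EntropyCalibrator2>(
          std::move(calib_dataloader), "/tmp/vgg16_calibration.cache", /*use_cache=*/true);

      trtorch::ExtraInfo extra_info({{32, 3, 32, 32}});
      extra_info.op_precision = nvinfer1::DataType::kINT8;  // assumed field name
      extra_info.ptq_calibrator = calibrator;               // assumed field name
      extra_info.max_batch_size = 32;                       // sets builder.max_batch_size
      extra_info.strict_types = false;                      // renamed from strict_type

      return trtorch::CompileGraph(mod, extra_info);
    }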

Contract Changes

  • The user is now responsible for ensuring that input tensors are the correct type for TRT and that they are on the GPU, as sketched below
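
As a rough sketch of what this means for callers (assuming trt_mod is the compiled module returned by trtorch::CompileGraph), the caller now moves inputs to the GPU and matches the engine's expected input type before calling forward, e.g. for an FP16 engine:

    // Sketch only: inputs are no longer converted or moved on the user's behalf
    auto in = torch::randn({32, 3, 32, 32});    // host-side FP32 tensor
    in = in.to(torch::kCUDA).to(torch::kHalf);  // move to the GPU and match the engine's precision
    auto out = trt_mod.forward({in});           // run the TRT-compiled module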

Bug Fixes

  • Discovered an FP16 accuracy issue; added a new test case and will push a solution to this branch

Will close #35, should fix #41

narendasan and others added 11 commits on April 2, 2020 at 19:03
CIFAR10 for ptq example
Gets about 90-91% accuracy, initial LR 0.01, dropout 0.15

Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
- A couple of assorted fixes in the conversion implementation
- Set up space for phase-specific settings inside the compiler
- PTQ Calibrator implementation moved to the public API, which means Python will need its own, but it probably did anyway
- PTQ now works with a dataloader, and all the overrides for the calibration algorithm work
- CIFAR10 Dataloader implementation
- Application still has bugs in reporting accuracy and reading from the calibration cache

Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
responsibility of the user to transfer data to GPU and ensure types are
correct.

Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
test set for calibration

Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
narendasan added the WIP (work is in progress, pull request should not be merged yet) label on Apr 23, 2020
narendasan added this to the v0.1.0 milestone on Apr 23, 2020
- Now creates output tensors of the correct type to accept data
- There may still be a data race in the creation of the dataloader iterator
- Quantization and dynamic shape currently don't play well together; a subsequent release of TRT may address this

Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
address data race issue

Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
narendasan removed the WIP (work is in progress, pull request should not be merged yet) label on Apr 24, 2020
narendasan changed the title from "[WIP] Post training quantization support in TRTorch" to "Post training quantization support in TRTorch" on Apr 24, 2020
@@ -235,7 +235,7 @@ bool VerifyConverterSupportForBlock(const torch::jit::Block* b) {
     if (!OpSupported(n)) {
       auto schema = n->maybeSchema();
       TRTORCH_CHECK(schema, "Unable to get schema for Node " << util::node_info(n) \
-                    << " (conversion.AddLayer)");
+                    << " (conversion.VerifyCoverterSupportForBloxk");
narendasan (Collaborator, Author) commented:


Typo

Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
HACK: WYA tracing without being in eval mode and ignoring the warning,
will follow up with the PyTorch Team and test after script mode support
lands

Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
narendasan merged commit 36d27da into pytorch:master on Apr 25, 2020
narendasan deleted the ptq branch on April 25, 2020 03:01
Development

Successfully merging this pull request may close these issues.

FP16 Execution produces non accurate results
Support TensorRT Post Training Quantization