Post training quantization support in TRTorch #44

Merged
merged 21 commits into pytorch:master from the ptq branch
Apr 25, 2020

Conversation

narendasan (Collaborator) commented on Apr 23, 2020

This PR adds post training quantization (PTQ) support to TRTorch using the TensorRT INT8 calibrator. It also adds a new example application that demonstrates PTQ by quantizing a VGG16 network trained on CIFAR10 (training recipe provided).

API Changes

  • make_int8_calibrator: Generates a class that uses a provided dataloader and a TensorRT calibrator algorithm to run INT8 calibration for quantization (see the sketch after this list)
  • extra_info.max_batch_size: Sets builder.max_batch_size
  • extra_info.strict_type -> extra_info.strict_types
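
For illustration, here is a rough C++ sketch of how these pieces could fit together, assuming a libtorch-style dataloader (the example app ships its own CIFAR10 dataset) and TensorRT's IInt8EntropyCalibrator2. Apart from make_int8_calibrator, extra_info.max_batch_size, and extra_info.strict_types, the names used here (include paths, the datasets::CIFAR10 class, and the op_precision / ptq_calibrator fields) are assumptions rather than something confirmed by this PR:

    #include "torch/torch.h"
    #include "NvInfer.h"
    #include "trtorch/trtorch.h"   // assumed include path
    #include "trtorch/ptq.h"       // assumed include path
    #include "cifar10.h"           // hypothetical header for the example app's CIFAR10 dataset

    torch::jit::Module compile_vgg16_int8(torch::jit::Module mod) {
      // Build a calibration dataloader over the CIFAR10 test set
      auto calib_dataset = datasets::CIFAR10("/path/to/cifar10", datasets::CIFAR10::Mode::kTest)
                               .map(torch::data::transforms::Stack<>());
      auto calib_dataloader = torch::data::make_data_loader(
          std::move(calib_dataset),
          torch::data::DataLoaderOptions().batch_size(32).workers(2));

      // make_int8_calibrator wraps the dataloader in a TensorRT calibrator;
      // the template parameter selects the calibration algorithm, and the
      // cache file lets subsequent builds skip recalibration
      auto calibrator = trtorch::ptq::make_int8_calibrator<nvinfer1::IInt8EntropyCalibrator2>(
          std::move(calib_dataloader), "/tmp/vgg16_calibration.cache", /*use_cache=*/true);

      trtorch::ExtraInfo extra_info({{32, 3, 32, 32}});
      extra_info.op_precision = nvinfer1::DataType::kINT8;  // assumed field name
      extra_info.ptq_calibrator = calibrator;               // assumed field name
      extra_info.max_batch_size = 32;                       // sets builder.max_batch_size
      extra_info.strict_types = false;                      // renamed from strict_type

      return trtorch::CompileGraph(mod, extra_info);
    }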

Contract Changes

  • The user is now responsible for ensuring that input tensors are the correct type for TRT and that they are on the GPU, as sketched below
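
As a rough sketch of what this means for callers (assuming trt_mod is the compiled module returned by trtorch::CompileGraph), the caller now moves inputs to the GPU and matches the engine's expected input type before calling forward, e.g. for an FP16 engine:

    // Sketch only: inputs are no longer converted or moved on the user's behalf
    auto in = torch::randn({32, 3, 32, 32});    // host-side FP32 tensor
    in = in.to(torch::kCUDA).to(torch::kHalf);  // move to the GPU and match the engine's precision
    auto out = trt_mod.forward({in});           // run the TRT-compiled module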

Bug Fixes

  • Discovered an FP16 accuracy issue; added a new test case and will push a solution to this branch

Will close #35, should fix #41

narendasan and others added 11 commits on April 2, 2020 at 19:03
CIFAR10 for ptq example
Gets about 90-91% accuracy, initial LR 0.01, dropout 0.15

Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
- A couple of assorted fixes in the conversion implementation
- Set up space for phase-specific settings inside the compiler
- PTQ Calibrator implementation moved to the public API, which means Python will need its own, but it probably did anyway
- PTQ now works with a dataloader, and all the overrides for the calibration algorithm work
- CIFAR10 Dataloader implementation
- Application still has bugs in reporting accuracy and reading from the calibration cache

Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
responsibility of the user to transfer data to GPU and ensure types are
correct.

Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
test set for calibration

Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
narendasan added the WIP (work is in progress, pull request should not be merged yet) label on Apr 23, 2020
narendasan added this to the v0.1.0 milestone on Apr 23, 2020
- Now creates output tensors of the correct type to accept data
- There may still be a data race in the creation of the dataloader iterator
- Quantization and dynamic shape currently don't play well together; a subsequent release of TRT may address this

Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
address data race issue

Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
narendasan removed the WIP (work is in progress, pull request should not be merged yet) label on Apr 24, 2020
narendasan changed the title from "[WIP] Post training quantization support in TRTorch" to "Post training quantization support in TRTorch" on Apr 24, 2020
@@ -235,7 +235,7 @@ bool VerifyConverterSupportForBlock(const torch::jit::Block* b) {
     if (!OpSupported(n)) {
       auto schema = n->maybeSchema();
       TRTORCH_CHECK(schema, "Unable to get schema for Node " << util::node_info(n) \
-                    << " (conversion.AddLayer)");
+                    << " (conversion.VerifyCoverterSupportForBloxk");
narendasan (Collaborator, Author) commented:


Typo

Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
HACK: WYA tracing without being in eval mode and ignoring the warning,
will follow up with the PyTorch Team and test after script mode support
lands

Signed-off-by: Naren Dasan <[email protected]>
Signed-off-by: Naren Dasan <[email protected]>
narendasan merged commit 36d27da into pytorch:master on Apr 25, 2020
narendasan deleted the ptq branch on April 25, 2020 03:01
Development

Successfully merging this pull request may close these issues.

FP16 Execution produces non accurate results
Support TensorRT Post Training Quantization