Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[MKLDNN] Enable signed int8 support for convolution. #13697

Merged
merged 45 commits into from
Feb 10, 2019

Conversation

ZhennanQin
Copy link
Contributor

@ZhennanQin ZhennanQin commented Dec 20, 2018

Description

Major changes:

  • Integrate mkldnn signed int8 convolution.
  • Introduce new quantization flow out_type auto, which automatically determines the out_type according to calibration information. If negative value is detected from calibration min, out_type will be int8, otherwise, out_type will be uint8.
  • Introduce new operator quantize_v2, which only has 1 input data and will self-calculate min/max in runtime or use calibration information set by attrs. Then calibration informations are all recorded in model file instead of part in params file. User can easily modify them without touching params file. This operator also used to implement out_type auto.
  • Add new parameter monitor_all for monitor module to allow monitoring both input & output of operators. This is used for collecting calibration information of in_data to support quantize first layer of graph.

@pengzhao-intel @TaoLv @zheng-da @reminisce @azai91

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

@ZhennanQin
Copy link
Contributor Author

ZhennanQin commented Dec 20, 2018

@marcoabreu Looks like CI has trouble to build this PR. Some builds failed to compile this line
https://github.com/apache/incubator-mxnet/blob/84131807d67bb6a256b78cd75bb12a274c22347b/include/mxnet/tensor_blob.h#L290
with comparison between signed and unsigned integer expressions. But this PR doesn't touch that part,and looking into the code, both sides are returning size_t, which should be all same. Another prove is, clang can build pass with -Wsign-compare, which indicated that the gcc build has some problem. Can you have a look? Thanks.

Update: I found the reason, the right side returns from mshadow::Shape::Size(void), which is index_t.
https://github.com/dmlc/mshadow/blob/696803bd7723ade8230af878460d96c68a550fbc/mshadow/tensor.h#L126
So comparing size_t with index_t causing build failed. Question: why our CI didn't report this previously, and even not from other PRs sent at same time? Does ccache hide this issue?

@pengzhao-intel
Copy link
Contributor

@KellenSunderland could you help take a look for the CI issue?

@pengzhao-intel
Copy link
Contributor

FYI @yoel-shapiro

@TaoLv TaoLv added Operator MKLDNN Quantization Issues/Feature Requests related to Quantization labels Dec 21, 2018
@yoel-shapiro
Copy link

yoel-shapiro commented Dec 23, 2018

Hi, I'm still getting an issue when using "int8"

setup:
I rebuilt MKLDNN master
cd mxnet-incubator, git pull origin pull/13697/head
verified that the script works with "uint8"

I am able to run calibration but when I use the quantized output model I get the following error, specifically when trying to read the model's predictions:

Traceback (most recent call last):
File "inference_dsalite.py", line 201, in
prob = pred[k_head][k_im].asnumpy()
File "/home/local/ANT/yoelsh/work/mxcode/incubator-mxnet/python/mxnet/ndarray/ndarray.py", line 1987, in asnumpy
ctypes.c_size_t(data.size)))
File "/home/local/ANT/yoelsh/work/mxcode/incubator-mxnet/python/mxnet/base.py", line 252, in check_call
raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [14:12:29] src/operator/nn/mkldnn/mkldnn_base.cc:326: Unknown MKLDNN format for 4 dimensions: 47

@ZhennanQin
Copy link
Contributor Author

@yoel-shapiro Thanks for trying MKLDNN int8 solution. Did you apply this PR? This PR removed line 346 check, and did lots of changes to support int8. Please try int8 with this PR.

@ZhennanQin
Copy link
Contributor Author

@yoel-shapiro PR is updated to address the issue you mentioned. Please try again. Thanks for reporting this.

@yoel-shapiro
Copy link

fixed!
thank you!

@pengzhao-intel
Copy link
Contributor

@marcoabreu Looks like CI has trouble to build this PR. Some builds failed to compile this line
......
So comparing size_t with index_t causing build failed. Question: why our CI didn't report this previously, and even not from other PRs sent at same time? Does ccache hide this issue?

@marcoabreu Sorry to bother you. Currently, the PR is blocked by CI failure as @ZhennanQin mentioned above.
Would you help take a look for the CI issue after the holiday? Thanks in advance.

example/quantization/imagenet_gen_qsym_mkldnn.py Outdated Show resolved Hide resolved
include/mxnet/c_api.h Outdated Show resolved Hide resolved
include/mxnet/c_api.h Outdated Show resolved Hide resolved
include/mxnet/executor.h Outdated Show resolved Hide resolved
python/mxnet/contrib/quantization.py Outdated Show resolved Hide resolved
src/operator/quantization/mkldnn/mkldnn_quantize_v2-inl.h Outdated Show resolved Hide resolved
src/operator/quantization/quantize_v2-inl.h Outdated Show resolved Hide resolved
src/operator/quantization/quantize_v2-inl.h Show resolved Hide resolved
@reminisce
Copy link
Contributor

@ZhennanQin This PR has touched the code shared by GPU and CPU quantization. It's better to verify that no bugs have been introduced to affect the GPU test accuracy. Thanks.

szha
szha previously requested changes Feb 3, 2019
Copy link
Member

@szha szha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The API change that @sergeykolychev pointed out needs to be addressed first. Requesting change to make it clear.

@pengzhao-intel
Copy link
Contributor

Thanks for the suggestions and comments @szha @reminisce @sergeykolychev
We are on the Chinese holiday now and will address all comments when coming back to the office.

@szha szha dismissed their stale review February 4, 2019 04:12

API concern is addressed. Thanks for the quick turnaround.

@ZhennanQin
Copy link
Contributor Author

@reminisce As you have concern about entropy change, I removed the parts that for uint8 fix, as it's not the major task of this PR. Other entropy change are re-applying the fix of entropy which removed by mistake. Please review again. Thanks.

@xinyu-intel
Copy link
Contributor

@reminisce
I tested resnet152 and inception-bn on Tesla V100 and resnet152 looks good. However, there is a quantization bug with inception-bn because #13297 enabled quantized_concat on CPU side. It seems that #14060 is fixing this bug. Below are accuracy numbers of these two models(after apply #14060 ).

MODE ResNet152 Inception-bn
FP32 77.18%/93.00% 72.38%/90.61%
online 75.46%/92.24% 72.08%/90.31%
5 batch naive 75.41%/92.10% 71.81%/90.29%

@reminisce
Copy link
Contributor

@xinyu-intel Thanks for the verification. I have approved PR #14060, but I cannot merge that PR because it hasn't passed CI test.

@reminisce
Copy link
Contributor

PR #14060 is merged. Please rebase to trigger CI again. Thanks.

@pengzhao-intel
Copy link
Contributor

@szha @reminisce @KellenSunderland @TaoLv all comments are resolved now.
Could you help take a look again and merge the PR in case no other new comment?

@TaoLv TaoLv merged commit 8b4a69a into apache:master Feb 10, 2019
@ZhennanQin ZhennanQin mentioned this pull request Feb 14, 2019
7 tasks
stephenrawls pushed a commit to stephenrawls/incubator-mxnet that referenced this pull request Feb 16, 2019
* Enable s8s8 support for MKLDNN convolution.

* Fix cpp build

* Fix build.

* Fix build

* Remove openmp min/max reduction for windows build

* Add mkldnn_OIhw4i16o4i_s8s8 support

* Add all s8s8 weight format

* Change ssd quantize script.

* Update

* Manually cast mshadow shape size to size_t

* Fix merge.

* Fix perl package.

* Retrigger CI

* Fix GPU test

* Fix GPU test

* Rerun CI

* Rerun CI

* Rerun CI

* Rerun CI

* Remove weight_channelwise_scale from params.

* Fix

* Keep API compatible.

* Rerun CI

* Rerun CI

* Rerun CI

* Rerun CI

* Address comments.

* fix.

* Address debug build.

* Add comment for next_impl

* Rerun ci

* Add new api MXExecutorSetMonitorCallbackEX

* Add default value for monitor_all for cpp header.

* Rerun CI

* fix

* script change for uint8.

* trigger ci

* trigger ci
drivanov pushed a commit to drivanov/incubator-mxnet that referenced this pull request Mar 4, 2019
* Enable s8s8 support for MKLDNN convolution.

* Fix cpp build

* Fix build.

* Fix build

* Remove openmp min/max reduction for windows build

* Add mkldnn_OIhw4i16o4i_s8s8 support

* Add all s8s8 weight format

* Change ssd quantize script.

* Update

* Manually cast mshadow shape size to size_t

* Fix merge.

* Fix perl package.

* Retrigger CI

* Fix GPU test

* Fix GPU test

* Rerun CI

* Rerun CI

* Rerun CI

* Rerun CI

* Remove weight_channelwise_scale from params.

* Fix

* Keep API compatible.

* Rerun CI

* Rerun CI

* Rerun CI

* Rerun CI

* Address comments.

* fix.

* Address debug build.

* Add comment for next_impl

* Rerun ci

* Add new api MXExecutorSetMonitorCallbackEX

* Add default value for monitor_all for cpp header.

* Rerun CI

* fix

* script change for uint8.

* trigger ci

* trigger ci
vdantu pushed a commit to vdantu/incubator-mxnet that referenced this pull request Mar 31, 2019
* Enable s8s8 support for MKLDNN convolution.

* Fix cpp build

* Fix build.

* Fix build

* Remove openmp min/max reduction for windows build

* Add mkldnn_OIhw4i16o4i_s8s8 support

* Add all s8s8 weight format

* Change ssd quantize script.

* Update

* Manually cast mshadow shape size to size_t

* Fix merge.

* Fix perl package.

* Retrigger CI

* Fix GPU test

* Fix GPU test

* Rerun CI

* Rerun CI

* Rerun CI

* Rerun CI

* Remove weight_channelwise_scale from params.

* Fix

* Keep API compatible.

* Rerun CI

* Rerun CI

* Rerun CI

* Rerun CI

* Address comments.

* fix.

* Address debug build.

* Add comment for next_impl

* Rerun ci

* Add new api MXExecutorSetMonitorCallbackEX

* Add default value for monitor_all for cpp header.

* Rerun CI

* fix

* script change for uint8.

* trigger ci

* trigger ci
haohuanw pushed a commit to haohuanw/incubator-mxnet that referenced this pull request Jun 23, 2019
* Enable s8s8 support for MKLDNN convolution.

* Fix cpp build

* Fix build.

* Fix build

* Remove openmp min/max reduction for windows build

* Add mkldnn_OIhw4i16o4i_s8s8 support

* Add all s8s8 weight format

* Change ssd quantize script.

* Update

* Manually cast mshadow shape size to size_t

* Fix merge.

* Fix perl package.

* Retrigger CI

* Fix GPU test

* Fix GPU test

* Rerun CI

* Rerun CI

* Rerun CI

* Rerun CI

* Remove weight_channelwise_scale from params.

* Fix

* Keep API compatible.

* Rerun CI

* Rerun CI

* Rerun CI

* Rerun CI

* Address comments.

* fix.

* Address debug build.

* Add comment for next_impl

* Rerun ci

* Add new api MXExecutorSetMonitorCallbackEX

* Add default value for monitor_all for cpp header.

* Rerun CI

* fix

* script change for uint8.

* trigger ci

* trigger ci
@ZhennanQin ZhennanQin deleted the s8_conv branch September 16, 2019 07:01
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
MKLDNN Operator Quantization Issues/Feature Requests related to Quantization
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants