Description:
This change implements Conv+Clip activation fusion for FusedConv and NCHWc convolutions. The Clip operation runs in the thread context that is producing the convolution output.
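The idea of running the activation in the producing thread can be sketched as follows. This is a hypothetical, simplified illustration in numpy (single channel, no padding, no strides), not the actual MLAS/ORT kernel; the function name `conv2d_clip` and its parameters are invented for this example.

```python
import numpy as np

def conv2d_clip(x, w, clip_min=0.0, clip_max=6.0):
    """Naive 2D convolution with the Clip activation applied to each
    output value as it is produced, instead of in a separate pass over
    the finished output tensor."""
    H, W = x.shape
    kh, kw = w.shape
    out = np.empty((H - kh + 1, W - kw + 1), dtype=x.dtype)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            acc = np.sum(x[i:i + kh, j:j + kw] * w)
            # Fused activation: clamp while the accumulator is still
            # hot in cache, so no extra traversal of the output is needed.
            out[i, j] = min(max(acc, clip_min), clip_max)
    return out
```

Because the clamp happens inside the per-output loop, a multithreaded convolution gets the activation parallelized for free across the same thread partitioning, which is the cache-efficiency argument made above.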
Motivation and Context
This change optimizes models exported from TF that use Relu6, which is converted to Clip in the ONNX model. As with the other convolution + activation fusions, running the activation in the convolution threads lets it be cheaply parallelized and is also more cache efficient.
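For reference, Relu6 is exactly Clip with bounds [0, 6], which is why the TF-sourced models end up with Clip nodes after conversion. A quick numpy check of the equivalence:

```python
import numpy as np

x = np.array([-2.0, 3.0, 8.0])
relu6 = np.minimum(np.maximum(x, 0.0), 6.0)  # Relu6(x) = min(max(x, 0), 6)
clip = np.clip(x, 0.0, 6.0)                  # Clip(x, min=0, max=6)
# Both yield [0., 3., 6.]
```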
For the mobilenet model from mlperf, model time drops from 5 ms to 4 ms, and ssd_mobilenet_v1_coco drops from 28 ms to 25 ms. Similar drops are seen when running just "-o 2" with the older NCHW convolution.