
implement conv+clip fusion #1412

Merged
merged 7 commits into master from tracysh/conv_clip_fusion on Jul 17, 2019
Conversation

tracysh
Contributor

@tracysh tracysh commented Jul 15, 2019

Description:
This change implements Conv+Clip activation fusion for FusedConv and NCHWc convolutions. The Clip operation runs in the thread context that is producing the convolution output.

Motivation and Context
This change optimizes models sourced from TensorFlow that use Relu6, which is converted to Clip when the model is exported to ONNX. As with the other convolution + activation fusions, running the activation in the convolution threads lets it be parallelized cheaply and is also more cache efficient.
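The Relu6/Clip equivalence the fusion relies on can be sketched as follows (hypothetical illustration, not onnxruntime code): TF's Relu6 is exactly ONNX Clip with min=0 and max=6.

```python
# Hypothetical sketch: Relu6(x) == Clip(x, min=0, max=6).
def relu6(x: float) -> float:
    return min(max(x, 0.0), 6.0)

def clip(x: float, lo: float, hi: float) -> float:
    return min(max(x, lo), hi)

for v in (-3.0, 0.5, 2.0, 7.5):
    assert relu6(v) == clip(v, 0.0, 6.0)
```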

For the mobilenet model from mlperf, model time drops from 5 ms to 4 ms, and ssd_mobilenet_v1_coco drops from 28 ms to 25 ms. Similar drops are seen when running with just "-o 2" using the older NCHW convolution.
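The fusion described above can be sketched conceptually (this is an illustrative 1-D toy, not the actual FusedConv/NCHWc kernels, which are C++): Clip is applied in the same loop that produces each convolution output, so the activation costs no extra pass over memory and inherits the convolution's parallelization.

```python
# Conceptual sketch only; in onnxruntime the fusion lives inside the
# FusedConv / NCHWc convolution kernels. The key idea: clip each output
# value at the point of production instead of in a separate Clip pass.
def conv1d_clip_fused(x, w, lo=0.0, hi=6.0):
    n, k = len(x), len(w)
    out = []
    for i in range(n - k + 1):
        acc = sum(x[i + j] * w[j] for j in range(k))
        out.append(min(max(acc, lo), hi))  # clip as the output is produced
    return out

x = [1.0, 2.0, 3.0, 4.0, 5.0]
w = [1.0, 1.0]
print(conv1d_clip_fused(x, w))  # [3.0, 5.0, 6.0, 6.0]; last two clipped at 6
```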

@tracysh tracysh requested a review from a team as a code owner July 15, 2019 23:58
@tracysh tracysh requested review from snnn, skottmckay and askhade July 16, 2019 02:24
Contributor

@skottmckay skottmckay left a comment


:shipit:

@tracysh tracysh merged commit 4383615 into master Jul 17, 2019
@tracysh tracysh deleted the tracysh/conv_clip_fusion branch July 17, 2019 19:16