Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ROCm] Fused convolution+bias+activation #9666

Closed
wants to merge 2 commits into from

Conversation

ekuznetsov139
Copy link
Contributor

This PR enables MIOpen convolution+bias+activation fusion for ROCm and updates fusion unit tests accordingly.

@github-actions github-actions bot added the kokoro:force-run Forces CI to rerun label Feb 20, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Forces CI to rerun label Feb 20, 2024
xla/stream_executor/rocm/rocm_dnn.cc Outdated Show resolved Hide resolved
xla/service/gpu/cudnn_fused_conv_rewriter.cc Show resolved Hide resolved
xla/service/gpu/cudnn_fused_conv_rewriter.cc Show resolved Hide resolved
xla/service/gpu/cudnn_fused_conv_rewriter_test.cc Outdated Show resolved Hide resolved
};
std::thread threads[4];
for(int i=0; i<4; i++)
threads[i] = std::thread(executor, i);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the motivation for the multi thread test? All threads run the same module, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would potentially flag race conditions due to MIOpen (or CuDNN) executing identical fused operations in different threads.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The race condition would need to make Run(...) return false for this test to catch it, which may or may not happen. Wouldn't running under ThreadSanetizer give you better chances of finding such issues? Also, this test class doesn't seem the right place to test MIOpen or cuDNN for thread safety, in my opinion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you reach an agreement about this with @thomasjoerg?
It looks like the test is still there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test is still there, but it was changed to use RunAndCompare instead of Run, so it would catch mismatches. I don't see any other places where backend implementations of fused convolutions are tested or could be tested for concurrency issues.

As it is, worst thing that might happen is that there might be an issue but the test won't catch it. I can certainly remove the test, but I'm not sure how dropping the test would improve matters.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests need to be maintained and in its current form this test will be hard to maintain. When this test breaks, the person looking at the breakage needs to right information to understand the issue.

  • A code comment should explain the motivation for this test, because it is different from the non-threaded tests around it.
  • Please explain where the race condition would occur. I'm not sure here - the threads don't seem to share anything and don't communicate.
  • This test does not test XLA's conv rewriter, but MIOpen / cuDNN, right? This is not obvious and deserves a comment too. Would XLA actually call convs on multiple threads (inter-op parallelism)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The threads don't share anything and they don't communicate, but, at least in the MIOpen case, stream executor will cache and reuse the same "fusion plan" handle for operations with identical parameters (https://github.com/openxla/xla/blob/main/xla/stream_executor/rocm/rocm_dnn.cc#L560), and potentially the same handle may be used in parallel in different threads. CuDNN may be doing something similar at some level.

Since this PR is almost 2 months old by now, rather than argue the point further, I'm going to drop the test. If race conditions are really possible, someone will have fun tracking them down the hard way.

@github-actions github-actions bot added the kokoro:force-run Forces CI to rerun label Feb 22, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Forces CI to rerun label Feb 22, 2024
@tdanyluk tdanyluk self-requested a review February 23, 2024 10:01
xla/service/gpu/cudnn_fused_conv_rewriter.cc Outdated Show resolved Hide resolved
xla/service/gpu/cudnn_fused_conv_rewriter_test.cc Outdated Show resolved Hide resolved
xla/stream_executor/rocm/rocm_dnn.cc Outdated Show resolved Hide resolved
xla/stream_executor/rocm/rocm_dnn.cc Outdated Show resolved Hide resolved
xla/stream_executor/rocm/rocm_dnn.cc Outdated Show resolved Hide resolved
xla/stream_executor/rocm/rocm_dnn.cc Show resolved Hide resolved
xla/tests/convolution_test.cc Outdated Show resolved Hide resolved
@github-actions github-actions bot added the kokoro:force-run Forces CI to rerun label Feb 24, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Forces CI to rerun label Feb 24, 2024
@github-actions github-actions bot added the kokoro:force-run Forces CI to rerun label Feb 24, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Forces CI to rerun label Feb 24, 2024
@github-actions github-actions bot added the kokoro:force-run Forces CI to rerun label Feb 25, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Forces CI to rerun label Feb 25, 2024
@github-actions github-actions bot added the kokoro:force-run Forces CI to rerun label Feb 25, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Forces CI to rerun label Feb 25, 2024
@ekuznetsov139
Copy link
Contributor Author

Bump

@kamaljeeti
Copy link
Contributor

Hi @tdanyluk , can you look into this once? Thanks.

Copy link
Contributor

@tdanyluk tdanyluk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fixes and sorry for the delay on this.

Could you fix these 2 macro/ifdef related issues?

xla/service/gpu/cudnn_fused_conv_rewriter_test.cc Outdated Show resolved Hide resolved
xla/service/gpu/cudnn_fused_conv_rewriter_test.cc Outdated Show resolved Hide resolved
@github-actions github-actions bot added the kokoro:force-run Forces CI to rerun label Mar 27, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Forces CI to rerun label Mar 27, 2024
@github-actions github-actions bot added the kokoro:force-run Forces CI to rerun label Apr 19, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Forces CI to rerun label Apr 19, 2024
@kamaljeeti
Copy link
Contributor

Hi @thomasjoerg , can you please look into this ? Thanks.

@github-actions github-actions bot added the kokoro:force-run Forces CI to rerun label May 8, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Forces CI to rerun label May 8, 2024
@github-actions github-actions bot added the kokoro:force-run Forces CI to rerun label May 13, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Forces CI to rerun label May 13, 2024
copybara-service bot pushed a commit that referenced this pull request May 15, 2024
Imported from GitHub PR #9666

This PR enables MIOpen convolution+bias+activation fusion for ROCm and updates fusion unit tests accordingly.
Copybara import of the project:

--
bd5be49 by Eugene Kuznetsov <[email protected]>:

Switch to NHWC for ROCm and F16

--
8792f89 by Eugene Kuznetsov <[email protected]>:

Fused convolution+bias+activation

Merging this change closes #9666

FUTURE_COPYBARA_INTEGRATE_REVIEW=#9666 from ROCm:ci_fused_conv 8792f89
PiperOrigin-RevId: 633859923
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request May 15, 2024
Imported from GitHub PR openxla/xla#9666

This PR enables MIOpen convolution+bias+activation fusion for ROCm and updates fusion unit tests accordingly.
Copybara import of the project:

--
bd5be494abe6621dfd2c4ebcca4f8992077d8a89 by Eugene Kuznetsov <[email protected]>:

Switch to NHWC for ROCm and F16

--
8792f892770aba54e38a5352d9bec1d003e341d3 by Eugene Kuznetsov <[email protected]>:

Fused convolution+bias+activation

Merging this change closes #9666

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#9666 from ROCm:ci_fused_conv 8792f892770aba54e38a5352d9bec1d003e341d3
PiperOrigin-RevId: 633859923
copybara-service bot pushed a commit that referenced this pull request May 15, 2024
Imported from GitHub PR #9666

This PR enables MIOpen convolution+bias+activation fusion for ROCm and updates fusion unit tests accordingly.
Copybara import of the project:

--
bd5be49 by Eugene Kuznetsov <[email protected]>:

Switch to NHWC for ROCm and F16

--
8792f89 by Eugene Kuznetsov <[email protected]>:

Fused convolution+bias+activation

Merging this change closes #9666

FUTURE_COPYBARA_INTEGRATE_REVIEW=#9666 from ROCm:ci_fused_conv 8792f89
PiperOrigin-RevId: 633859923
copybara-service bot pushed a commit that referenced this pull request May 15, 2024
Imported from GitHub PR #9666

This PR enables MIOpen convolution+bias+activation fusion for ROCm and updates fusion unit tests accordingly.
Copybara import of the project:

--
bd5be49 by Eugene Kuznetsov <[email protected]>:

Switch to NHWC for ROCm and F16

--
8792f89 by Eugene Kuznetsov <[email protected]>:

Fused convolution+bias+activation

Merging this change closes #9666

FUTURE_COPYBARA_INTEGRATE_REVIEW=#9666 from ROCm:ci_fused_conv 8792f89
PiperOrigin-RevId: 633859923
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request May 15, 2024
Imported from GitHub PR openxla/xla#9666

This PR enables MIOpen convolution+bias+activation fusion for ROCm and updates fusion unit tests accordingly.
Copybara import of the project:

--
bd5be494abe6621dfd2c4ebcca4f8992077d8a89 by Eugene Kuznetsov <[email protected]>:

Switch to NHWC for ROCm and F16

--
8792f892770aba54e38a5352d9bec1d003e341d3 by Eugene Kuznetsov <[email protected]>:

Fused convolution+bias+activation

Merging this change closes #9666

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#9666 from ROCm:ci_fused_conv 8792f892770aba54e38a5352d9bec1d003e341d3
PiperOrigin-RevId: 633859923
copybara-service bot pushed a commit that referenced this pull request May 15, 2024
Imported from GitHub PR #9666

This PR enables MIOpen convolution+bias+activation fusion for ROCm and updates fusion unit tests accordingly.
Copybara import of the project:

--
bd5be49 by Eugene Kuznetsov <[email protected]>:

Switch to NHWC for ROCm and F16

--
8792f89 by Eugene Kuznetsov <[email protected]>:

Fused convolution+bias+activation

Merging this change closes #9666

FUTURE_COPYBARA_INTEGRATE_REVIEW=#9666 from ROCm:ci_fused_conv 8792f89
PiperOrigin-RevId: 633859923
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request May 15, 2024
Imported from GitHub PR openxla/xla#9666

This PR enables MIOpen convolution+bias+activation fusion for ROCm and updates fusion unit tests accordingly.
Copybara import of the project:

--
bd5be494abe6621dfd2c4ebcca4f8992077d8a89 by Eugene Kuznetsov <[email protected]>:

Switch to NHWC for ROCm and F16

--
8792f892770aba54e38a5352d9bec1d003e341d3 by Eugene Kuznetsov <[email protected]>:

Fused convolution+bias+activation

Merging this change closes #9666

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#9666 from ROCm:ci_fused_conv 8792f892770aba54e38a5352d9bec1d003e341d3
PiperOrigin-RevId: 633859923
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request May 15, 2024
Imported from GitHub PR openxla/xla#9666

This PR enables MIOpen convolution+bias+activation fusion for ROCm and updates fusion unit tests accordingly.
Copybara import of the project:

--
bd5be494abe6621dfd2c4ebcca4f8992077d8a89 by Eugene Kuznetsov <[email protected]>:

Switch to NHWC for ROCm and F16

--
8792f892770aba54e38a5352d9bec1d003e341d3 by Eugene Kuznetsov <[email protected]>:

Fused convolution+bias+activation

Merging this change closes #9666

PiperOrigin-RevId: 633923105
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request May 15, 2024
…strategy scope.

In some cases `v._distribute_strategy` can be None, which was causing an AttributeError when trying to check if the `extended` field matched the expected value.

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#9666 from ROCm:ci_fused_conv 8792f892770aba54e38a5352d9bec1d003e341d3
PiperOrigin-RevId: 633574887
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request May 15, 2024
The OneDNN matmul rewriter starts a thread pool if one wasn't provided as a compile option, spawning many threads per compilation. If we pass a preexisting thread pool, we can save the cost of creating and tearing down threads.

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#9666 from ROCm:ci_fused_conv 8792f892770aba54e38a5352d9bec1d003e341d3
PiperOrigin-RevId: 633926908
i-chaochen added a commit to ROCm/tensorflow-upstream that referenced this pull request May 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants