[MetaSchedule] Fix Dynamic Loop from AutoBinding #13421

Merged (8 commits) on Nov 19, 2022

Conversation

zxybazh (Member) commented Nov 17, 2022

This PR fixes an issue where applying reorder incorrectly during auto binding could produce a dynamic loop, which in turn caused CUDA kernel generation to fail. The fix is to change the loop to fuse blocks.
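As a rough illustration of the general fuse-then-bind idea (plain Python, not the code changed in this PR): fusing loops whose extents are known constants yields a single loop with a constant extent, which can then be split into block and thread indices without any extent depending on another loop variable. The names `fuse_and_split`, `unfuse`, and `iterate`, and the sizes used, are hypothetical and chosen only for this sketch.

```python
from math import prod


def fuse_and_split(extents, threads_per_block):
    """Fuse loops with static extents, then split into (num_blocks, num_threads).

    Hypothetical helper: both returned extents are constants because they
    depend only on the static input extents.
    """
    fused = prod(extents)
    num_blocks = (fused + threads_per_block - 1) // threads_per_block  # ceil division
    return num_blocks, threads_per_block


def unfuse(fused_index, extents):
    """Recover the per-loop indices from a fused index (row-major order)."""
    indices = []
    for extent in reversed(extents):
        indices.append(fused_index % extent)
        fused_index //= extent
    return tuple(reversed(indices))


def iterate(extents, threads_per_block):
    """Walk the iteration space the way a (block, thread) binding would."""
    fused = prod(extents)
    num_blocks, num_threads = fuse_and_split(extents, threads_per_block)
    for block_idx in range(num_blocks):
        for thread_idx in range(num_threads):
            f = block_idx * num_threads + thread_idx
            if f < fused:  # guard for the padded tail of the last block
                yield unfuse(f, extents)
```

For example, `list(iterate((2, 3), 4))` visits all six points of a 2x3 iteration space using two blocks of four threads, with the last two thread slots masked out by the guard.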

A test is also added and was run locally on the `nvidia/geforce-rtx-3070` target.

Thanks to @junrushao for the guidance and @yelite for reporting this issue.

tvm-bot (Collaborator) commented Nov 17, 2022

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

Generated by tvm-bot

@zxybazh zxybazh marked this pull request as ready for review November 17, 2022 20:10
zxybazh (Member, Author) commented Nov 17, 2022

CC @junrushao @Hzfengsy

zxybazh (Member, Author) commented Nov 19, 2022

The test is now integrated into the CUDA winograd search space. Please take a second look, thanks!

junrushao (Member) left a comment


LGTM!

@junrushao junrushao merged commit 24b7d9f into apache:main Nov 19, 2022
xinetzone pushed a commit to daobook/tvm that referenced this pull request Nov 25, 2022