
support GPU inference in torchscript model for v2.5 / v2.6 #188

Merged · 13 commits merged into master · May 18, 2021

Conversation

taoleicn
Contributor

@taoleicn taoleicn commented May 12, 2021

This PR works for the master branch and the v2.5 and v2.6 releases.

A non-trivial PR to support GPU inference in TorchScript:

  • Load CUDA kernels as non-Python modules; this is needed for TorchScript compilation.
  • Refactored the CUDA APIs as functions that return their outputs as tensors, instead of procedures that modify passed-in tensors.
  • Added a workaround for the case where TorchScript tries to locate and compile CUDA methods on machines that don't have CUDA / GPUs.
  • The refactored code passes the forward() & backward() tests.
  • I also checked that the outputs are the same for the non-TorchScript and TorchScript versions of the same model.
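The load-guard and function-style refactor described in the bullets above could be sketched roughly like this (a minimal illustration only; the module and function names below are hypothetical, not SRU's actual API):

```python
# Hedged sketch of the two patterns from the PR description.
# All names here are illustrative, not SRU's real identifiers.

# Pattern 1: guard the CUDA kernel import so that TorchScript
# compilation on a CPU-only machine does not fail outright.
def load_cuda_kernels():
    """Return the compiled CUDA kernel module, or None on CPU-only hosts."""
    try:
        import sru_cuda_kernels  # hypothetical compiled extension module
    except ImportError:
        return None
    return sru_cuda_kernels

# Pattern 2: expose kernels as functions that *return* their outputs,
# rather than procedures that write into caller-provided buffers.
# Pure, output-returning functions are generally easier for
# TorchScript to compile and reason about.
def fused_add(a, b):
    # returns a fresh output instead of mutating a passed-in tensor/buffer
    return [x + y for x, y in zip(a, b)]
```

On a machine without the (hypothetical) compiled extension, `load_cuda_kernels()` returns `None`, and callers can fall back to a CPU code path.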

@taoleicn taoleicn self-assigned this May 12, 2021
@taoleicn taoleicn requested review from hpasapp and cdfox-asapp May 12, 2021 15:06
@taoleicn taoleicn changed the title from "support GPU on torchscript for v2.5 / v2.6" to "support GPU inference in torchscript model for v2.5 / v2.6" May 12, 2021
sru/version.py Outdated
@@ -1 +1 @@
-__version__ = '2.5.1'
+__version__ = '2.6.0.dev2'
Collaborator


oh is this 2.x? for some reason I assumed it was 3.x. Though it does say 2.x in the title actually, now that I look ... :)

Collaborator


oh, there are two PRs :P

@hpasapp
Collaborator

hpasapp commented May 14, 2021

What pytest tests do we have for the code affected by these changes? [edit: or at least an automated test script, such as a bash script, that I could run by hand] I understand we have to run such tests by hand, since CCIE doesn't do that for us.

@taoleicn
Contributor Author

> What pytest tests do we have for the code affected by these changes? [edit: or at least an automated test script, such as a bash script, that I could run by hand] I understand we have to run such tests by hand, since CCIE doesn't do that for us.

@hpasapp The existing unit tests should cover most of the code paths for non-CUDA usage. I also modified the C++ test to ensure the changes work for C++ and for CPU inference.

@taoleicn
Contributor Author

Also, @cdfox-asapp feels we don't have to merge this PR into master, since we need to merge the 3.0.0 dev branch into master anyway. The changes are identical to those in the PR for v3.0.0. I can make a release from this branch for Chris's project.
What do you think, @hpasapp?

@hpasapp
Collaborator

hpasapp commented May 14, 2021

What's the advantage of not merging this branch into master? (Not merging means an additional branch to maintain and reason about.)

@taoleicn
Contributor Author

🎉 thanks @hpasapp @cdfox-asapp !

@taoleicn taoleicn merged commit a698784 into master May 18, 2021