Building CuPy with PTDS enabled #3755
Comments
Hint for future readers: PTDS = per-thread default stream.
I haven't tested this, but it seems to conflict with CuPy's default stream mechanism?
Is it possible to override, in general, which stream CuPy uses?
I guess if we can locate all places that default the stream to
ref: numba/numba#5137
Right for Thrust and CUB, but the other kernels are compiled at runtime using NVRTC, which does not seem to support a PTDS option.
I thought we also need to pass
Good point... It would be nice to get confirmation of this.
I asked offline and was told PTDS should work with NVRTC. https://docs.nvidia.com/cuda/cuda-driver-api/stream-sync-behavior.html#stream-sync-behavior
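For readers following the link: the CUDA driver API defines two special stream handles in `cuda.h`, `CU_STREAM_LEGACY` (0x1) and `CU_STREAM_PER_THREAD` (0x2), so a runtime-compiled kernel can in principle be launched on the per-thread default stream by passing that handle explicitly. Below is a minimal sketch of how a single "default stream" lookup could switch between the two. The helper `default_stream_handle` and the `USE_PTDS` environment variable are hypothetical names made up for illustration; this is not CuPy's actual implementation.

```python
import os

# Special stream handles as defined in cuda.h (CUDA driver API).
CU_STREAM_LEGACY = 0x1      # legacy default stream, synchronizes with other streams
CU_STREAM_PER_THREAD = 0x2  # per-thread default stream (PTDS)


def default_stream_handle(env=None):
    """Return the handle to use wherever no explicit stream is given.

    Hypothetical helper: `USE_PTDS` is an assumed, illustrative variable
    name, not an option taken from this thread.
    """
    env = os.environ if env is None else env
    use_ptds = env.get("USE_PTDS", "0") == "1"
    return CU_STREAM_PER_THREAD if use_ptds else CU_STREAM_LEGACY


if __name__ == "__main__":
    # Prints the handle that would be passed to a kernel launch.
    print(hex(default_stream_handle()))
```

Routing every "no stream given" case through one function like this is one way to avoid hunting down each call site individually; whether that is practical for CuPy is exactly the open question here.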
cc @pentschev (for vis)
@jakirkham Can this be closed now that #4322 is merged?
Is it possible to build CuPy with PTDS enabled? What issues (if any) would one encounter when trying this?
ref: https://developer.nvidia.com/blog/gpu-pro-tip-cuda-7-streams-simplify-concurrency/