-
Notifications
You must be signed in to change notification settings - Fork 209
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEA] Add method to copy device_buffer back to host memory #219
Conversation
This allows users to copy an `rmm::device_buffer` back to host memory given a pointer to the destination. It is up to the user to ensure they have allocated enough space to perform the copy and that the pointer is in fact on the host.
Make sure that we are able to use the `copy_to_host` method from Cython.
Allocate a `bytes` object that is uninitialized and then copy the data from the underlying `rmm::device_buffer` into the pointer held within `bytes`.
4189071
to
3d29c34
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Python side looks good. Would love to be able to use Pinned Host memory in the future but that's a future optimization and will likely involve some weird Python workarounds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, @jakirkham would you mind adding some unit tests for this on the Python side?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Python / Cython look good
As we need to perform synchronization at the Cython level, ensure we can call `cudaStreamSynchronize` by externing it.
0597262
to
f57934f
Compare
To ensure we have a valid `bytes` object with data in it, synchronize at the Cython level. Besides we no longer synchronize in C++. So we cannot rely on that any more.
f57934f
to
12a83af
Compare
Instead of taking a `cudaStream_t` typed `stream` value in `tobytes`, take a `uintptr_t` and cast it to `cudaStream_t` later. This way we can leverage Cython's knowledge of how to coerce Python objects to `uintptr_t`. Then it is a simple matter to cast the `uintptr_t` to `cudaStream_t` later.
12a83af
to
10601c9
Compare
Sorry for the churn. Think this is ready for another review @jrhemstad. Also @kkraus14 you may want to take another look as the Python side has changed a bit (though not dramatically). |
Co-Authored-By: Jake Hemstad <[email protected]>
rerun tests (Conda package failed to download from server in one build) |
rerun tests (Only the |
rerun tests (auto-complete 🙄) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One minor issue with how a stream synchronization error is reported here
Thanks all! 😄 |
* @param hb host allocated buffer to copy data to | ||
* @param stream CUDA stream on which the device to host copy will be performed | ||
*-------------------------------------------------------------------------**/ | ||
void copy_to_host(const device_buffer& db, void* hb, cudaStream_t stream = 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There were linking errors downstream as this was not declared inline
and the function source code was not in a .cpp
file. This has since been fixed with PR ( #230 ).
Fixes #216
Adds the functionality to
rmm::device_buffer
andrmm.DeviceBuffer
to copy data back from device memory to host memory. On the C++ side this copies data back to a user allocated buffer. On the Python side this copies data into a Pythonbytes
object.