Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEA] Add method to copy device_buffer back to host memory #219

Merged
merged 37 commits into from
Jan 7, 2020

Conversation

jakirkham
Copy link
Member

Fixes #216

Adds the functionality to rmm::device_buffer and rmm.DeviceBuffer to copy data back from device memory to host memory. On the C++ side this copies data back to a user allocated buffer. On the Python side this copies data into a Python bytes object.

This allows users to copy an `rmm::device_buffer` back to host memory
given a pointer to the destination. It is up to the user to ensure they
have allocated enough space to perform the copy and that the pointer is
in fact on the host.
Make sure that we are able to use the `copy_to_host` method from Cython.
Allocate a `bytes` object that is uninitialized and then copy the data
from the underlying `rmm::device_buffer` into the pointer held within
`bytes`.
include/rmm/device_buffer.hpp Outdated Show resolved Hide resolved
Copy link
Contributor

@kkraus14 kkraus14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python side looks good. Would love to be able to use Pinned Host memory in the future but that's a future optimization and will likely involve some weird Python workarounds.

@kkraus14 kkraus14 added the 3 - Ready for review Ready for review by team label Jan 2, 2020
Copy link
Contributor

@kkraus14 kkraus14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, @jakirkham would you mind adding some unit tests for this on the Python side?

include/rmm/device_buffer.hpp Outdated Show resolved Hide resolved
include/rmm/device_buffer.hpp Show resolved Hide resolved
include/rmm/device_buffer.hpp Outdated Show resolved Hide resolved
Copy link
Contributor

@kkraus14 kkraus14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python / Cython look good

As we need to perform synchronization at the Cython level, ensure we can
call `cudaStreamSynchronize` by externing it.
@jakirkham jakirkham force-pushed the add_devbuf_tobytes branch 2 times, most recently from 0597262 to f57934f Compare January 7, 2020 19:09
To ensure we have a valid `bytes` object with data in it, synchronize at
the Cython level. Besides we no longer synchronize in C++. So we cannot
rely on that any more.
Instead of taking a `cudaStream_t` typed `stream` value in `tobytes`,
take a `uintptr_t` and cast it to `cudaStream_t` later. This way we can
leverage Cython's knowledge of how to coerce Python objects to
`uintptr_t`. Then it is a simple matter to cast the `uintptr_t` to
`cudaStream_t` later.
@jakirkham
Copy link
Member Author

Sorry for the churn. Think this is ready for another review @jrhemstad. Also @kkraus14 you may want to take another look as the Python side has changed a bit (though not dramatically).

@jakirkham
Copy link
Member Author

rerun tests

(Conda package failed to download from server in one build)

@jakirkham
Copy link
Member Author

jakirkham commented Jan 7, 2020

rerun tests

(Only the pr-builder job failed, but all the builds actually passed)

@jakirkham
Copy link
Member Author

rerun tests

(auto-complete 🙄)

Copy link
Contributor

@kkraus14 kkraus14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor issue with how a stream synchronization error is reported here

@kkraus14 kkraus14 merged commit 5d3c17f into rapidsai:branch-0.12 Jan 7, 2020
@jakirkham jakirkham deleted the add_devbuf_tobytes branch January 7, 2020 23:03
@jakirkham
Copy link
Member Author

Thanks all! 😄

* @param hb host allocated buffer to copy data to
* @param stream CUDA stream on which the device to host copy will be performed
*-------------------------------------------------------------------------**/
void copy_to_host(const device_buffer& db, void* hb, cudaStream_t stream = 0) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were linking errors downstream as this was not declared inline and the function source code was not in a .cpp file. This has since been fixed with PR ( #230 ).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3 - Ready for review Ready for review by team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] Method to copy device_buffer to host
4 participants