Add device_uvector::reserve and device_buffer::reserve #1079
Seems like a reasonable addition. One comment / question about making it a little cleaner.
include/rmm/device_buffer.hpp
Outdated
auto tmp = device_buffer{new_size, stream, _mr};
RMM_CUDA_TRY(
cudaMemcpyAsync(new_data, data(), size(), cudaMemcpyDefault, this->stream().value()));
deallocate_async();
_data = new_data;
_size = new_size;
_capacity = new_size;
cudaMemcpyAsync(tmp.data(), data(), size(), cudaMemcpyDefault, this->stream().value()));
std::swap(tmp, *this);
I could have sworn I had a reason for not using CAS here originally, but now I'm struggling to remember what it could have been.
Oh well ¯_(ツ)_/¯
Does this work correctly without adding/specializing a swap function for `device_buffer`?
I also have this question. From what I can tell `std::swap` should be safe, but I may be missing something. Also is this "CAS"? There's no comparison, just a swap, right?
> Also is this "CAS"?
Sorry, overloaded acronym. "copy and swap"
> Does this work correctly without adding/specializing a swap function for device_buffer?
Yeah, the default implementation of `swap` will just use the move ctor: https://stackoverflow.com/a/25286610/11341974
So I read the link about the behavior of swap, and that all seems fine. But could we just move-assign the new buffer to this instance instead of swapping?
Should be fine as well. No need to move around the old buffer ;) Changed it to std::move
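The two options discussed in this thread can be tried out on the host. The sketch below is only an illustration: `host_buffer` is a hypothetical stand-in that uses `new[]`/`delete[]` and `std::memcpy` in place of the memory resource and `cudaMemcpyAsync`, and none of it is the actual RMM code. It shows that the unspecialized `std::swap` works for any type with a move constructor and move assignment operator, and that `reserve` can be written as "allocate a temporary, copy, move-assign".

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical host-memory stand-in for a device buffer (NOT the RMM API).
class host_buffer {
 public:
  host_buffer() = default;
  explicit host_buffer(std::size_t size)
    : _data{size > 0 ? new char[size] : nullptr}, _size{size}, _capacity{size}
  {
  }
  ~host_buffer() { delete[] _data; }

  host_buffer(host_buffer const&)            = delete;
  host_buffer& operator=(host_buffer const&) = delete;

  host_buffer(host_buffer&& other) noexcept
    : _data{other._data}, _size{other._size}, _capacity{other._capacity}
  {
    other._data     = nullptr;
    other._size     = 0;
    other._capacity = 0;
  }

  host_buffer& operator=(host_buffer&& other) noexcept
  {
    if (&other != this) {
      delete[] _data;  // release the old allocation before adopting the new one
      _data           = other._data;
      _size           = other._size;
      _capacity       = other._capacity;
      other._data     = nullptr;
      other._size     = 0;
      other._capacity = 0;
    }
    return *this;
  }

  // Grow the capacity; size and contents are preserved.
  void reserve(std::size_t new_capacity)
  {
    if (new_capacity <= _capacity) { return; }
    auto tmp            = host_buffer{new_capacity};
    auto const old_size = _size;
    if (old_size > 0) { std::memcpy(tmp._data, _data, old_size); }
    *this = std::move(tmp);  // move-assign instead of std::swap(tmp, *this)
    _size = old_size;        // the move took tmp's size, so restore ours
  }

  char* data() { return _data; }
  std::size_t size() const { return _size; }
  std::size_t capacity() const { return _capacity; }

 private:
  char* _data{nullptr};
  std::size_t _size{0};
  std::size_t _capacity{0};
};
```

Calling `std::swap(a, b)` on two `host_buffer` objects compiles without any user-provided `swap`, because the generic implementation only needs the two move operations. With move assignment, the temporary is left empty and its destructor has nothing to free, which is why there is "no need to move around the old buffer".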
auto tmp = device_buffer{new_capacity, stream, _mr};
auto const old_size = size();
RMM_CUDA_TRY(cudaMemcpyAsync(tmp.data(), data(), size(), cudaMemcpyDefault, stream.value()));
*this = std::move(tmp);
If you just move `tmp` over `*this`, is the old memory of `this` properly deallocated? I want to ensure there is no memory leak. With `swap`, it's obvious to me there is no leak because `tmp` is on the stack and when it goes out of scope it will be destroyed, taking the old memory of `this` with it.
> is the old memory of this properly deallocated?
Yep, it will invoke the move assignment operator of `device_buffer` which will deallocate the original memory.
rmm/include/rmm/device_buffer.hpp
Lines 194 to 223 in be9b9a9
/**
 * @brief Move assignment operator moves the contents from `other`.
 *
 * This `device_buffer`'s current device memory allocation will be deallocated
 * on `stream()`.
 *
 * If a different stream is required, call `set_stream()` on
 * the instance before assignment. After assignment, this instance's stream is
 * replaced by the `other.stream()`.
 *
 * @param other The `device_buffer` whose contents will be moved.
 */
device_buffer& operator=(device_buffer&& other) noexcept
{
  if (&other != this) {
    deallocate_async();
    _data = other._data;
    _size = other._size;
    _capacity = other._capacity;
    set_stream(other.stream());
    _mr = other._mr;
    other._data = nullptr;
    other._size = 0;
    other._capacity = 0;
    other.set_stream(cuda_stream_view{});
  }
  return *this;
}
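The no-leak argument can be checked with a small host-only experiment. This is a sketch under stated assumptions: `tracked_buffer` and `live_allocations` are hypothetical names invented for the illustration, and plain `new[]`/`delete[]` stands in for the asynchronous allocation and `deallocate_async()` shown above. The type counts live allocations so that the effect of move assignment on the old memory becomes observable.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical global counter of allocations that have not been freed yet.
int& live_allocations()
{
  static int count = 0;
  return count;
}

// Hypothetical toy buffer (NOT the RMM type) that tracks its allocation.
class tracked_buffer {
 public:
  explicit tracked_buffer(std::size_t size) : _data{new char[size]}, _size{size}
  {
    ++live_allocations();
  }
  ~tracked_buffer() { release(); }

  tracked_buffer(tracked_buffer&& other) noexcept : _data{other._data}, _size{other._size}
  {
    other._data = nullptr;
    other._size = 0;
  }

  tracked_buffer& operator=(tracked_buffer&& other) noexcept
  {
    if (&other != this) {
      release();  // plays the role of deallocate_async() in the snippet above
      _data       = other._data;
      _size       = other._size;
      other._data = nullptr;
      other._size = 0;
    }
    return *this;
  }

  std::size_t size() const { return _size; }

 private:
  void release()
  {
    if (_data != nullptr) {
      delete[] _data;
      --live_allocations();
      _data = nullptr;
    }
  }

  char* _data{nullptr};
  std::size_t _size{0};
};

// Returns the number of live allocations right after `buf = std::move(tmp)`
// and after `tmp` has gone out of scope.
int live_after_move_assign()
{
  tracked_buffer buf{16};
  {
    tracked_buffer tmp{64};
    buf = std::move(tmp);  // frees the 16-byte allocation, adopts the 64-byte one
  }                        // tmp is empty here, so its destructor frees nothing
  return live_allocations();
}
```

In this toy model exactly one allocation is still alive after the move assignment (the one now owned by `buf`), and the count drops back to zero once `buf` itself is destroyed.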
I agree the move is safe. However I have a question about streams. Should the old buffer be destroyed on the same stream as the copy (to ensure the copy is complete), or the stream the old buffer was constructed with (current behavior)? Should the copy always occur on the stream used to construct the original buffer to ensure that the reserve sequences after the constructor’s allocation? (I don’t remember seeing an explicit sync but need to look again.) Is this solved by the call to `set_stream`?
> Is this solved by the call to set_stream?
Yes. Everything here is happening all on the provided `stream` argument.
It would be an error to construct a `device_buffer` on `s1` and reserve on `s2` without first doing a synchronization between `s1` and `s2`.
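Why a single stream is enough can be pictured with a toy model. The following is a rough host-only sketch, not CUDA code: `toy_stream` is a hypothetical FIFO queue of callbacks that mimics only one property of a real stream, namely that work submitted to the same stream runs in submission order. Under that assumption, the copy into the new allocation is always finished before the old allocation is freed.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a stream: work runs strictly in enqueue order.
struct toy_stream {
  std::queue<std::function<void()>> work;

  void enqueue(std::function<void()> task) { work.push(std::move(task)); }

  // Run everything that has been enqueued so far, oldest first.
  void synchronize()
  {
    while (!work.empty()) {
      work.front()();
      work.pop();
    }
  }
};

// Enqueue the steps of a reserve on one stream and return the order in which
// they actually ran.
std::vector<std::string> reserve_on_one_stream()
{
  std::vector<std::string> log;
  toy_stream stream;
  stream.enqueue([&log] { log.push_back("allocate old"); });     // constructor
  stream.enqueue([&log] { log.push_back("allocate new"); });     // tmp in reserve
  stream.enqueue([&log] { log.push_back("copy old -> new"); });  // memcpy
  stream.enqueue([&log] { log.push_back("free old"); });         // move assignment
  stream.synchronize();
  return log;
}
```

If the construction and the reserve were enqueued on two different queues, nothing in this model would order "allocate old" before "copy old -> new", which corresponds to the error case described above.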
Fantastic. I wanted to be sure I understood that correctly. Approving now.
Thanks all. Sounds right.
No reservations from me! Thanks @upsj!
Thanks @upsj !
@gpucibot merge
I am building a parser that outputs variable-sized blocks of data. To collect them, I would like to use pre-allocated `device_uvector`s, using `size()` to keep track of how much memory is already in use. Setting the capacity and size manually works at the moment by calling `vec.resize(capacity, stream); vec.resize(size, stream);` on an empty vector, but this seems unnecessarily complicated. Since `device_uvector` otherwise already closely matches the `std::vector` interface, I want to propose adding `reserve` to the interface.
TODO: