[REVIEW] New stream-ordered suballocator resources (PTDS support) #449
…y_resource interface.
…locator_memory_resource
Reviewers: I could use feedback on whether my use of move semantics is correct and consistent (and where I missed it). There are some other new (to me) C++ things in here that I used, like emplacing in a |
Please update the changelog in order to start CI tests. View the gpuCI docs here.
…everything in hybrid.
Fixed. But now I'm wondering if using CRTP rather than virtual functions would be faster. It's really too bad profiling CPU code is so hard. Edit: the more I think about it, the more CRTP makes sense for these classes (runtime polymorphism isn't really needed), but that should be a follow-up.
I removed the virtual inheritance in favor of CRTP in the |
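For readers following the CRTP discussion, here is a minimal sketch of the two dispatch styles being compared. All names in it (`virtual_base`, `crtp_base`, `virtual_pool`, `crtp_pool`, `align_up`, the 256-byte alignment) are hypothetical and are not the classes in this PR; `allocate` just returns the rounded-up size so the example stays self-contained.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical alignment shared by both variants below.
constexpr std::size_t alignment = 256;

// Round a positive byte count up to the next multiple of `alignment`.
inline std::size_t align_up(std::size_t bytes)
{
  assert(bytes > 0);
  return (bytes + alignment - 1) / alignment * alignment;
}

// Variant 1: runtime polymorphism. The call to do_allocate goes through the vtable.
class virtual_base {
 public:
  virtual ~virtual_base() = default;
  std::size_t allocate(std::size_t bytes) { return do_allocate(align_up(bytes)); }

 private:
  virtual std::size_t do_allocate(std::size_t bytes) = 0;
};

class virtual_pool final : public virtual_base {
 public:
  std::size_t total{0};  // running total of bytes handed out

 private:
  std::size_t do_allocate(std::size_t bytes) override
  {
    total += bytes;
    return bytes;
  }
};

// Variant 2: CRTP. The base knows the derived type at compile time, so the call
// to do_allocate is resolved statically and can be inlined.
template <typename Derived>
class crtp_base {
 public:
  std::size_t allocate(std::size_t bytes)
  {
    return static_cast<Derived*>(this)->do_allocate(align_up(bytes));
  }
};

class crtp_pool : public crtp_base<crtp_pool> {
 public:
  std::size_t do_allocate(std::size_t bytes)
  {
    total += bytes;
    return bytes;
  }
  std::size_t total{0};  // running total of bytes handed out
};
```

Both variants share the common `allocate` logic in the base class. The trade-off is that `crtp_base<crtp_pool>` is a distinct type per derived class, so CRTP objects cannot be stored behind one common base pointer the way `virtual_base` objects can.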
This overall looks great. Love the use of algorithms and const :)
include/rmm/mr/device/detail/stream_ordered_memory_resource.hpp
The shared_ptr is a much nicer solution to the lifetime of the events.
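To illustrate the point about event lifetime, here is a minimal sketch of shared ownership with a custom deleter. It assumes a stand-in `fake_event` type instead of a real `cudaEvent_t` so that it compiles without CUDA; `make_event`, `tracked_block`, and `live_events` are hypothetical names, not the PR's code.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Stand-in for a CUDA event handle (assumption: a real version would wrap cudaEvent_t).
struct fake_event {
  int id;
};

// Counts events that have been created but not yet destroyed.
int live_events = 0;

// Create an event whose destruction is tied to the last shared_ptr that refers to it.
std::shared_ptr<fake_event> make_event(int id)
{
  ++live_events;
  return std::shared_ptr<fake_event>(new fake_event{id}, [](fake_event* e) {
    --live_events;  // a real deleter would call cudaEventDestroy here
    delete e;
  });
}

// A free block that remembers the event which must complete before the block is reused.
struct tracked_block {
  void* ptr;
  std::shared_ptr<fake_event> ready;
};
```

Because every block holds its own `shared_ptr` copy, the event is destroyed exactly once, when the last block referring to it goes away, and no separate bookkeeping of event owners is needed.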
Looks great!
struct split_block {
  void* allocated_pointer;  ///< The pointer allocated from a block
  block_type remainder;     ///< The remainder of the block from which the pointer was allocated
};
❤️ 👍
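As a sketch of how a return type like `split_block` might be used, the following splits the front of a free block off for an allocation and returns whatever is left. The `block_type` definition and the `split` function here are assumptions made for illustration; the PR's actual `block_type` is not shown in this conversation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical block descriptor: a start address and a size in bytes.
struct block_type {
  char* ptr;
  std::size_t size;
  bool is_valid() const { return ptr != nullptr; }
};

struct split_block {
  void* allocated_pointer;  ///< The pointer allocated from a block
  block_type remainder;     ///< The remainder of the block from which the pointer was allocated
};

// Carve `bytes` off the front of `b`. If the block is consumed exactly,
// the remainder is an invalid (null, zero-sized) block.
inline split_block split(block_type const& b, std::size_t bytes)
{
  assert(b.is_valid() && bytes <= b.size);
  if (bytes == b.size) { return {b.ptr, block_type{nullptr, 0}}; }
  return {b.ptr, block_type{b.ptr + bytes, b.size - bytes}};
}
```

Returning both values in one named struct lets the caller hand out `allocated_pointer` and decide separately whether `remainder` goes back into a free list.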
This PR is bigger than originally planned, but it achieves quite a lot.
- free_list abstract interface, and two implementations: fixed_size_free_list (for fixed-size block allocators) and coalescing_free_list (for coalescing pool allocators).
- stream_ordered_suballocator_memory_resource
- fixed_size_memory_resource and pool_memory_resource in terms of the above classes.
- thread_safe_resource_adapter.
TODO:
- fixed_multisize_memory_resource is thread safe without the adapter.
Update: fixed_multisize_memory_resource and hybrid_memory_resource allocate/deallocate functions are thread safe as long as their upstream memory resources are thread safe. All currently available upstream resources in RMM are thread-safe after this PR. Therefore there is no longer a current need for thread_safe_resource_adapter.
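For context on what a coalescing free list does, here is a minimal sketch under stated assumptions: blocks are kept sorted by address, and a freed block is merged with any neighbor it touches. The `block` struct, the class name `coalescing_list_sketch`, and the first-fit policy in `get_block` are all hypothetical and are not taken from the PR's `coalescing_free_list`.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <iterator>
#include <list>

// Hypothetical block descriptor: a start address and a size in bytes.
struct block {
  char* ptr;
  std::size_t size;
};

class coalescing_list_sketch {
 public:
  // Insert a freed block, merging it with adjacent free blocks where possible.
  void insert(block b)
  {
    assert(b.ptr != nullptr && b.size > 0);
    // Find the first free block that starts after b.
    auto next = blocks_.begin();
    while (next != blocks_.end() && std::less<char*>{}(next->ptr, b.ptr)) { ++next; }

    if (next != blocks_.begin()) {
      auto prev = std::prev(next);
      if (prev->ptr + prev->size == b.ptr) {
        prev->size += b.size;  // b directly follows prev: grow prev
        if (next != blocks_.end() && prev->ptr + prev->size == next->ptr) {
          prev->size += next->size;  // the grown block now touches next: absorb it
          blocks_.erase(next);
        }
        return;
      }
    }
    if (next != blocks_.end() && b.ptr + b.size == next->ptr) {
      next->ptr = b.ptr;  // b directly precedes next: extend next backwards
      next->size += b.size;
      return;
    }
    blocks_.insert(next, b);  // no neighbor to merge with
  }

  // First-fit: return a block of exactly `bytes`, or {nullptr, 0} if none fits.
  block get_block(std::size_t bytes)
  {
    for (auto it = blocks_.begin(); it != blocks_.end(); ++it) {
      if (it->size >= bytes) {
        block out{it->ptr, bytes};
        if (it->size == bytes) {
          blocks_.erase(it);
        } else {
          it->ptr += bytes;
          it->size -= bytes;
        }
        return out;
      }
    }
    return block{nullptr, 0};
  }

  std::size_t size() const { return blocks_.size(); }

 private:
  std::list<block> blocks_;  // free blocks, sorted by address
};
```

A fixed-size free list needs none of this merging, since every block has the same size and can simply be pushed and popped, which is one reason to put the two behind a common interface.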