
Modify make_host_vector and make_device_uvector factories to optionally use pinned memory and kernel copy #16206

Merged
Changes from 126 commits
Commits
129 commits
eb39019
remove pinned_host_vector
vuule May 30, 2024
24b1245
switch to host_device resource ref
vuule May 30, 2024
6c896f6
rebrand host memory resource
vuule May 31, 2024
0048c59
style
vuule May 31, 2024
1964523
java update because breaking
vuule May 31, 2024
f871ca0
Merge branch 'branch-24.08' of https://github.com/rapidsai/cudf into …
vuule May 31, 2024
ac0ce9c
java fix
vuule May 31, 2024
b610ba3
Merge branch 'branch-24.08' of https://github.com/rapidsai/cudf into …
vuule May 31, 2024
ab36162
move test out of io util
vuule May 31, 2024
69a1bce
Merge branch 'branch-24.08' of https://github.com/rapidsai/cudf into …
vuule Jun 3, 2024
83f665a
missed rename
vuule Jun 3, 2024
659cabc
Merge branch 'branch-24.08' into fea-pinned-vector-factory
vuule Jun 3, 2024
c1ae478
update benchmark changes
vuule Jun 3, 2024
b1a1582
Merge branch 'fea-pinned-vector-factory' of https://github.com/vuule/…
vuule Jun 3, 2024
707dfc7
Merge branch 'branch-24.08' into fea-pinned-vector-factory
vuule Jun 3, 2024
1c09d0c
rename rmm_host_vector
vuule Jun 4, 2024
c343c31
remove do_xyz
vuule Jun 4, 2024
25ddc4f
Merge branch 'fea-pinned-vector-factory' of https://github.com/vuule/…
vuule Jun 4, 2024
3fc988b
Merge branch 'branch-24.08' of https://github.com/rapidsai/cudf into …
vuule Jun 4, 2024
50f4d3e
comment
vuule Jun 4, 2024
8dfbd07
Merge branch 'branch-24.08' of https://github.com/rapidsai/cudf into …
vuule Jun 4, 2024
e429840
Merge branch 'fea-pinned-vector-factory' into fea-smart-copy
vuule Jun 4, 2024
e5af490
works
vuule Jun 5, 2024
9082ccc
include style
vuule Jun 5, 2024
054a98a
Merge branch 'branch-24.08' into fea-pinned-vector-factory
vuule Jun 5, 2024
17b1ee0
reviews
vuule Jun 6, 2024
e3c344b
Merge branch 'fea-pinned-vector-factory' of https://github.com/vuule/…
vuule Jun 6, 2024
ea6408f
Merge branch 'branch-24.08' of https://github.com/rapidsai/cudf into …
vuule Jun 6, 2024
2dbb68f
available_device_memory
vuule Jun 6, 2024
cb9cc22
reviews
vuule Jun 6, 2024
cf67a14
expand anon namespace
vuule Jun 6, 2024
24c1549
host_uvector
vuule Jun 7, 2024
9c97833
Merge branch 'branch-24.08' of https://github.com/rapidsai/cudf into …
vuule Jun 7, 2024
075deca
style
vuule Jun 7, 2024
164fce2
docs; prefixes
vuule Jun 7, 2024
b566bab
type aliases in host_uvector
vuule Jun 7, 2024
21edb53
refactor host_ticket
vuule Jun 7, 2024
3814797
style
vuule Jun 7, 2024
168609d
Merge branch 'fea-pinned-vector-factory' into fea-smart-copy
vuule Jun 10, 2024
3ef149d
Merge branch 'branch-24.08' of https://github.com/rapidsai/cudf into …
vuule Jun 10, 2024
c933157
style
vuule Jun 10, 2024
6784e07
more style
vuule Jun 10, 2024
a49789c
Merge branch 'branch-24.08' into fea-smart-copy
vuule Jun 10, 2024
ba06fbd
Merge branch 'fea-pinned-vector-factory' of https://github.com/vuule/…
vuule Jun 10, 2024
f7999aa
Revert "type aliases in host_uvector"
vuule Jun 10, 2024
c9a82d0
Revert "docs; prefixes"
vuule Jun 10, 2024
930efef
Revert "style"
vuule Jun 10, 2024
0466949
Revert "host_uvector"
vuule Jun 10, 2024
f312219
make do without host_uvector
vuule Jun 11, 2024
7cfee0a
missed change
vuule Jun 11, 2024
fe4d668
style
vuule Jun 11, 2024
52f4a96
Merge branch 'fea-pinned-vector-factory' into fea-smart-copy
vuule Jun 11, 2024
4c2b7cf
Merge branch 'branch-24.08' of https://github.com/rapidsai/cudf into …
vuule Jun 12, 2024
e2c8613
Merge branch 'branch-24.08' of https://github.com/rapidsai/cudf into …
vuule Jun 12, 2024
5a71f77
rename
vuule Jun 12, 2024
9068642
refactor
vuule Jun 12, 2024
2ec4670
missing newlines
vuule Jun 17, 2024
a886eb4
rename files
vuule Jun 17, 2024
0dae691
Merge branch 'branch-24.08' of https://github.com/rapidsai/cudf into …
vuule Jun 17, 2024
dd1fba8
Merge branch 'branch-24.08' into fea-smart-copy
vuule Jun 18, 2024
59ed0dd
Merge branch 'branch-24.08' into fea-smart-copy
vuule Jun 18, 2024
c6ef5f1
Merge branch 'branch-24.08' of https://github.com/rapidsai/cudf into …
vuule Jun 18, 2024
dcaeaba
test commit, please ignore
vuule Jun 18, 2024
e75808c
Merge branch 'fea-smart-copy' of https://github.com/vuule/cudf into f…
vuule Jun 18, 2024
d50f145
Merge branch 'branch-24.08' of https://github.com/rapidsai/cudf into …
vuule Jun 24, 2024
0a2742f
fix typo
vuule Jun 24, 2024
68a03f1
typeless API
vuule Jun 24, 2024
b63b393
Merge branch 'branch-24.08' of https://github.com/rapidsai/cudf into …
vuule Jun 24, 2024
336c7e0
Merge branch 'branch-24.08' of https://github.com/rapidsai/cudf into …
vuule Jun 25, 2024
1741037
sorthidth
vuule Jun 25, 2024
fff667b
simplify
vuule Jun 26, 2024
da2c009
Merge branch 'fea-smart-copy' of https://github.com/vuule/cudf into f…
vuule Jun 26, 2024
84683d2
another day, another threshold
vuule Jun 26, 2024
1bbd574
add missing break
vuule Jun 26, 2024
692f775
Merge branch 'branch-24.08' into fea-smart-copy
vuule Jun 26, 2024
101288f
rename files to host
vuule Jun 26, 2024
ce58c46
lines
vuule Jun 27, 2024
d897984
Merge branch 'branch-24.08' into fea-smart-copy
vuule Jun 27, 2024
3739c47
get_host_memory_resource
vuule Jun 27, 2024
49d65b8
use if/else
vuule Jun 27, 2024
84a1797
Merge branch 'fea-smart-copy' of https://github.com/vuule/cudf into f…
vuule Jun 27, 2024
0b2aa13
Merge branch 'branch-24.08' into fea-smart-copy
vuule Jun 27, 2024
5fd5d65
Merge branch 'fea-smart-copy' of https://github.com/vuule/cudf into f…
vuule Jun 27, 2024
db45aa7
rename back :D
vuule Jun 27, 2024
5a072cf
working make_host_vector
vuule Jun 27, 2024
6fcfec4
Merge branch 'branch-24.08' of https://github.com/rapidsai/cudf into …
vuule Jun 27, 2024
dd93448
auto
vuule Jun 28, 2024
095413e
Merge branch 'branch-24.08' of https://github.com/rapidsai/cudf into …
vuule Jun 28, 2024
02e7bfb
derive host_vector
vuule Jul 3, 2024
ef4e1de
use host_vector pt2
vuule Jul 3, 2024
1dbafa5
include changes
vuule Jul 5, 2024
58900dd
orc
vuule Jul 5, 2024
395898a
copying
vuule Jul 5, 2024
be916f9
few more
vuule Jul 5, 2024
2225e3b
partial IO
vuule Jul 5, 2024
0446d34
parquet
vuule Jul 5, 2024
6a7ff73
rest of it
vuule Jul 6, 2024
38cdd56
Merge branch 'branch-24.08' of https://github.com/rapidsai/cudf into …
vuule Jul 6, 2024
a0a6caa
style
vuule Jul 8, 2024
523ee4c
Merge branch 'branch-24.08' of https://github.com/rapidsai/cudf into …
vuule Jul 8, 2024
7789e39
improve docs
vuule Jul 9, 2024
ddf625c
Merge branch 'branch-24.08' of https://github.com/rapidsai/cudf into …
vuule Jul 9, 2024
d55fb39
add missing overload
vuule Jul 10, 2024
309ae34
Merge branch 'branch-24.08' of https://github.com/rapidsai/cudf into …
vuule Jul 11, 2024
d8f0e58
typo fixes; clean up
vuule Jul 11, 2024
60cc991
Merge branch 'branch-24.08' into fea-make_host_vector-great-again-try2
vuule Jul 11, 2024
7a7db99
Merge branch 'branch-24.08' into fea-make_host_vector-great-again-try2
vuule Jul 15, 2024
ffd54f9
Merge branch 'branch-24.08' of https://github.com/rapidsai/cudf into …
vuule Jul 15, 2024
b94d26c
fix return type
vuule Jul 15, 2024
0dfaee4
remove noexcept on deallocates
vuule Jul 15, 2024
eaea60d
Merge branch 'branch-24.08' of https://github.com/rapidsai/cudf into …
vuule Jul 17, 2024
66da001
tests
vuule Jul 17, 2024
bbf5f29
avoid copy_n
vuule Jul 17, 2024
7f3e7f5
Merge branch 'branch-24.08' of https://github.com/rapidsai/cudf into …
vuule Jul 17, 2024
036ac99
Merge branch 'branch-24.08' of https://github.com/rapidsai/cudf into …
Jul 18, 2024
6e39c35
add is_device_accessible to span
vuule Jul 18, 2024
c262c30
pass host_span
vuule Jul 18, 2024
6cd16b5
address review
vuule Jul 18, 2024
9f51424
Merge branch 'branch-24.08' into fea-make_host_vector-great-again-try2
vuule Jul 18, 2024
044836a
reviews
vuule Jul 22, 2024
69e3895
Merge branch 'branch-24.08' of https://github.com/rapidsai/cudf into …
vuule Jul 22, 2024
32c7b72
review suggestion
vuule Jul 22, 2024
53fdbe4
Merge branch 'fea-make_host_vector-great-again-try2' of https://githu…
vuule Jul 22, 2024
cecb289
fix docs
vuule Jul 22, 2024
5d15a4d
revert to fix get_host_allocator
vuule Jul 22, 2024
f85759b
Merge branch 'branch-24.08' into fea-make_host_vector-great-again-try2
vuule Jul 22, 2024
de0bacc
Merge branch 'branch-24.08' of https://github.com/rapidsai/cudf into …
vuule Jul 22, 2024
0df153e
Merge branch 'fea-make_host_vector-great-again-try2' of https://githu…
vuule Jul 22, 2024
09b0ae2
Merge branch 'branch-24.08' into fea-make_host_vector-great-again-try2
vuule Jul 23, 2024
2 changes: 1 addition & 1 deletion cpp/CMakeLists.txt
@@ -671,9 +671,9 @@ add_library(
src/unary/null_ops.cu
src/utilities/cuda_memcpy.cu
src/utilities/default_stream.cpp
src/utilities/host_memory.cpp
src/utilities/linked_column.cpp
src/utilities/logger.cpp
src/utilities/pinned_memory.cpp
src/utilities/prefetch.cpp
src/utilities/stacktrace.cpp
src/utilities/stream_pool.cpp
2 changes: 1 addition & 1 deletion cpp/include/cudf/detail/gather.cuh
@@ -577,7 +577,7 @@ void gather_bitmask(table_view const& source,
}

// Make device array of target bitmask pointers
std::vector<bitmask_type*> target_masks(target.size());
auto target_masks = make_host_vector<bitmask_type*>(target.size(), stream);
std::transform(target.begin(), target.end(), target_masks.begin(), [](auto const& col) {
return col->mutable_view().null_mask();
});
4 changes: 3 additions & 1 deletion cpp/include/cudf/detail/null_mask.cuh
@@ -430,7 +430,9 @@ std::vector<size_type> segmented_count_bits(bitmask_type const* bitmask,
if (num_segments == 0) { return std::vector<size_type>{}; }

// Construct a contiguous host buffer of indices and copy to device.
auto const h_indices = std::vector<size_type>(indices_begin, indices_end);
auto h_indices = make_empty_host_vector<typename std::iterator_traits<IndexIterator>::value_type>(
std::distance(indices_begin, indices_end), stream);
std::copy(indices_begin, indices_end, std::back_inserter(h_indices));
auto const d_indices =
make_device_uvector_async(h_indices, stream, rmm::mr::get_current_device_resource());

51 changes: 51 additions & 0 deletions cpp/include/cudf/detail/utilities/host_memory.hpp
@@ -0,0 +1,51 @@
/*
* Copyright (c) 2024, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

#pragma once

#include <cudf/detail/utilities/host_vector.hpp>
#include <cudf/utilities/export.hpp>
#include <cudf/utilities/pinned_memory.hpp>

#include <rmm/resource_ref.hpp>

#include <cstddef>

namespace cudf::detail {
/**
* @brief Get the memory resource to be used for pageable memory allocations.
*
* @return Reference to the pageable memory resource
*/
CUDF_EXPORT rmm::host_async_resource_ref get_pageable_memory_resource();

/**
* @brief Get the allocator to be used for the host memory allocation.
*
* @param size The number of elements of type T to allocate
* @param stream The stream to use for the allocation
* @return The allocator to be used for the host memory allocation
*/
template <typename T>
rmm_host_allocator<T> get_host_allocator(std::size_t size, rmm::cuda_stream_view stream)
{
if (size * sizeof(T) <= get_allocate_host_as_pinned_threshold()) {
return {get_pinned_memory_resource(), stream};
}
return {get_pageable_memory_resource(), stream};
}

} // namespace cudf::detail
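The core of `get_host_allocator` above is a size-based dispatch: requests at or below the pinned threshold get a pinned-memory allocator, larger ones fall back to pageable memory. A minimal standalone sketch of that selection logic, with `host_memory_kind` and `choose_host_memory` as hypothetical names (the real function returns an `rmm_host_allocator` bound to an RMM resource, not an enum):

```cpp
#include <cstddef>

// Simplified stand-in for the selection in get_host_allocator: compare the
// allocation's byte size against the pinned threshold and pick the memory
// kind accordingly.
enum class host_memory_kind { PINNED, PAGEABLE };

template <typename T>
host_memory_kind choose_host_memory(std::size_t num_elements,
                                    std::size_t pinned_threshold_bytes)
{
  if (num_elements * sizeof(T) <= pinned_threshold_bytes) {
    return host_memory_kind::PINNED;
  }
  return host_memory_kind::PAGEABLE;
}
```

Note the comparison uses bytes (`num_elements * sizeof(T)`), so the same element count can land on different sides of the threshold for different element types.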
24 changes: 21 additions & 3 deletions cpp/include/cudf/detail/utilities/host_vector.hpp
@@ -61,6 +61,10 @@ class rmm_host_allocator<void> {
};
};

template <class DesiredProperty, class... Properties>
inline constexpr bool contains_property =
(cuda::std::is_same_v<DesiredProperty, Properties> || ... || false);

/*! \p rmm_host_allocator is a CUDA-specific host memory allocator
* that employs \c `rmm::host_async_resource_ref` for allocation.
*
@@ -100,8 +104,12 @@ class rmm_host_allocator {
/**
* @brief Construct from a `cudf::host_async_resource_ref`
*/
rmm_host_allocator(rmm::host_async_resource_ref _mr, rmm::cuda_stream_view _stream)
: mr(_mr), stream(_stream)
template <class... Properties>
rmm_host_allocator(cuda::mr::async_resource_ref<cuda::mr::host_accessible, Properties...> _mr,
rmm::cuda_stream_view _stream)
: mr(_mr),
stream(_stream),
_is_device_accessible{contains_property<cuda::mr::device_accessible, Properties...>}
{
}

@@ -173,15 +181,25 @@ class rmm_host_allocator {
*/
inline bool operator!=(rmm_host_allocator const& x) const { return !operator==(x); }

bool is_device_accessible() const { return _is_device_accessible; }

private:
rmm::host_async_resource_ref mr;
rmm::cuda_stream_view stream;
bool _is_device_accessible;
};

/**
* @brief A vector class with rmm host memory allocator
*/
template <typename T>
using host_vector = thrust::host_vector<T, rmm_host_allocator<T>>;
class host_vector : public thrust::host_vector<T, rmm_host_allocator<T>> {
public:
using base = thrust::host_vector<T, rmm_host_allocator<T>>;

host_vector(rmm_host_allocator<T> const& alloc) : base(alloc) {}

host_vector(size_t size, rmm_host_allocator<T> const& alloc) : base(size, alloc) {}
Comment on lines +200 to +202

Contributor: Can we have a constructor with default allocator?

Contributor Author: I don't think we should have one. The idea is to encourage the use of the factory functions because they transparently and consistently apply the pinned memory/kernel optimizations. I don't really want any code outside of the factory functions to call the constructors directly.
};

} // namespace cudf::detail
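The `contains_property` helper added in this file is a C++17 fold expression over a parameter pack. A standalone version, with `std::is_same_v` standing in for `cuda::std::is_same_v` and mock tag types standing in for `cuda::mr::host_accessible`/`cuda::mr::device_accessible`:

```cpp
#include <type_traits>

// Fold expression that checks whether DesiredProperty occurs anywhere in
// the Properties pack; the trailing "|| false" makes an empty pack fold
// to false instead of being ill-formed.
template <class DesiredProperty, class... Properties>
inline constexpr bool contains_property =
  (std::is_same_v<DesiredProperty, Properties> || ... || false);

// Mock tag types for illustration only.
struct host_accessible {};
struct device_accessible {};

static_assert(contains_property<device_accessible, host_accessible, device_accessible>);
static_assert(!contains_property<device_accessible, host_accessible>);
static_assert(!contains_property<device_accessible>);  // empty pack -> false
```

This is how the new `rmm_host_allocator` constructor detects at construction time whether the resource it was given is also device accessible (i.e. pinned).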
106 changes: 72 additions & 34 deletions cpp/include/cudf/detail/utilities/vector_factories.hpp
@@ -21,6 +21,8 @@
* @file vector_factories.hpp
*/

#include <cudf/detail/utilities/cuda_memcpy.hpp>
#include <cudf/detail/utilities/host_memory.hpp>
#include <cudf/detail/utilities/host_vector.hpp>
#include <cudf/utilities/default_stream.hpp>
#include <cudf/utilities/error.hpp>
@@ -32,8 +34,6 @@
#include <rmm/mr/device/device_memory_resource.hpp>
#include <rmm/resource_ref.hpp>

#include <thrust/host_vector.h>

#include <vector>

namespace cudf {
@@ -100,11 +100,12 @@ rmm::device_uvector<T> make_device_uvector_async(host_span<T const> source_data,
rmm::device_async_resource_ref mr)
{
rmm::device_uvector<T> ret(source_data.size(), stream, mr);
CUDF_CUDA_TRY(cudaMemcpyAsync(ret.data(),
source_data.data(),
source_data.size() * sizeof(T),
cudaMemcpyDefault,
stream.value()));
auto const is_pinned = source_data.is_device_accessible();
cuda_memcpy_async(ret.data(),
source_data.data(),
source_data.size() * sizeof(T),
is_pinned ? host_memory_kind::PINNED : host_memory_kind::PAGEABLE,
stream);
return ret;
}
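The change above replaces a raw `cudaMemcpyAsync` with `cuda_memcpy_async`, which is told whether the host side is pinned so it can pick the copy path (the PR's "kernel copy" for pinned memory). A mock of that flag-to-kind mapping, with `mock_host_span` and `classify` as hypothetical names for this sketch only:

```cpp
#include <cstddef>

// The source span reports whether its memory is device accessible (true
// for pinned allocations); that flag selects the host_memory_kind passed
// to the copy helper.
enum class host_memory_kind { PINNED, PAGEABLE };

struct mock_host_span {
  void const* data;
  std::size_t size_bytes;
  bool device_accessible;  // what host_span::is_device_accessible() reports
};

host_memory_kind classify(mock_host_span const& s)
{
  return s.device_accessible ? host_memory_kind::PINNED
                             : host_memory_kind::PAGEABLE;
}
```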

@@ -271,21 +272,11 @@ rmm::device_uvector<typename Container::value_type> make_device_uvector_sync(
return make_device_uvector_sync(device_span<typename Container::value_type const>{c}, stream, mr);
}

// Utility function template to allow copying to either a thrust::host_vector or std::vector
template <typename T, typename OutContainer>
OutContainer make_vector_async(device_span<T const> v, rmm::cuda_stream_view stream)
{
OutContainer result(v.size());
CUDF_CUDA_TRY(cudaMemcpyAsync(
result.data(), v.data(), v.size() * sizeof(T), cudaMemcpyDefault, stream.value()));
return result;
}

/**
* @brief Asynchronously construct a `std::vector` containing a copy of data from a
* `device_span`
*
* @note This function does not synchronize `stream`.
* @note This function does not synchronize `stream` after the copy.
*
* @tparam T The type of the data to copy
* @param source_data The device data to copy
@@ -295,14 +286,17 @@ OutContainer make_vector_async(device_span<T const> v, rmm::cuda_stream_view str
template <typename T>
std::vector<T> make_std_vector_async(device_span<T const> v, rmm::cuda_stream_view stream)
{
return make_vector_async<T, std::vector<T>>(v, stream);
std::vector<T> result(v.size());
CUDF_CUDA_TRY(cudaMemcpyAsync(
result.data(), v.data(), v.size() * sizeof(T), cudaMemcpyDefault, stream.value()));
return result;
}

/**
* @brief Asynchronously construct a `std::vector` containing a copy of data from a device
* container
*
* @note This function synchronizes `stream`.
* @note This function synchronizes `stream` after the copy.
*
* @tparam Container The type of the container to copy from
* @tparam T The type of the data to copy
@@ -324,7 +318,7 @@ std::vector<typename Container::value_type> make_std_vector_async(Container cons
* @brief Synchronously construct a `std::vector` containing a copy of data from a
* `device_span`
*
* @note This function does a synchronize on `stream`.
* @note This function does a synchronize on `stream` after the copy.
*
* @tparam T The type of the data to copy
* @param source_data The device data to copy
@@ -361,28 +355,71 @@ std::vector<typename Container::value_type> make_std_vector_sync(Container const
return make_std_vector_sync(device_span<typename Container::value_type const>{c}, stream);
}

/**
* @brief Construct a `cudf::detail::host_vector` of the given size.
*
* @note The returned vector may be using a pinned memory resource.
*
* @tparam T The type of the vector data
* @param size The number of elements in the created vector
* @param stream The stream on which to allocate memory
* @return A host_vector of the given size
*/
template <typename T>
host_vector<T> make_host_vector(size_t size, rmm::cuda_stream_view stream)
{
return host_vector<T>(size, get_host_allocator<T>(size, stream));
}

/**
* @brief Construct an empty `cudf::detail::host_vector` with the given capacity.
*
* @note The returned vector may be using a pinned memory resource.
*
* @tparam T The type of the vector data
* @param capacity Initial capacity of the vector
* @param stream The stream on which to allocate memory
* @return A host_vector with the given capacity
*/
template <typename T>
host_vector<T> make_empty_host_vector(size_t capacity, rmm::cuda_stream_view stream)
{
auto result = host_vector<T>(get_host_allocator<T>(capacity, stream));
result.reserve(capacity);
return result;
}
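Call sites of `make_empty_host_vector` (such as the null_mask.cuh hunk earlier) follow a reserve-then-append pattern: capacity is allocated up front, but no elements are default-constructed before the copy. A sketch of the idiom with `std::vector` standing in for `cudf::detail::host_vector`; `copy_with_reserve` is a hypothetical helper for illustration:

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Reserve capacity first, then append through back_inserter so size grows
// only as elements are copied in.
std::vector<int> copy_with_reserve(int const* begin, int const* end)
{
  std::vector<int> result;
  result.reserve(static_cast<std::size_t>(std::distance(begin, end)));
  std::copy(begin, end, std::back_inserter(result));
  return result;
}
```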

/**
* @brief Asynchronously construct a `thrust::host_vector` containing a copy of data from a
* `device_span`
*
* @note This function does not synchronize `stream`.
* @note This function does not synchronize `stream` after the copy. The returned vector may be
* using a pinned memory resource.
*
* @tparam T The type of the data to copy
* @param source_data The device data to copy
* @param stream The stream on which to perform the copy
* @return The data copied to the host
*/
template <typename T>
thrust::host_vector<T> make_host_vector_async(device_span<T const> v, rmm::cuda_stream_view stream)
host_vector<T> make_host_vector_async(device_span<T const> v, rmm::cuda_stream_view stream)
{
return make_vector_async<T, thrust::host_vector<T>>(v, stream);
auto result = make_host_vector<T>(v.size(), stream);
auto const is_pinned = result.get_allocator().is_device_accessible();
cuda_memcpy_async(result.data(),
v.data(),
v.size() * sizeof(T),
is_pinned ? host_memory_kind::PINNED : host_memory_kind::PAGEABLE,
stream);
return result;
}

/**
* @brief Asynchronously construct a `std::vector` containing a copy of data from a device
* container
*
* @note This function does not synchronize `stream`.
* @note This function does not synchronize `stream` after the copy. The returned vector may be
* using a pinned memory resource.
*
* @tparam Container The type of the container to copy from
* @tparam T The type of the data to copy
@@ -394,8 +431,8 @@ template <
typename Container,
std::enable_if_t<
std::is_convertible_v<Container, device_span<typename Container::value_type const>>>* = nullptr>
thrust::host_vector<typename Container::value_type> make_host_vector_async(
Container const& c, rmm::cuda_stream_view stream)
host_vector<typename Container::value_type> make_host_vector_async(Container const& c,
rmm::cuda_stream_view stream)
{
return make_host_vector_async(device_span<typename Container::value_type const>{c}, stream);
}
@@ -404,15 +441,16 @@ thrust::host_vector<typename Container::value_type> make_host_vector_async(
* @brief Synchronously construct a `thrust::host_vector` containing a copy of data from a
* `device_span`
*
* @note This function does a synchronize on `stream`.
* @note This function does a synchronize on `stream` after the copy. The returned vector may be
* using a pinned memory resource.
*
* @tparam T The type of the data to copy
* @param source_data The device data to copy
* @param stream The stream on which to perform the copy
* @return The data copied to the host
*/
template <typename T>
thrust::host_vector<T> make_host_vector_sync(device_span<T const> v, rmm::cuda_stream_view stream)
host_vector<T> make_host_vector_sync(device_span<T const> v, rmm::cuda_stream_view stream)
{
auto result = make_host_vector_async(v, stream);
stream.synchronize();
@@ -423,7 +461,7 @@ thrust::host_vector<T> make_host_vector_sync(device_span<T const> v, rmm::cuda_s
* @brief Synchronously construct a `thrust::host_vector` containing a copy of data from a device
* container
*
* @note This function synchronizes `stream`.
* @note This function synchronizes `stream` after the copy.
*
* @tparam Container The type of the container to copy from
* @tparam T The type of the data to copy
@@ -435,16 +473,16 @@ template <
typename Container,
std::enable_if_t<
std::is_convertible_v<Container, device_span<typename Container::value_type const>>>* = nullptr>
thrust::host_vector<typename Container::value_type> make_host_vector_sync(
Container const& c, rmm::cuda_stream_view stream)
host_vector<typename Container::value_type> make_host_vector_sync(Container const& c,
rmm::cuda_stream_view stream)
{
return make_host_vector_sync(device_span<typename Container::value_type const>{c}, stream);
}

/**
* @brief Asynchronously construct a pinned `cudf::detail::host_vector` of the given size
*
* @note This function may not synchronize `stream`.
* @note This function may not synchronize `stream` after the copy.
*
* @tparam T The type of the vector data
* @param size The number of elements in the created vector
@@ -460,7 +498,7 @@ host_vector<T> make_pinned_vector_async(size_t size, rmm::cuda_stream_view strea
/**
* @brief Synchronously construct a pinned `cudf::detail::host_vector` of the given size
*
* @note This function synchronizes `stream`.
* @note This function synchronizes `stream` after the copy.
*
* @tparam T The type of the vector data
* @param size The number of elements in the created vector
4 changes: 2 additions & 2 deletions cpp/include/cudf/io/text/detail/trie.hpp
@@ -223,11 +223,11 @@ struct trie {

match_length.emplace_back(0);

std::vector<trie_node> trie_nodes;
auto token_counts = std::unordered_map<cudf::size_type, int32_t>();
auto trie_nodes = cudf::detail::make_empty_host_vector<trie_node>(tokens.size(), stream);

for (uint32_t i = 0; i < tokens.size(); i++) {
trie_nodes.emplace_back(trie_node{tokens[i], match_length[i], transitions[i]});
trie_nodes.push_back(trie_node{tokens[i], match_length[i], transitions[i]});
token_counts[tokens[i]]++;
}
