This repository has been archived by the owner on Mar 21, 2024. It is now read-only.
CUB 1.4.0
Summary
CUB 1.4.0 adds cub::DeviceSpmv and cub::DeviceRunLengthEncode::NonTrivialRuns, improves cub::DeviceHistogram, and introduces support for SM5x (Maxwell) GPUs.
New Features:
- cub::DeviceSpmv methods for multiplying sparse matrices by dense vectors, load-balanced using a merge-based parallel decomposition.
- cub::DeviceRadixSort sorting entry-points that always return the sorted output into the specified buffer, as opposed to the cub::DoubleBuffer in which it could end up in either buffer.
- cub::DeviceRunLengthEncode::NonTrivialRuns for finding the starting offsets and lengths of all non-trivial runs (i.e., length > 1) of keys in a given sequence. Useful for top-down partitioning algorithms like MSD sorting of very-large keys.
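To make the meaning of "non-trivial runs" concrete, here is a minimal sequential reference in host-side C++. It is only a sketch of the semantics described above, not CUB's device implementation or its API: the `NonTrivialRunsResult` struct and the `non_trivial_runs_reference` function are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type for this sketch (not a CUB type).
struct NonTrivialRunsResult {
    std::vector<int> offsets;  // starting index of each run with length > 1
    std::vector<int> lengths;  // length of each such run
};

// Sequential reference: walk the keys once and record every maximal run of
// equal consecutive keys whose length exceeds 1. Runs of length 1 are skipped.
template <typename Key>
NonTrivialRunsResult non_trivial_runs_reference(const std::vector<Key>& keys) {
    NonTrivialRunsResult result;
    std::size_t i = 0;
    while (i < keys.size()) {
        // Advance j to one past the end of the run that starts at i.
        std::size_t j = i + 1;
        while (j < keys.size() && keys[j] == keys[i]) {
            ++j;
        }
        const std::size_t run_length = j - i;
        assert(run_length >= 1);  // every run contains at least its first key
        if (run_length > 1) {
            result.offsets.push_back(static_cast<int>(i));
            result.lengths.push_back(static_cast<int>(run_length));
        }
        i = j;  // continue with the next run
    }
    return result;
}
```

For the keys `{0, 2, 2, 9, 5, 5, 5, 8}`, this reference reports two runs: one starting at offset 1 with length 2, and one starting at offset 4 with length 3. In an MSD-style sort, each such run would be a partition that still needs refinement on the next key digit, while the length-1 runs are already in their final position.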
Other Enhancements
- Support and performance tuning for SM5x (Maxwell) GPUs.
- Updated cub::DeviceHistogram implementation that provides the same "histogram-even" and "histogram-range" functionality as IPP/NPP. Provides extremely fast and, perhaps more importantly, very uniform performance response across diverse real-world datasets, including pathological (homogeneous) sample distributions.
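The difference between the "histogram-even" and "histogram-range" modes can be sketched with a sequential host-side reference in C++. This is an illustration only, not CUB's device implementation: the function names are hypothetical, and the sketch assumes that `num_levels` boundaries define `num_levels - 1` half-open bins `[level[i], level[i+1])` and that samples outside the overall range are ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// "Histogram-even" (sketch): bins have equal width between lower_level and
// upper_level. Assumes num_levels boundaries, hence num_levels - 1 bins.
std::vector<int> histogram_even_reference(const std::vector<float>& samples,
                                          int num_levels,
                                          float lower_level,
                                          float upper_level) {
    assert(num_levels >= 2 && upper_level > lower_level);
    const int num_bins = num_levels - 1;
    const float bin_width = (upper_level - lower_level) / num_bins;
    std::vector<int> histogram(num_bins, 0);
    for (float s : samples) {
        if (s < lower_level || s >= upper_level) continue;  // out of range
        int bin = static_cast<int>((s - lower_level) / bin_width);
        bin = std::min(bin, num_bins - 1);  // guard against rounding at the top edge
        ++histogram[bin];
    }
    return histogram;
}

// "Histogram-range" (sketch): bins are delimited by an explicit, sorted list
// of boundaries, so they may have different widths.
std::vector<int> histogram_range_reference(const std::vector<float>& samples,
                                           const std::vector<float>& levels) {
    assert(levels.size() >= 2 && std::is_sorted(levels.begin(), levels.end()));
    const int num_bins = static_cast<int>(levels.size()) - 1;
    std::vector<int> histogram(num_bins, 0);
    for (float s : samples) {
        // Index of the first boundary strictly greater than s, minus one,
        // is the bin whose half-open interval contains s.
        const int bin = static_cast<int>(
            std::upper_bound(levels.begin(), levels.end(), s) - levels.begin()) - 1;
        if (bin >= 0 && bin < num_bins) ++histogram[bin];
    }
    return histogram;
}
```

With the samples `{2.2, 6.1, 7.1, 2.9, 3.5, 0.3, 2.9, 2.1, 6.1, 999.5}`, the even mode with 7 levels over `[0, 12)` and the range mode with boundaries `{0, 2, 4, 6, 8, 12, 16}` both yield the counts `{1, 5, 0, 3, 0, 0}`, with `999.5` falling outside every bin. A homogeneous input, where every sample lands in the same bin, is the pathological case mentioned above: a naive parallel implementation would serialize on a single counter there.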