CUDA
A fine-grained CUDA implementation of the NSGA-II algorithm.
A project demonstrating how to use the cuPCL libraries.
An efficient C++17 GPU numerical computing library with Python-like syntax
CudaPAD is a PTX/SASS viewer for NVIDIA CUDA kernels that provides an on-the-fly view of the assembly.
CUDA Python: Performance meets Productivity
A structure-from-motion implementation in C++, accelerated using CUDA
🍟 Massively parallel DBSCAN algorithm implemented in CUDA along with a KD-Tree for searching neighbors.
cuCIM - RAPIDS GPU-accelerated image processing library
A CUDA implementation of Bundle Adjustment
Nvdiffrast - Modular Primitives for High-Performance Differentiable Rendering
[ARCHIVED] The C++ Standard Library for your entire system. See https://github.com/NVIDIA/cccl
A high-performance CUDA implementation of scan matching via the Iterative Closest Point algorithm
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
A CUDA implementation of SIFT for NVIDIA GPUs (1.2 ms on a GTX 1060)
A portable high-level API with a CUDA or OpenCL back-end
Thin, unified, C++-flavored wrappers for the CUDA APIs
Bolt is a C++ template library optimized for GPUs. Bolt provides high-performance library implementations for common algorithms such as scan, reduce, transform, and sort.
A single-header C++ library for simplifying the use of CUDA Runtime Compilation (NVRTC).
Compute the exact Euclidean Distance Transform and Voronoi Diagram for 2D and 3D binary images using the GPU.