Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm
linear-algebra mpi cuda scalapack matrix-multiplication gpu-acceleration rocm matmul communication-optimal pdgemm
Updated Apr 2, 2025 - C++