Stretching GPU performance for GEMMs and tensor contractions.
python machine-learning amd gpu assembly opencl dnn matrix-multiplication neural-networks gpu-acceleration blas hip gpu-computing tensors tensor-contraction gemm radeon auto-tuning
Updated Feb 2, 2025 - Python