Commit
Add a faster path for the block Jacobi case inside BTDS (GPU only, BlockCrs only).

The new path fuses the residual computation and the solve into a single kernel and does not convert vectors to the SIMD-packed format. It also inverts the diagonal blocks entirely in shared memory to speed up the numeric phase.

Signed-off-by: Brian Kelley <[email protected]>
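For readers unfamiliar with the fused formulation, the following is a minimal serial sketch of what "residual and solve in one pass" could look like for a block Jacobi sweep. It is not the code from this commit: the actual path runs on the GPU, and the `BlockCrs` struct, the function `fusedBlockJacobiSweep`, and the update `y_i = x_i + Dinv_i * (b_i - sum_j A_ij x_j)` are assumptions made for illustration only. The diagonal blocks are assumed to be inverted already and stored row-major in `dinv`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, minimal BlockCrs-like storage (NOT the real Tpetra type).
struct BlockCrs {
  int blockSize;               // b: every stored block is b x b, row-major
  std::vector<int> rowPtr;     // size numBlockRows + 1
  std::vector<int> colInd;     // block column index of each stored block
  std::vector<double> values;  // b*b entries per stored block
};

// One fused block Jacobi sweep (illustrative assumption, serial CPU code).
// For each block row i, the residual r_i = rhs_i - sum_j A_ij x_j is formed
// in a small local buffer and immediately multiplied by the pre-inverted
// diagonal block, so the full residual vector is never written out.
std::vector<double> fusedBlockJacobiSweep(const BlockCrs& A,
                                          const std::vector<double>& dinv,
                                          const std::vector<double>& rhs,
                                          const std::vector<double>& x) {
  const int b = A.blockSize;
  const int numRows = static_cast<int>(A.rowPtr.size()) - 1;
  std::vector<double> y(x.size());
  std::vector<double> r(b);  // per-row residual, reused for every block row

  for (int i = 0; i < numRows; ++i) {
    // Residual step: r = rhs_i - sum_j A_ij * x_j
    for (int k = 0; k < b; ++k) r[k] = rhs[i * b + k];
    for (int p = A.rowPtr[i]; p < A.rowPtr[i + 1]; ++p) {
      const int j = A.colInd[p];
      const double* blk = &A.values[static_cast<std::size_t>(p) * b * b];
      for (int k = 0; k < b; ++k)
        for (int c = 0; c < b; ++c)
          r[k] -= blk[k * b + c] * x[j * b + c];
    }
    // Solve step: y_i = x_i + Dinv_i * r
    const double* di = &dinv[static_cast<std::size_t>(i) * b * b];
    for (int k = 0; k < b; ++k) {
      double s = x[i * b + k];
      for (int c = 0; c < b; ++c) s += di[k * b + c] * r[c];
      y[i * b + k] = s;
    }
  }
  return y;
}
```

In a GPU version, the local buffer `r` would presumably live in registers or shared memory and each block row would map to a team of threads, but that mapping is a guess and is not taken from the commit.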