Fused block Jacobi
More performant path for the block Jacobi case inside BTDS
(GPU only, BlockCrs only). Fuses the residual and solve
into one kernel and doesn't convert vectors to SIMD-packed
format. Also inverts the diagonal blocks fully in shared memory
to speed up the numeric phase.

Signed-off-by: Brian Kelley <[email protected]>
brian-kelley committed Feb 26, 2025
1 parent 8eca3f9 commit 1d16296
Showing 4 changed files with 1,173 additions and 104 deletions.
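
The fused residual-and-solve step named in the commit message can be illustrated with a minimal serial sketch. This is not the kernel from the commit: it is plain CPU C++ under assumed conventions, and the `BlockCrs` struct, the `fused_block_jacobi_sweep` function, the row-major block layout, and the undamped update `y_i = x_i + Dinv_i * (b_i - sum_j A_ij * x_j)` are all hypothetical choices made for illustration. The point it shows is that the residual for a block row is consumed by the diagonal-block solve immediately, so no intermediate residual vector is written out, and the vectors stay in their ordinary (non-packed) layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal block-CRS matrix (illustration only): square blocks
// of size bs, each stored row-major, one block per entry of colind.
struct BlockCrs {
  int bs;                      // block size
  std::vector<int> rowptr;     // size = number of block rows + 1
  std::vector<int> colind;     // block column index of each stored block
  std::vector<double> values;  // bs*bs entries per stored block
};

// One fused block Jacobi sweep (assumed undamped form):
//   y_i = x_i + Dinv_i * (b_i - sum_j A_ij * x_j)
// dinv holds the pre-inverted diagonal blocks, bs*bs entries per block row.
void fused_block_jacobi_sweep(const BlockCrs& A,
                              const std::vector<double>& dinv,
                              const std::vector<double>& b,
                              const std::vector<double>& x,
                              std::vector<double>& y) {
  const int bs = A.bs;
  const int nrows = static_cast<int>(A.rowptr.size()) - 1;
  assert(x.size() == b.size() && y.size() == b.size());
  assert(dinv.size() == static_cast<std::size_t>(nrows) * bs * bs);

  std::vector<double> r(bs);  // per-row residual, never stored globally
  for (int i = 0; i < nrows; ++i) {
    // Residual part: r = b_i - sum_j A_ij * x_j over the block row.
    for (int k = 0; k < bs; ++k) r[k] = b[static_cast<std::size_t>(i) * bs + k];
    for (int p = A.rowptr[i]; p < A.rowptr[i + 1]; ++p) {
      const double* blk = &A.values[static_cast<std::size_t>(p) * bs * bs];
      const double* xj = &x[static_cast<std::size_t>(A.colind[p]) * bs];
      for (int k = 0; k < bs; ++k)
        for (int l = 0; l < bs; ++l) r[k] -= blk[k * bs + l] * xj[l];
    }
    // Solve part: apply the inverted diagonal block right away.
    const double* di = &dinv[static_cast<std::size_t>(i) * bs * bs];
    for (int k = 0; k < bs; ++k) {
      double s = 0.0;
      for (int l = 0; l < bs; ++l) s += di[k * bs + l] * r[l];
      y[static_cast<std::size_t>(i) * bs + k] =
          x[static_cast<std::size_t>(i) * bs + k] + s;
    }
  }
}
```

On a GPU, the loop over `i` would presumably map to one team or thread block per block row, with `r` held in registers or shared memory; that mapping is an assumption here and is not taken from the changed files.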