
Commit

Merge Pull Request #12263 from vqd8a/Trilinos/ifpack2-riluk-kkspiluk-…
…kksptrsv-use-streams-2

Automatically Merged using Trilinos Pull Request AutoTester
PR Title: b'Ifpack2: add option to use stream interfaces of KK SPILUK and KK SPTRSV in RILUK and local triangular solver'
PR Author: vqd8a
trilinos-autotester authored Sep 19, 2023
2 parents 0b22aa0 + ea1b5c1 commit fc97faa
Showing 10 changed files with 893 additions and 272 deletions.
24 changes: 24 additions & 0 deletions packages/ifpack2/doc/UsersGuide/options.tex
@@ -478,6 +478,30 @@ \subsection{ILU($k$)}\label{s:ILU}
(not including contribution specified by {\tt "fact: absolute
threshold"}). Can be combined with {\tt "fact: absolute threshold"}.
The matrix remains unchanged.}
\ccc{fact: type}
{string}
{"Serial"}
{The RILUK factorization implementation type. Two types are currently
supported: "Serial" (serial execution on the host) and "KSPILUK" (Kokkos
Kernels' SpILUK on multi-core CPUs or GPUs).}
\ccc{trisolver: type}
{string}
{"Internal"}
{The triangular solver type. Three solver types are currently supported:
"Internal" (serial execution on the host), "HTS" (ShyLU's sparse
triangular solver, which uses OpenMP on the host), and "KSPTRSV" (Kokkos
Kernels' SpTRSV on multi-core CPUs or GPUs).}
\ccc{fact: kspiluk number-of-streams}
{int|global\_ordinal}
{0}
{Number of streams used by Kokkos Kernels' SpILUK and SpTRSV. When using
streams, the sub-domain on each MPI process is divided into diagonal blocks,
each of which is handled by SpILUK and SpTRSV on its own stream. Because
information in the off-diagonal blocks of the sub-domain is ignored,
iterative solvers are expected to take more iterations to converge. However,
since these streams can run concurrently, the total solve time can be lower.
When this option is not set (i.e., streams are not used), the entire
sub-domain is used instead.}
% All overlap-related code was removed by M. Hoemmen in
%
% commit 162f64572fbf93e2cac73e3034d76a3db918a494
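The stream option documented above splits each sub-domain into diagonal blocks and ignores the coupling between them. The following is a minimal, library-free sketch of that idea, not the Ifpack2 implementation: the names `Entry`, `blockRanges`, and `extractDiagonalBlocks` are hypothetical, and the even contiguous row split is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// One nonzero of a square sparse matrix in coordinate form (hypothetical type).
struct Entry {
  int row;
  int col;
  double val;
};

// Split rows [0, n) into num_blocks contiguous half-open ranges of nearly
// equal size. Assumption: an even contiguous split; the real partitioning
// used by Ifpack2 may differ.
std::vector<std::pair<int, int>> blockRanges(int n, int num_blocks) {
  assert(num_blocks > 0);
  std::vector<std::pair<int, int>> ranges;
  const int base = n / num_blocks;
  const int extra = n % num_blocks;
  int begin = 0;
  for (int b = 0; b < num_blocks; ++b) {
    const int size = base + (b < extra ? 1 : 0);
    ranges.emplace_back(begin, begin + size);
    begin += size;
  }
  return ranges;
}

// Keep only the entries whose row and column lie in the same block, using
// block-local indices. Entries that couple two different blocks are dropped,
// which is the information loss that can increase iteration counts.
std::vector<std::vector<Entry>> extractDiagonalBlocks(
    const std::vector<Entry>& A,
    const std::vector<std::pair<int, int>>& ranges) {
  std::vector<std::vector<Entry>> blocks(ranges.size());
  for (const Entry& e : A) {
    for (std::size_t b = 0; b < ranges.size(); ++b) {
      const int lo = ranges[b].first;
      const int hi = ranges[b].second;
      if (e.row >= lo && e.row < hi) {
        if (e.col >= lo && e.col < hi) {
          blocks[b].push_back({e.row - lo, e.col - lo, e.val});
        }
        break;  // each row belongs to exactly one block
      }
    }
  }
  return blocks;
}
```

In this sketch each block is independent of the others, so each could be factored and solved on a separate stream. The dropped off-diagonal entries are why the preconditioner is weaker than one built from the whole sub-domain.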
16 changes: 16 additions & 0 deletions packages/ifpack2/src/Ifpack2_LocalSparseTriangularSolver_decl.hpp
@@ -340,6 +340,16 @@ class LocalSparseTriangularSolver :
/// then compute(), before you may call apply().
virtual void setMatrix (const Teuchos::RCP<const row_matrix_type>& A);

/// \brief Set this triangular solver's stream information.
///
void setStreamInfo (const bool& isKokkosKernelsStream, const int& num_streams, const std::vector<HandleExecSpace>& exec_space_instances);

/// \brief Set this preconditioner's matrices (used by stream interface of triangular solve).
///
/// After calling this method, you must call first initialize(),
/// then compute(), before you may call apply().
void setMatrices (const std::vector< Teuchos::RCP<crs_matrix_type> >& A_crs_v);

//@}

private:
@@ -349,6 +359,7 @@ class LocalSparseTriangularSolver :
Teuchos::RCP<Teuchos::FancyOStream> out_;
//! The original input matrix, as a Tpetra::CrsMatrix.
Teuchos::RCP<const crs_matrix_type> A_crs_;
std::vector< Teuchos::RCP<crs_matrix_type> > A_crs_v_;

typedef Tpetra::MultiVector<scalar_type, local_ordinal_type, global_ordinal_type, node_type> MV;
mutable Teuchos::RCP<MV> X_colMap_;
@@ -383,6 +394,11 @@ class LocalSparseTriangularSolver :
//! Optional KokkosKernels implementation.
bool isKokkosKernelsSptrsv_;
Teuchos::RCP<k_handle> kh_;
std::vector< Teuchos::RCP<k_handle> > kh_v_;
int num_streams_;
bool isKokkosKernelsStream_;
bool kh_v_nonnull_;
std::vector<HandleExecSpace> exec_space_instances_;

/// \brief "L" if the matrix is locally lower triangular, "U" if the
/// matrix is locally upper triangular, or "N" if unknown or