MueLu: Add CuSparse support to matvec driver #4672

Merged

Conversation

jjellio
Contributor

@jjellio jjellio commented Mar 20, 2019

@trilinos/muelu

Description

Provides CUSPARSE (TPL) access in the MatvecDriver

Also corrects a bug in MueLu's ETI helper, which surely needs to be put through the PR tester.
The MueLu helper did not include the generated MueLu ExplicitInstantiation header, which caused the logic that creates --node=cuda|openmp|... to fail.

Motivation and Context

Allow side by side comparisons against CUSPARSE

How Has This Been Tested?

Tested on Waterman

Checklist

  • My commit messages mention the appropriate GitHub issue numbers.
  • My code follows the code style of the affected package(s).
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the code contribution guidelines for this project.
  • I have added tests to cover my changes.
  • All new and existing tests passed.
  • No new compiler warnings were introduced.
  • These changes break backwards compatibility.

@cgcgcg @csiefer2

jjellio added 5 commits March 19, 2019 23:06
The ExplicitInstantiation header was not being included, which
made all --node commands fail. Only the default could work,
because it didn't rely on MueLu's ETI macros.
This provides HAVE_MUELU_CUSPARSE if the TPL is enabled.
Provide output for error norms and commandline flag to control
@jjellio jjellio requested review from cgcgcg and csiefer2 March 20, 2019 05:12
@jjellio jjellio requested a review from a team as a code owner March 20, 2019 05:12
@jjellio jjellio self-assigned this Mar 20, 2019
@jjellio jjellio added the AT: AUTOMERGE Causes the PR autotester to automatically merge the PR branch once approvals are completed label Mar 20, 2019
@jjellio jjellio removed the request for review from a team March 20, 2019 05:13
@trilinos-autotester
Contributor

Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request.

@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING (click to expand)

Build Information

Test Name: Trilinos_pullrequest_gcc_4.8.4

  • Build Num: 2888
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
COMPILER_MODULE sems-gcc/4.8.4
JENKINS_BUILD_TYPE Release
JENKINS_COMM_TYPE MPI
JENKINS_DO_COMPLEX OFF
JENKINS_JOB_TYPE Experimental
MPI_MODULE sems-openmpi/1.8.7
PULLREQUESTNUM 4672
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH jje/muelu-matvec-cusparse
TRILINOS_SOURCE_REPO https://github.com/jjellio/Trilinos
TRILINOS_SOURCE_SHA b9ed5eb
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA e96cb74

Build Information

Test Name: Trilinos_pullrequest_intel_17.0.1

  • Build Num: 2715
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PULLREQUESTNUM 4672
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH jje/muelu-matvec-cusparse
TRILINOS_SOURCE_REPO https://github.com/jjellio/Trilinos
TRILINOS_SOURCE_SHA b9ed5eb
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA e96cb74

Build Information

Test Name: Trilinos_pullrequest_gcc_4.9.3_SERIAL

  • Build Num: 1167
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PULLREQUESTNUM 4672
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH jje/muelu-matvec-cusparse
TRILINOS_SOURCE_REPO https://github.com/jjellio/Trilinos
TRILINOS_SOURCE_SHA b9ed5eb
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA e96cb74

Build Information

Test Name: Trilinos_pullrequest_gcc_7.2.0

  • Build Num: 855
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PULLREQUESTNUM 4672
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH jje/muelu-matvec-cusparse
TRILINOS_SOURCE_REPO https://github.com/jjellio/Trilinos
TRILINOS_SOURCE_SHA b9ed5eb
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA e96cb74

Build Information

Test Name: Trilinos_pullrequest_cuda_9.2

  • Build Num: 590
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
JENKINS_JOB_TYPE Experimental
PULLREQUESTNUM 4672
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH jje/muelu-matvec-cusparse
TRILINOS_SOURCE_REPO https://github.com/jjellio/Trilinos
TRILINOS_SOURCE_SHA b9ed5eb
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA e96cb74

Using Repos:

Repo: TRILINOS (jjellio/Trilinos)
  • Branch: jje/muelu-matvec-cusparse
  • SHA: b9ed5eb
  • Mode: TEST_REPO

Pull Request Author: jjellio

@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED (click to expand)

Build Information

Test Name: Trilinos_pullrequest_gcc_4.8.4

  • Build Num: 2888
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
COMPILER_MODULE sems-gcc/4.8.4
JENKINS_BUILD_TYPE Release
JENKINS_COMM_TYPE MPI
JENKINS_DO_COMPLEX OFF
JENKINS_JOB_TYPE Experimental
MPI_MODULE sems-openmpi/1.8.7
PULLREQUESTNUM 4672
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH jje/muelu-matvec-cusparse
TRILINOS_SOURCE_REPO https://github.com/jjellio/Trilinos
TRILINOS_SOURCE_SHA b9ed5eb
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA e96cb74

Build Information

Test Name: Trilinos_pullrequest_intel_17.0.1

  • Build Num: 2715
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PULLREQUESTNUM 4672
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH jje/muelu-matvec-cusparse
TRILINOS_SOURCE_REPO https://github.com/jjellio/Trilinos
TRILINOS_SOURCE_SHA b9ed5eb
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA e96cb74

Build Information

Test Name: Trilinos_pullrequest_gcc_4.9.3_SERIAL

  • Build Num: 1167
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PULLREQUESTNUM 4672
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH jje/muelu-matvec-cusparse
TRILINOS_SOURCE_REPO https://github.com/jjellio/Trilinos
TRILINOS_SOURCE_SHA b9ed5eb
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA e96cb74

Build Information

Test Name: Trilinos_pullrequest_gcc_7.2.0

  • Build Num: 855
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PULLREQUESTNUM 4672
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH jje/muelu-matvec-cusparse
TRILINOS_SOURCE_REPO https://github.com/jjellio/Trilinos
TRILINOS_SOURCE_SHA b9ed5eb
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA e96cb74

Build Information

Test Name: Trilinos_pullrequest_cuda_9.2

  • Build Num: 590
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
JENKINS_JOB_TYPE Experimental
PULLREQUESTNUM 4672
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH jje/muelu-matvec-cusparse
TRILINOS_SOURCE_REPO https://github.com/jjellio/Trilinos
TRILINOS_SOURCE_SHA b9ed5eb
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA e96cb74


CDash Test Results for PR# 4672.

@trilinos-autotester
Contributor

Status Flag 'Pre-Merge Inspection' - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
NO REVIEWS HAVE BEEN PERFORMED ON THIS PULL REQUEST!

@trilinos-autotester
Contributor

All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur...

// const double *beta,
// double *y)
cusparseStatus_t rc;
if(Kokkos::Impl::is_same<Scalar,double>::value) {
Contributor

Trilinos is only tested with C++11 now, so please use std::is_same -- thanks!
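
For illustration only (this is not code from the PR), the requested change amounts to swapping the Kokkos-internal trait for the standard one. The helper name `is_double_scalar` below is hypothetical:

```cpp
#include <type_traits>

// Hypothetical helper (not from the PR): reports whether Scalar is double
// using the standard C++11 trait instead of Kokkos::Impl::is_same.
template <typename Scalar>
constexpr bool is_double_scalar() {
  return std::is_same<Scalar, double>::value;
}

// The trait is evaluated at compile time, so it can be checked statically.
static_assert(is_double_scalar<double>(), "double must be detected");
static_assert(!is_double_scalar<float>(), "float is not double");
```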

Contributor Author

This kind of stuff actually goes away. I made this into a class, which allows things like specializations (as seen below). Doing that removes the need to do this altogether.


auto X_lcl = X.template getLocalView<device_type> ();
auto Y_lcl = Y.template getLocalView<device_type> ();
x = (Scalar*) X_lcl.data();
Contributor

Please use reinterpret_cast instead of C-style casts -- thanks!
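
As a rough sketch of why the named cast is preferable (the function `as_interleaved` is hypothetical and does not appear in the PR): a C-style cast will silently drop `const`, whereas `reinterpret_cast` refuses to, so each overload has to state its constness explicitly. The example reinterprets `std::complex<double>` storage as interleaved doubles, an access pattern the C++11 standard explicitly permits.

```cpp
#include <cassert>
#include <complex>

// Hypothetical helper: view complex storage as (real, imag) pairs of doubles.
inline double* as_interleaved(std::complex<double>* z) {
  assert(z != nullptr);
  return reinterpret_cast<double*>(z);
}

// Const overload. Writing reinterpret_cast<double*>(z) here would not
// compile, while the C-style (double*) z would quietly discard const.
inline const double* as_interleaved(const std::complex<double>* z) {
  assert(z != nullptr);
  return reinterpret_cast<const double*>(z);
}
```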

m = (int) A.getNodeNumRows();
n = (int) A.getNodeNumCols();
nnz = (int) Acolind_cusparse.extent(0);
vals = (Scalar*) Avals.data();
Contributor

Please use reinterpret_cast instead of C-style casts -- thanks!

Arowptr_cusparse = cusparse_int_type("Arowptr", Arowptr.extent(0));
Acolind_cusparse = cusparse_int_type("Acolind", Acolind.extent(0));
// copy the ordinals into the local view (type conversion)
copy_view(Arowptr,Arowptr_cusparse);
Contributor

  1. Please use the same convention as Kokkos::deep_copy for which argument is the target of the deep copy.
  2. See Tpetra_Details_copyOffsets.hpp for an implementation of this functionality that checks for overflow. This code assumes Tpetra anyway, so you can just include that header file and use the function.
  3. You're copying, so you could allocate without fill.
  4. Acolind is always a View of int anyway, so consider just using it directly.
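
Points 1 and 2 can be sketched in plain C++ as follows. This is not the Tpetra implementation: `copy_offsets` is a hypothetical stand-in that uses `std::vector` instead of Kokkos views, takes the destination first as `Kokkos::deep_copy(dst, src)` does, and rejects offsets that would overflow `int`.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not the Tpetra function: copies offsets into an
// int buffer. Argument order follows Kokkos::deep_copy(dst, src).
inline void copy_offsets(std::vector<int>& dst,
                         const std::vector<std::size_t>& src) {
  // Like deep_copy, require the caller to have sized the destination.
  assert(dst.size() == src.size());
  const std::size_t int_max =
      static_cast<std::size_t>(std::numeric_limits<int>::max());
  for (std::size_t i = 0; i < src.size(); ++i) {
    // Refuse to narrow a value that does not fit in int.
    if (src[i] > int_max) {
      throw std::overflow_error("offset does not fit in int");
    }
    dst[i] = static_cast<int>(src[i]);
  }
}
```

Note that a `std::vector<int>` sized up front is value-initialized, so this sketch does not model point 3 (allocating without fill).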

vector_type& Y)
{}

~CuSparse_SpmV_Pack() {};
Contributor

Please write ~CuSparse_SpmV_Pack() = default; instead.
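
One concrete difference between the two spellings, shown with two hypothetical structs that are not part of the PR: an empty user-provided destructor makes the type non-trivially destructible, while a destructor defaulted on its first declaration keeps it trivial.

```cpp
#include <type_traits>

// Hypothetical types for illustration only.
struct UserProvidedDtor {
  ~UserProvidedDtor() {}  // user-provided, even though the body is empty
};

struct DefaultedDtor {
  ~DefaultedDtor() = default;  // compiler-generated
};

static_assert(!std::is_trivially_destructible<UserProvidedDtor>::value,
              "an empty user-provided destructor is not trivial");
static_assert(std::is_trivially_destructible<DefaultedDtor>::value,
              "a defaulted destructor stays trivial");
```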

cublasStatus_t cublasStatus;
cublasStatus = cublasCreate(&cublasHandle);

//checkCudaErrors(cublasStatus);
Contributor

You could just use the usual CUDA macro here that checks the return value of these functions for you.
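
The comment does not name the macro, so the following is only a generic sketch of the pattern. `CHECK_STATUS`, `fake_status_t`, and `fake_create` are all hypothetical; the fake status type stands in for `cublasStatus_t` so the example compiles without CUDA.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>

// Hypothetical stand-in for a library status type such as cublasStatus_t.
enum fake_status_t { FAKE_STATUS_SUCCESS = 0, FAKE_STATUS_NOT_INITIALIZED = 1 };

// Hypothetical checking macro: evaluates the call once and throws with the
// call text, status code, and source location if it did not succeed.
#define CHECK_STATUS(call)                                   \
  do {                                                       \
    const fake_status_t status_ = (call);                    \
    if (status_ != FAKE_STATUS_SUCCESS) {                    \
      std::ostringstream msg_;                               \
      msg_ << #call << " failed with status " << status_     \
           << " at " << __FILE__ << ":" << __LINE__;         \
      throw std::runtime_error(msg_.str());                  \
    }                                                        \
  } while (0)

// Hypothetical function standing in for a handle-creation call.
inline fake_status_t fake_create(bool ok) {
  return ok ? FAKE_STATUS_SUCCESS : FAKE_STATUS_NOT_INITIALIZED;
}
```

With a wrapper like this, a call site reads `CHECK_STATUS(fake_create(true));` and the returned status can no longer be ignored by accident.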

cusparseStatus_t spmv(const Scalar alpha, const Scalar beta) { return CUSPARSE_STATUS_SUCCESS; }
};

template<typename LocalOrdinal, typename GlobalOrdinal>
Contributor

If you're going to use Scalar instead of double throughout this class, why not just template the class on Scalar and do a partial specialization for Scalar=double?
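
A minimal sketch of that suggestion, assuming a simplified class; `SpmvPack` and its member `uses_vendor_kernel` are hypothetical names, not the PR's `CuSparse_SpmV_Pack` interface. The primary template is a no-op fallback, and the partial specialization fixes `Scalar` to `double` while leaving the ordinal type generic.

```cpp
// Primary template: fallback for scalar types without a vendor kernel.
template <typename Scalar, typename LocalOrdinal>
struct SpmvPack {
  static constexpr bool uses_vendor_kernel = false;
};

// Partial specialization: Scalar is fixed to double, LocalOrdinal stays
// a template parameter. The double-only library calls would live here.
template <typename LocalOrdinal>
struct SpmvPack<double, LocalOrdinal> {
  static constexpr bool uses_vendor_kernel = true;
};

static_assert(SpmvPack<double, int>::uses_vendor_kernel,
              "double selects the specialization");
static_assert(!SpmvPack<float, int>::uses_vendor_kernel,
              "other scalars fall back to the primary template");
```

The selection happens at compile time, so no runtime `is_same` branch is needed inside the class.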

@jjellio
Contributor Author

jjellio commented Mar 20, 2019

I wonder if I broke pre-merge inspection when I accidentally clicked 'project' rather than 'label' and put this into the MueLu 'project'. I realized the folly of my ways and deleted that. Doing this marked 'MueLu' as a reviewer, which wasn't my intent. #4679

@csiefer2
Member

@jjellio Not your fault. As per @jwillenbring, the pre-merge inspection is done via the same queueing system as the PR tester. Basically, we have to wait until our "lane" clears. If this hasn't been inspected by tomorrow AM, @jwillenbring will take a look and see what broke in the inspection.

@trilinos-autotester
Contributor

Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ csiefer2 ]!

@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - Pull Request will be Automerged

@trilinos-autotester trilinos-autotester merged commit b3351b0 into trilinos:develop Mar 20, 2019
@trilinos-autotester
Contributor

Merge on Pull Request# 4672: IS A SUCCESS - Pull Request successfully merged

@trilinos-autotester trilinos-autotester removed the AT: AUTOMERGE Causes the PR autotester to automatically merge the PR branch once approvals are completed label Mar 20, 2019
@csiefer2
Member

@jwillenbring All good, thanks!
