Amesos2: Fix single-proc case for non-contiguous GIDs #4454

Merged
merged 2 commits into from
Feb 28, 2019

ndellingwood
Contributor

Potential fix for issue #3647

@ndellingwood ndellingwood added the AT: WIP Causes the PR autotester to not test the PR. (Remove to allow testing to occur.) label Feb 21, 2019
@ndellingwood
Contributor Author

@vbrunini this PR fixes the single-proc case with KLU2 that I used while working on the issue; it is labeled WIP until I clean up the test. If you'd like to try it before merge, please pull this PR. Your code will need to set a parameter in the parameter list to tell Amesos2 that the row ids are "gapped" (non-contiguous). Here is sample syntax:

    std::string solverName = "KLU2";
    Teuchos::ParameterList amesos2_params("Amesos2");
    amesos2_params.sublist(solverName).set("IsContiguous", false, "Are GIDs Contiguous");

@ndellingwood
Contributor Author

@vbrunini here are some updates on this PR:

In the recent commit I pushed an example, Amesos2_GappedMtxGIDs-1proc (with small example matrices), to the Trilinos/packages/amesos2/example directory. The example uses parts of Mark's PR #3649 to read matrix market files with gapped row ids. For now, the example defaults to a small (1-proc) test matrix, though you can pass the 'badMatrix.mm', 'badRhs.mm', and 'rowMap.mm' files as arguments, and it is hard-coded to handle those matrices correctly. I'll push the 'badMatrix' etc. files later, once I'm confident I have them marked correctly for release.

A couple comments about the example:

To run with non-default matrices, you can execute something like this:
./Amesos2_GappedMtxGIDs-1proc.exe --matrixFilename="<DATADIRPATH>/badMatrix.mm" --rhsFilename="<DATADIRPATH>/badRhs.mm" --mapFilename="<DATADIRPATH>/rowMap.mm"

The example is hard-coded to detect "badMatrix" in the matrixFilename string and to set some internal variables to handle this case. (Reading matrix market files and maps with gapped row ids may require extra information depending on how the map was written, e.g. index base 1 vs 0, or how many header lines the file has.)
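To make the filename detection and the index-base/header concerns concrete, here is a minimal standalone sketch. The helpers `isBadMatrixCase` and `readGappedMap`, and the one-GID-per-line file layout, are hypothetical and chosen only for illustration; they are not taken from the example's source, which uses the Tpetra/Teuchos readers.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: true if the file name refers to the "badMatrix" case.
// The match is a plain, case-sensitive substring search.
bool isBadMatrixCase(const std::string& matrixFilename) {
  return matrixFilename.find("badMatrix") != std::string::npos;
}

// Hypothetical helper: read one GID per line (an assumed layout), skipping
// `numHeaderLines` leading lines and shifting each GID from `indexBase` to base 0.
std::vector<long long> readGappedMap(std::istream& in, int numHeaderLines,
                                     long long indexBase) {
  assert(numHeaderLines >= 0);
  std::string line;
  for (int i = 0; i < numHeaderLines && std::getline(in, line); ++i) {
    // Discard header lines; their count must be known in advance.
  }
  std::vector<long long> gids;
  long long gid = 0;
  while (in >> gid) {
    gids.push_back(gid - indexBase);
  }
  return gids;
}
```

Both the header line count and the index base have to be supplied by the caller here, which is why the example keys its settings off the file name.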

By default, a parameter list is created for the solver that sets "IsContiguous" to "false", telling Amesos2 that the matrix contains gaps in the row ids so that it can accommodate them. This can be disabled with the command line argument --isContiguous. To add this functionality to your code, add lines like the following:

    Teuchos::ParameterList amesos2_params("Amesos2");
    amesos2_params.sublist(solverName).set("IsContiguous", false, "Are GIDs Contiguous");
    solver->setParameters( Teuchos::rcpFromRef(amesos2_params) );

It looks like it's important that solverName matches the solver's name closely, which is something we need to relax (there is more leniency regarding capitalization when creating the solver). For the solvers listed in the issue, solverName should be one of:
    // Choose exactly one of these:
    std::string solverName = "KLU2";
    // std::string solverName = "SuperLU";
    // std::string solverName = "SuperLU_DIST";
    // std::string solverName = "Umfpack";
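Until the name matching is relaxed, a caller could normalize a user-supplied name to the exact spelling before using it as the sublist name. The sketch below is a hypothetical caller-side helper (`canonicalSolverName` is not part of Amesos2) and covers only the four names listed above.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Lower-case copy of s (ASCII only).
std::string toLower(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return s;
}

// Hypothetical helper: map a name in any capitalization to the exact
// spelling expected for the parameter sublist. Returns an empty string
// when the name is not one of the four solvers listed in the issue.
std::string canonicalSolverName(const std::string& userName) {
  static const char* const names[] = {"KLU2", "SuperLU", "SuperLU_DIST",
                                      "Umfpack"};
  const std::string key = toLower(userName);
  for (const char* name : names) {
    if (toLower(name) == key) {
      return name;
    }
  }
  return std::string();
}
```

The result could then be passed where `solverName` is used, e.g. in `amesos2_params.sublist(...)`, after checking that it is non-empty.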

I tested this example with KLU2 and SuperLU 4.3 using the badMatrix etc. files and got correct results (within tolerance, via a residual check). I had issues building the SuperLU_DIST TPL locally and need to test next on a machine with SEMS modules.
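For readers unfamiliar with the residual check mentioned above, the following standalone sketch computes a relative residual for a small CSR matrix. The `CsrMatrix` struct and `relativeResidual` function are illustrative assumptions, not the example's code (which works on Tpetra objects), and the choice of the 2-norm and of a relative measure is also an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Minimal CSR storage with local (0-based, contiguous) indices.
struct CsrMatrix {
  std::vector<std::size_t> rowPtr;  // size numRows + 1
  std::vector<std::size_t> colInd;  // column index of each stored entry
  std::vector<double> values;       // value of each stored entry
};

// Returns ||b - A*x||_2 / ||b||_2, or the absolute residual norm if b is zero.
double relativeResidual(const CsrMatrix& A, const std::vector<double>& x,
                        const std::vector<double>& b) {
  assert(A.rowPtr.size() == b.size() + 1);
  double residualSq = 0.0;
  double rhsSq = 0.0;
  for (std::size_t i = 0; i < b.size(); ++i) {
    double Ax_i = 0.0;
    for (std::size_t k = A.rowPtr[i]; k < A.rowPtr[i + 1]; ++k) {
      Ax_i += A.values[k] * x[A.colInd[k]];
    }
    const double r = b[i] - Ax_i;
    residualSq += r * r;
    rhsSq += b[i] * b[i];
  }
  return rhsSq > 0.0 ? std::sqrt(residualSq / rhsSq) : std::sqrt(residualSq);
}
```

A solve would be accepted when this value falls below a chosen tolerance; the tolerance itself is problem dependent and is not stated in this thread.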

Prior to this example, I wrote some separate driver code to read the badMatrix, badRhs and rowMap files into corresponding Tpetra data structures, convert them to have locally contiguous GIDs, and then write them to matrix market files. I was then able to read those files into Octave and compute a solution as a sanity check that the matrix itself was solvable; the results also agreed well with those from the solvers used in Amesos2. If it's useful, I can clean up the driver codes or files and send them to you.
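The conversion to contiguous ids can be pictured with the standalone sketch below. It is not the driver code mentioned above; `makeContiguousIndex` is a hypothetical helper, and renumbering in ascending GID order is an assumption made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Build a lookup from each (possibly gapped) global ID to a contiguous
// index 0..n-1. Duplicates are dropped and indices follow ascending GID
// order (an assumed convention).
std::unordered_map<long long, std::size_t> makeContiguousIndex(
    std::vector<long long> gids) {
  std::sort(gids.begin(), gids.end());
  gids.erase(std::unique(gids.begin(), gids.end()), gids.end());
  std::unordered_map<long long, std::size_t> toContiguous;
  for (std::size_t i = 0; i < gids.size(); ++i) {
    toContiguous[gids[i]] = i;
  }
  return toContiguous;
}
```

Applying the same lookup to the row and column ids of the matrix and to the entries of the right-hand side gives data that a tool expecting contiguous indices, such as Octave, can read.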

Hopefully the updates from this PR allow your code to work with KLU2 or SuperLU (I'll post updates once I finish testing with SuperLU_DIST). If the changes here don't resolve the issues you are seeing, it may help to follow up offline to figure out what extra information, beyond the data files, I'll need to reproduce and resolve them.

@ndellingwood
Contributor Author

@mhoemmen I used some of your matrix reader code for gapped-row-id cases, from your example PR #3649, in the example I added to this PR. Thanks for providing that (very helpful!); let me know how I should properly credit you.

@ndellingwood ndellingwood added AT: AUTOMERGE Causes the PR autotester to automatically merge the PR branch once approvals are completed and removed AT: WIP Causes the PR autotester to not test the PR. (Remove to allow testing to occur.) labels Feb 27, 2019
@trilinos-autotester
Contributor

Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request.

@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING

Build Information

Test Name: Trilinos_pullrequest_gcc_4.8.4

  • Build Num: 2652
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
COMPILER_MODULE sems-gcc/4.8.4
JENKINS_BUILD_TYPE Release
JENKINS_COMM_TYPE MPI
JENKINS_DO_COMPLEX OFF
JENKINS_JOB_TYPE Experimental
MPI_MODULE sems-openmpi/1.8.7
PULLREQUESTNUM 4454
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH issue-3647
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA ca24a5a
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA a2030b7

Build Information

Test Name: Trilinos_pullrequest_intel_17.0.1

  • Build Num: 2449
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PULLREQUESTNUM 4454
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH issue-3647
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA ca24a5a
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA a2030b7

Build Information

Test Name: Trilinos_pullrequest_gcc_4.9.3_SERIAL

  • Build Num: 940
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PULLREQUESTNUM 4454
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH issue-3647
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA ca24a5a
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA a2030b7

Build Information

Test Name: Trilinos_pullrequest_gcc_7.2.0

  • Build Num: 615
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PULLREQUESTNUM 4454
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH issue-3647
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA ca24a5a
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA a2030b7

Build Information

Test Name: Trilinos_pullrequest_cuda_9.2

  • Build Num: 366
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
JENKINS_JOB_TYPE Experimental
PULLREQUESTNUM 4454
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH issue-3647
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA ca24a5a
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA a2030b7

Using Repos:

Repo: TRILINOS (trilinos/Trilinos)
  • Branch: issue-3647
  • SHA: ca24a5a
  • Mode: TEST_REPO

Pull Request Author: ndellingwood

@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED

Build Information

Test Name: Trilinos_pullrequest_gcc_4.8.4

  • Build Num: 2652
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
COMPILER_MODULE sems-gcc/4.8.4
JENKINS_BUILD_TYPE Release
JENKINS_COMM_TYPE MPI
JENKINS_DO_COMPLEX OFF
JENKINS_JOB_TYPE Experimental
MPI_MODULE sems-openmpi/1.8.7
PULLREQUESTNUM 4454
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH issue-3647
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA ca24a5a
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA a2030b7

Build Information

Test Name: Trilinos_pullrequest_intel_17.0.1

  • Build Num: 2449
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PULLREQUESTNUM 4454
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH issue-3647
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA ca24a5a
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA a2030b7

Build Information

Test Name: Trilinos_pullrequest_gcc_4.9.3_SERIAL

  • Build Num: 940
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PULLREQUESTNUM 4454
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH issue-3647
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA ca24a5a
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA a2030b7

Build Information

Test Name: Trilinos_pullrequest_gcc_7.2.0

  • Build Num: 615
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PULLREQUESTNUM 4454
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH issue-3647
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA ca24a5a
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA a2030b7

Build Information

Test Name: Trilinos_pullrequest_cuda_9.2

  • Build Num: 366
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
JENKINS_JOB_TYPE Experimental
PULLREQUESTNUM 4454
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH issue-3647
TRILINOS_SOURCE_REPO https://github.com/trilinos/Trilinos
TRILINOS_SOURCE_SHA ca24a5a
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA a2030b7


CDash Test Results for PR# 4454.

@trilinos-autotester
Contributor

Status Flag 'Pre-Merge Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
NO REVIEWS HAVE BEEN PERFORMED ON THIS PULL REQUEST!

@trilinos-autotester
Contributor

All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur...

@ndellingwood
Contributor Author

@srajama1 if you get a break during the conference can you review?

Contributor

@srajama1 srajama1 left a comment

I thought this change doesn't fix the problem originally reported. I assume this is an unrelated problem you would like to be fixed anyway, so you are taking care of it. If yes, please go ahead.

@trilinos-autotester
Contributor

Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ srajama1 ]!

@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - Pull Request will be Automerged

@trilinos-autotester trilinos-autotester merged commit 4bb7591 into develop Feb 28, 2019
@trilinos-autotester
Contributor

Merge on Pull Request# 4454: IS A SUCCESS - Pull Request successfully merged

@trilinos-autotester trilinos-autotester removed the AT: AUTOMERGE Causes the PR autotester to automatically merge the PR branch once approvals are completed label Feb 28, 2019
@ndellingwood
Contributor Author

@srajama1 it fixes the problem for the matrix, rhs, and map provided.

@ndellingwood
Contributor Author

ndellingwood commented Feb 28, 2019

@srajama1 we'll continue to follow up if the provided files aren't a sufficient proxy for the problems reported in #3647, so that issue should stay open for now.
