Belos CGSingleReduce: merge allreduces #4302

Merged (3 commits) on Feb 6, 2019

@cgcgcg (Contributor) commented Jan 30, 2019

@trilinos/belos

Description

The CG single-reduce iteration performs two allreduces per iteration: one for the inner products and one for convergence detection.
The proposed changes make it possible to merge these into a single allreduce per iteration. There are two options:

  • Use the preconditioner norm $\|P^{1/2}r\|$ for convergence detection, which the inner products already compute anyway. (This can make it tricky to specify a good tolerance.)
  • Compute the 2-norm at the same time as the inner products. (The implementation is not perfect: in its current form it incurs more computation.)

The default behavior has not been changed.
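
A minimal sketch of the idea behind the second option, assuming nothing about the actual Belos code: each rank packs its local partial sums into one small buffer, so that one sum-reduction returns the inner products and the squared residual 2-norm together. Ranks are simulated as slices within a single process, and `LocalSlice`, `localPartialSums`, `allreduceSum`, and `fusedDots` are hypothetical names. Which inner products the real iteration packs is not shown in this thread, so the three used here are illustrative. With MPI, `allreduceSum` would correspond to a single `MPI_Allreduce` on the length-3 buffer.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-rank data: local slices of the residual r, the
// preconditioned residual z, and A*z. Names are illustrative only.
struct LocalSlice {
  std::vector<double> r, z, Az;
};

// Local partial sums packed into one buffer: {r.z, z.Az, r.r}.
std::array<double, 3> localPartialSums(const LocalSlice& s) {
  assert(s.r.size() == s.z.size() && s.z.size() == s.Az.size());
  std::array<double, 3> buf{0.0, 0.0, 0.0};
  for (std::size_t i = 0; i < s.r.size(); ++i) {
    buf[0] += s.r[i] * s.z[i];   // inner product r.z
    buf[1] += s.z[i] * s.Az[i];  // inner product z.Az
    buf[2] += s.r[i] * s.r[i];   // squared 2-norm of r
  }
  return buf;
}

// Stand-in for ONE sum-allreduce over all ranks (simulated in-process).
std::array<double, 3> allreduceSum(
    const std::vector<std::array<double, 3>>& perRank) {
  std::array<double, 3> total{0.0, 0.0, 0.0};
  for (const auto& b : perRank)
    for (std::size_t k = 0; k < 3; ++k) total[k] += b[k];
  return total;
}

// All three global quantities come out of a single reduction.
std::array<double, 3> fusedDots(const std::vector<LocalSlice>& ranks) {
  std::vector<std::array<double, 3>> partial;
  partial.reserve(ranks.size());
  for (const auto& s : ranks) partial.push_back(localPartialSums(s));
  return allreduceSum(partial);
}
```

The third entry is the squared 2-norm; taking its square root after the reduction gives the value that a 2-norm convergence test would compare against the tolerance.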

…ce detection

This change allows using the norm induced by the preconditioner for
convergence detection. Since CG already computes this, we save one
all-reduce. On the flip side, specifying a tolerance with respect to
the preconditioner norm might be more difficult.
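
For reference, the identity behind this option can be written out as follows, assuming $P$ denotes the symmetric positive definite preconditioner as it is applied to the residual; the symbol $z = Pr$ is introduced here only for illustration:

$$
\|P^{1/2} r\|^2 = (P^{1/2} r)^T (P^{1/2} r) = r^T P r = r^T z, \qquad z = P r .
$$

Under that assumption, the quantity needed for the convergence test is the inner product $r^T z$ that preconditioned CG forms in every iteration, so no additional reduction is required. The tolerance then applies to $\sqrt{r^T z}$ rather than to $\|r\|_2$, which is why a good value can be harder to choose.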
@cgcgcg self-assigned this Jan 30, 2019
@bartlettroscoe added the "stage: in progress" label Jan 30, 2019

@trilinos-autotester (Contributor)

Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request.

@trilinos-autotester (Contributor)

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED

Build Information

Test Name                               Build Num   Status
Trilinos_pullrequest_gcc_4.8.4          2337        PASSED
Trilinos_pullrequest_intel_17.0.1       2138        PASSED
Trilinos_pullrequest_gcc_4.9.3_SERIAL   629         PASSED
Trilinos_pullrequest_gcc_7.2.0          247         PASSED

Jenkins Parameters (all builds)

Parameter Name           Value
PULLREQUESTNUM           4302
TEST_REPO_ALIAS          TRILINOS
TRILINOS_SOURCE_BRANCH   belosMergeAllReduces
TRILINOS_SOURCE_REPO     https://github.com/cgcgcg/Trilinos
TRILINOS_SOURCE_SHA      3e8fc4f
TRILINOS_TARGET_BRANCH   develop
TRILINOS_TARGET_REPO     https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA      caf3ca9

Additional Jenkins Parameters (Trilinos_pullrequest_gcc_4.8.4 only)

Parameter Name           Value
COMPILER_MODULE          sems-gcc/4.8.4
JENKINS_BUILD_TYPE       Release
JENKINS_COMM_TYPE        MPI
JENKINS_DO_COMPLEX       OFF
JENKINS_JOB_TYPE         Experimental
MPI_MODULE               sems-openmpi/1.8.7

CDash Test Results for PR# 4302.

@trilinos-autotester (Contributor)

Status Flag 'Pre-Merge Inspection' - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
WARNING: NO REVIEWERS HAVE BEEN REQUESTED FOR THIS PULL REQUEST!

@trilinos-autotester (Contributor)

All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur...

@mhoemmen requested review from hkthorn and mhoemmen January 31, 2019 03:17

@mhoemmen (Contributor) left a comment

Thanks for working on this! A couple issues:

  1. Is there any reason you would ever not want this optimization? I don't even think it should be an option. Just always do this, if it makes sense (e.g., if useSingleReduction_ is true).

  2. There's no test that turns this option on. Since it's off by default, it will never get tested. (Fixing (1) will ensure this feature always gets tested, without you needing to do any work :-) .)

@@ -516,6 +521,9 @@ setParameters (const Teuchos::RCP<Teuchos::ParameterList> &params)
// Check if the user is requesting the single-reduction version of CG (only for blocksize == 1)
if (params->isParameter("Use Single Reduction")) {
useSingleReduction_ = params->get("Use Single Reduction", useSingleReduction_default_);
if (useSingleReduction_)
Contributor

What happens if users don't enable "Use Single Reduction", but enable the new option? Does it just silently ignore the request?

Contributor Author

Yes, that's correct.

Contributor

Could it ever be possible to fold the convergence test into the all-reduce, if "Use Single Reduction" is true? If not, then Belos should report an error in this case.
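
One way the suggested error could look, as a sketch only: the fold option is rejected when single reduction is off, instead of being silently ignored. A `std::map` stands in for `Teuchos::ParameterList` to keep the example self-contained. `Params`, `CgOptions`, `parseOptions`, and the key string in `kFoldKey` are hypothetical; the thread does not show the new option's parameter name.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Stand-in for a parameter list; the real code uses Teuchos::ParameterList.
using Params = std::map<std::string, bool>;

struct CgOptions {
  bool useSingleReduction = false;
  bool foldConvergenceDetection = false;
};

// Hypothetical key name for the new option; not confirmed by this thread.
const char* const kFoldKey = "Fold Convergence Detection Into Allreduce";

CgOptions parseOptions(const Params& p) {
  CgOptions o;
  auto it = p.find("Use Single Reduction");
  if (it != p.end()) o.useSingleReduction = it->second;
  it = p.find(kFoldKey);
  if (it != p.end()) o.foldConvergenceDetection = it->second;

  // Report the inconsistent combination rather than ignoring the request.
  if (o.foldConvergenceDetection && !o.useSingleReduction) {
    throw std::invalid_argument(
        std::string(kFoldKey) +
        " requires \"Use Single Reduction\" to be true");
  }
  // Invariant on success: folding implies single reduction.
  assert(!o.foldConvergenceDetection || o.useSingleReduction);
  return o;
}
```

Throwing at parameter-parsing time surfaces the misconfiguration before any solve starts; whether Belos should throw or only warn is a design decision this sketch does not settle.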

@@ -370,6 +370,8 @@ namespace Belos {
static constexpr int verbosity_default_ = Belos::Errors;
static constexpr int outputStyle_default_ = Belos::General;
static constexpr int outputFreq_default_ = -1;
static constexpr const char * resNorm_default_ = "TwoNorm";
static constexpr bool foldConvergenceDetectionIntoAllreduce_default_ = false;
@mhoemmen (Contributor), Jan 31, 2019

Is there any reason why this should not default to true? Users won't know about this option and so they will always get the default. Could this ever not be a good idea? If it's always a good idea, then you should always do it. It should not even be an option. Just do the faster thing.

Contributor Author

I did not change the default norm used for convergence detection, i.e., the 2-norm. Switching this option to true saves one allreduce but incurs more local computation. My guess is that this only pays off once the allreduce is expensive enough, so not for small communicator sizes.

Contributor

@cgcgcg Do you have a sense for the trade-off point? Most people won't use single-reduce CG unless they know what it is and care about the cost of all-reduces.

if (foldConvergenceDetectionIntoAllreduce_) {
T_ = MVT::Clone( *tmp, 2 );
// Z_ will view the first column of T_, R2_ will view the second.
std::vector<int> index(1,0);
Contributor

You could also use the overloads of CloneViewNonConst that take a Teuchos::Range1D.
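
To illustrate what viewing columns of a shared two-column block means, here is a small stand-alone sketch. It does not use the Belos or Teuchos API: `MultiVec`, `View`, and `viewColumns` are hypothetical, and the inclusive `[first, last]` column range is an assumption made for this example. The point is only that the views share storage with the parent, so writes through `Z` or `R2` land in `T`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal column-major multivector; a stand-in for illustration only,
// not the Belos/Tpetra multivector API.
class MultiVec {
 public:
  MultiVec(std::size_t rows, std::size_t cols)
      : rows_(rows), cols_(cols), data_(rows * cols, 0.0) {}

  // Non-owning view of a contiguous block of columns.
  struct View {
    double* data;
    std::size_t rows, cols;
    double& operator()(std::size_t i, std::size_t j) {
      return data[j * rows + i];
    }
  };

  // View of columns first..last (inclusive); shares storage with *this.
  View viewColumns(std::size_t first, std::size_t last) {
    assert(first <= last && last < cols_);
    return View{data_.data() + first * rows_, rows_, last - first + 1};
  }

  double operator()(std::size_t i, std::size_t j) const {
    return data_[j * rows_ + i];
  }

 private:
  std::size_t rows_, cols_;
  std::vector<double> data_;
};
```

For example, with `MultiVec T(3, 2)`, `T.viewColumns(0, 0)` plays the role of `Z_` and `T.viewColumns(1, 1)` the role of `R2_` from the snippet above.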

@mhoemmen (Contributor)

Also, @iyamazaki wrote a Tpetra-specific implementation of single-reduce CG. You can get it from Belos::SolverFactory using the solver name "TPETRA CG SINGLE REDUCE". It would be very interesting to do a performance comparison. We do need some Tpetra-specific solvers, since MultiVecTraits doesn't and can't express the necessary linear algebra operations. However, it would be nicer to have a generic implementation of single-reduce CG.

@trilinos-autotester (Contributor)

Status Flag 'Pre-Merge Inspection' - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
THE LAST COMMIT TO THIS PULL REQUEST HAS BEEN REVIEWED, BUT NOT ACCEPTED OR REQUIRES CHANGES

@trilinos-autotester (Contributor)

All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur...

@cgcgcg requested review from mhoemmen and hkthorn and removed request for hkthorn January 31, 2019 04:33
@cgcgcg force-pushed the belosMergeAllReduces branch from 3e8fc4f to f9b0665 January 31, 2019 16:05

@trilinos-autotester (Contributor)

Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request.

@trilinos-autotester (Contributor)

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED

Build Information

Test Name                               Build Num   Status
Trilinos_pullrequest_gcc_4.8.4          2341        PASSED
Trilinos_pullrequest_intel_17.0.1       2142        PASSED
Trilinos_pullrequest_gcc_4.9.3_SERIAL   633         PASSED
Trilinos_pullrequest_gcc_7.2.0          251         PASSED

Jenkins Parameters (all builds)

Parameter Name           Value
PULLREQUESTNUM           4302
TEST_REPO_ALIAS          TRILINOS
TRILINOS_SOURCE_BRANCH   belosMergeAllReduces
TRILINOS_SOURCE_REPO     https://github.com/cgcgcg/Trilinos
TRILINOS_SOURCE_SHA      f9b0665
TRILINOS_TARGET_BRANCH   develop
TRILINOS_TARGET_REPO     https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA      caf3ca9

Additional Jenkins Parameters (Trilinos_pullrequest_gcc_4.8.4 only)

Parameter Name           Value
COMPILER_MODULE          sems-gcc/4.8.4
JENKINS_BUILD_TYPE       Release
JENKINS_COMM_TYPE        MPI
JENKINS_DO_COMPLEX       OFF
JENKINS_JOB_TYPE         Experimental
MPI_MODULE               sems-openmpi/1.8.7

CDash Test Results for PR# 4302.

@trilinos-autotester (Contributor)

Status Flag 'Pre-Merge Inspection' - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
THE LAST COMMIT TO THIS PULL REQUEST HAS NOT BEEN REVIEWED YET!

@trilinos-autotester (Contributor)

All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur...

@hkthorn (Contributor) commented Jan 31, 2019

@cgcgcg I am fine with the changes. I need a test to be added to ensure that this modification to the solver works and continues to work.

@cgcgcg added the "AT: WIP" label (causes the PR autotester to not test the PR; remove to allow testing to occur) Feb 1, 2019

This adds the option to combine the all-reduces for CG's inner
products and the convergence detection into one for
BelosCGSingleRedIter. The trade-off is more computation.

@cgcgcg force-pushed the belosMergeAllReduces branch from f9b0665 to 6116417 February 1, 2019 23:20
@cgcgcg removed the "AT: WIP" label Feb 1, 2019
@cgcgcg force-pushed the belosMergeAllReduces branch from 6116417 to 7f25bb2 February 1, 2019 23:21

@trilinos-autotester (Contributor)

Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request.

@trilinos-autotester (Contributor)

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED

Build Information

Test Name                               Build Num   Status
Trilinos_pullrequest_gcc_4.8.4          2354        PASSED
Trilinos_pullrequest_intel_17.0.1       2155        PASSED
Trilinos_pullrequest_gcc_4.9.3_SERIAL   646         PASSED
Trilinos_pullrequest_gcc_7.2.0          268         PASSED

Jenkins Parameters (all builds)

Parameter Name           Value
PULLREQUESTNUM           4302
TEST_REPO_ALIAS          TRILINOS
TRILINOS_SOURCE_BRANCH   belosMergeAllReduces
TRILINOS_SOURCE_REPO     https://github.com/cgcgcg/Trilinos
TRILINOS_SOURCE_SHA      7f25bb2
TRILINOS_TARGET_BRANCH   develop
TRILINOS_TARGET_REPO     https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA      e99a6ba

Additional Jenkins Parameters (Trilinos_pullrequest_gcc_4.8.4 only)

Parameter Name           Value
COMPILER_MODULE          sems-gcc/4.8.4
JENKINS_BUILD_TYPE       Release
JENKINS_COMM_TYPE        MPI
JENKINS_DO_COMPLEX       OFF
JENKINS_JOB_TYPE         Experimental
MPI_MODULE               sems-openmpi/1.8.7

CDash Test Results for PR# 4302.

@trilinos-autotester (Contributor)

Status Flag 'Pre-Merge Inspection' - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
THE LAST COMMIT TO THIS PULL REQUEST HAS NOT BEEN REVIEWED YET!

@trilinos-autotester (Contributor)

All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur...

@mhoemmen (Contributor) left a comment

Thanks for adding the test! Just a few comments.

// Get the norms of the residuals native to the solver.
template <class ScalarType, class MV, class OP>
Teuchos::RCP<const MV>
CGSingleRedIter<ScalarType,MV,OP>::getNativeResiduals( std::vector<MagnitudeType> *norms ) const {
Contributor

It's technically possible to call this method with norms == nullptr, though I suppose the "iter" classes are just implementation details of the "SolMgr" classes.
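
A small sketch of guarding an optional output pointer, in case the method is ever called with a null `norms`. `nativeResidualNorms` is a hypothetical free function written for this example; the real method is a class member returning `Teuchos::RCP<const MV>`, and how it should behave for a null pointer is not settled in this thread. Here the write is simply skipped.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for a "get native residuals" style call.
// Writes the residual 2-norm into *norms if norms is non-null and
// returns whether anything was written.
bool nativeResidualNorms(const std::vector<double>& residual,
                         std::vector<double>* norms) {
  if (norms == nullptr) {
    return false;  // caller did not ask for norms; do not dereference
  }
  double sumSq = 0.0;
  for (double v : residual) sumSq += v * v;
  norms->assign(1, std::sqrt(sumSq));  // single column, single norm
  return true;
}
```

An alternative design is to take the output by reference, which rules out the null case at compile time at the cost of making the output mandatory.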

//
MVT::MvAddMv( one, *cur_soln_vec, alpha, *P_, *cur_soln_vec );
lp_->updateSolution();
if (foldConvergenceDetectionIntoAllreduce_) {
Contributor

Not a huge fan of the massive code duplication here.

Contributor Author

I can certainly pull the if-statement into the loop. I think it's easier to read that way, since there are a couple of changes between the two paths.

@trilinos-autotester (Contributor)

Status Flag 'Pre-Merge Inspection' - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
THE LAST COMMIT TO THIS PULL REQUEST HAS BEEN REVIEWED, BUT NOT ACCEPTED OR REQUIRES CHANGES

@trilinos-autotester (Contributor)

All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur...

@cgcgcg (Contributor Author) commented Feb 5, 2019

@mhoemmen @hkthorn I ran a comparison using the 2-norm for convergence detection on two test problems from the MiniEM app: a mass matrix with diagonal preconditioning, and a second-order differential operator with an MG preconditioner. The break-even point appeared to be around 4000 ranks. One reason why the new code path is slow is that it requires one extra preconditioner apply per solve and overconverges. (In principle this is not terrible, since users could change the tolerance to account for that.)
I think that leaving this disabled by default is the way to go, since the user really has to understand what is going to happen.
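
The trade-off described above can be made concrete with a toy cost model. Everything in it is an assumption for illustration, not a measurement from this PR: an allreduce over `ranks` processes is taken to cost one latency unit per tree level (`ceil(log2(ranks))` levels), and `extraLocalCost` stands for the additional local work per iteration on the folded path. `allreduceCost` and `foldingPaysOff` are hypothetical names.

```cpp
#include <cassert>

// Toy latency model (an assumption, not a measurement): one allreduce
// over `ranks` processes costs latencyPerHop * ceil(log2(ranks)).
double allreduceCost(int ranks, double latencyPerHop) {
  assert(ranks >= 1 && latencyPerHop >= 0.0);
  int hops = 0;  // ceil(log2(ranks)), computed with integers
  for (long long n = 1; n < ranks; n *= 2) ++hops;
  return latencyPerHop * hops;
}

// Folding removes one allreduce per iteration but adds extraLocalCost
// of local work; it pays off when the saved allreduce costs more.
bool foldingPaysOff(int ranks, double latencyPerHop, double extraLocalCost) {
  return allreduceCost(ranks, latencyPerHop) > extraLocalCost;
}
```

Under such a model the folded path only wins beyond some communicator size, which is qualitatively consistent with a break-even point at a large rank count. The model ignores the extra preconditioner apply per solve and the overconvergence mentioned above, so it should not be used to predict the actual crossover.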

@hkthorn (Contributor) commented Feb 5, 2019

@cgcgcg Thanks for the information; it is very useful to know. There are pluses and minuses to each of the communication-reducing variants.

@trilinos-autotester (Contributor)

Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ hkthorn ]!

@trilinos-autotester (Contributor)

Status Flag 'Pull Request AutoTester' - AutoMerge IS ENABLED, but the Label AT: AUTOMERGE is not set. Either set Label AT: AUTOMERGE or manually merge the PR...

@jwillenbring added the "AT: RETEST" label (causes the PR autotester to run a new round of PR tests on the next iteration) Feb 5, 2019

@jwillenbring (Member)

@cgcgcg

I applied the retest flag to this PR because we added a new PR test this afternoon that enables CUDA. Given the status of this PR, I hope this will not negatively impact your effort.

@trilinos-autotester (Contributor)

Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing.

@mhoemmen dismissed their stale review February 5, 2019 21:43

Heidi says OK; Christian explained his reasoning. Thanks!

@trilinos-autotester (Contributor)

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED

Build Information

Test Name                               Build Num   Status
Trilinos_pullrequest_gcc_4.8.4          2372        PASSED
Trilinos_pullrequest_intel_17.0.1       2173        PASSED
Trilinos_pullrequest_gcc_4.9.3_SERIAL   664         PASSED
Trilinos_pullrequest_gcc_7.2.0          289         PASSED
Trilinos_pullrequest_cuda_9.2           76          PASSED

Jenkins Parameters (all builds)

Parameter Name           Value
PULLREQUESTNUM           4302
TEST_REPO_ALIAS          TRILINOS
TRILINOS_SOURCE_BRANCH   belosMergeAllReduces
TRILINOS_SOURCE_REPO     https://github.com/cgcgcg/Trilinos
TRILINOS_SOURCE_SHA      7f25bb2
TRILINOS_TARGET_BRANCH   develop
TRILINOS_TARGET_REPO     https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA      e99a6ba

Additional Jenkins Parameters (Trilinos_pullrequest_gcc_4.8.4 only)

Parameter Name           Value
COMPILER_MODULE          sems-gcc/4.8.4
JENKINS_BUILD_TYPE       Release
JENKINS_COMM_TYPE        MPI
JENKINS_DO_COMPLEX       OFF
JENKINS_JOB_TYPE         Experimental
MPI_MODULE               sems-openmpi/1.8.7

Additional Jenkins Parameters (Trilinos_pullrequest_cuda_9.2 only)

Parameter Name           Value
JENKINS_JOB_TYPE         Experimental

CDash Test Results for PR# 4302.

@trilinos-autotester removed the "AT: RETEST" label Feb 6, 2019

@trilinos-autotester (Contributor)

Status Flag 'Pull Request AutoTester' - AutoMerge IS ENABLED, but the Label AT: AUTOMERGE is not set. Either set Label AT: AUTOMERGE or manually merge the PR...

Labels: pkg: Belos, stage: in progress

6 participants