-
Notifications
You must be signed in to change notification settings - Fork 578
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tpetra: fix race condition in transfers with AbsMax for CUDA_AWARE_MPI #3999
Tpetra: fix race condition in transfers with AbsMax for CUDA_AWARE_MPI #3999
Conversation
While these are not necessary to fix trilinos#3968, this cleans up some old and probably not so safe code. Mainly switching to using dual view semantics instead of the get 1d and 2d arrays and adding fences.
Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -805,7 +810,7 @@ DOFManager<LO,GO>::buildGlobalUnknowns_GUN(const Tpetra::MultiVector<GO,LO,GO,pa | |||
|
|||
// do a prefix sum | |||
GO scanResult = 0; | |||
Teuchos::scan<int, GO> (*getComm(), Teuchos::REDUCE_SUM, static_cast<size_t> (localsum), Teuchos::outArg (scanResult)); | |||
Teuchos::scan<int, GO> (*getComm(), Teuchos::REDUCE_SUM, static_cast<int> (localsum), Teuchos::outArg (scanResult)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only helpful for removing warnings if GO
is signed. You really want to cast localsum
to GO
since that's the type of the scan. It will get cast from int
to GO
here anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks! will change
@@ -930,7 +938,10 @@ DOFManager<LO,GO>::buildTaggedMultiVector(const ElementBlockAccess & ownedAccess | |||
|
|||
// temporary working vector to fill each row in tagged array | |||
std::vector<int> working(overlap_mv->getNumVectors()); | |||
ArrayRCP<ArrayRCP<GO> > edittwoview = overlap_mv->get2dViewNonConst(); | |||
auto dual_view = overlap_mv->getDualView(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer that you not call getDualView
. Please call the functions that get the Kokkos::View of the appropriate memory space directly. This is fine for now, but I do plan on getting rid of getDualView
at some point, since it exposes an implementation detail that constrains Tpetra to have allocations on both sides at all times (it can't do lazy allocation).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok. Will fix that as well.
auto dual_view = overlap_mv->getDualView(); | ||
dual_view.sync_host(); | ||
PHX::Device::fence(); | ||
auto edittwoview_host = dual_view.view_host(); | ||
for (size_t b = 0; b < blockOrder_.size(); ++b) { | ||
// there has to be a field pattern assocaited with the block |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"associated"
@@ -1304,7 +1322,10 @@ fillGIDsFromOverlappedMV(const ElementBlockAccess & access, | |||
using Teuchos::ArrayRCP; | |||
|
|||
//To generate elementGIDs we need to go through all of the local elements. | |||
ArrayRCP<ArrayRCP<const GO> > twoview = overlap_mv.get2dView(); | |||
auto dual_view = overlap_mv.getDualView(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above. Get Kokkos::View objects from the Tpetra::MultiVector directly; don't access the DualView.
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: Trilinos_pullrequest_intel_17.0.1
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_4.9.3
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_4.8.4
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_4.9.3_SERIAL
Jenkins Parameters
Using Repos:
Pull Request Author: rppawlo |
@mhoemmen - I'm going to remove the automerge for now since I need to push the changes you requested. |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED Note: Testing will normally be attempted again in approx. 2 Hrs 30 Mins. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run. Pull Request Auto Testing has FAILED (click to expand)Build InformationTest Name: Trilinos_pullrequest_intel_17.0.1
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_4.9.3
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_4.8.4
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_4.9.3_SERIAL
Jenkins Parameters
Console Output (last 100 lines) : Trilinos_pullrequest_intel_17.0.1 # 1795 (click to expand)
Console Output (last 100 lines) : Trilinos_pullrequest_gcc_4.9.3 # 2358 (click to expand)
Console Output (last 100 lines) : Trilinos_pullrequest_gcc_4.8.4 # 2018 (click to expand)
Console Output (last 100 lines) : Trilinos_pullrequest_gcc_4.9.3_SERIAL # 327 (click to expand)
|
@mhoemmen - changes have been pushed. Could you review and reapprove? |
Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING (click to expand)Build InformationTest Name: Trilinos_pullrequest_intel_17.0.1
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_4.9.3
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_4.8.4
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_4.9.3_SERIAL
Jenkins Parameters
Using Repos:
Pull Request Author: rppawlo |
auto mv_size = values.extent(0); | ||
Kokkos::parallel_reduce(mv_size,panzer::dof_functors::SumRank2<GO,KV>(values),localsum); | ||
Kokkos::parallel_reduce(mv_size,panzer::dof_functors::SumRank2<GO,decltype(values)>(values),localsum); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the future, please consider using an explicit RangePolicy instead of putting the dimension as the first argument of Kokkos::parallel_reduce
. This would let you execute on the execution_space
corresponding to Tpetra's data, rather than on Kokkos' default execution space. Tpetra's default execution space could differ from Kokkos' default execution space. (Some people really want Serial for Tpetra but CUDA for everything else.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do. Looks like this functor was written in 2015 - before execution policies even existed :)
Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED Pull Request Auto Testing has PASSED (click to expand)Build InformationTest Name: Trilinos_pullrequest_intel_17.0.1
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_4.9.3
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_4.8.4
Jenkins Parameters
Build InformationTest Name: Trilinos_pullrequest_gcc_4.9.3_SERIAL
Jenkins Parameters
|
Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ mhoemmen ]! |
Status Flag 'Pull Request AutoTester' - Pull Request will be Automerged |
Merge on Pull Request# 3999: IS A SUCCESS - Pull Request successfully merged |
@trilinos/tpetra This was originally part of PR trilinos#3996. @rppawlo pushed an improved fix to trilinos#3968 first in PR trilinos#3999, which got merged. I thus deleted PR trilinos#3996. This commit includes just the unit test that was part of PR trilinos#3996, but not part of PR trilinos#3999.
#3968
Changes to each package are in separate commits.
Thanks to Mark Hoemmen for the tpetra fix!