Zoltan2: Fixes for #6440 #6514
This will resolve an indeterminacy in the MJ calculation. MJ tests can now detect indeterminate results and return an error. CDash showed a segfault in one case, and it's not clear whether that is related. But this will at least get the test running in a consistent manner and should clear most of the random failures taking down auto PR testing.
Status Flag 'Pre-Test Inspection' - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
// the last member is utility used for atomically inserting the values.
// Sorting here avoids potential indeterminancy in the partitioning results
auto track_on_cuts_sort = Kokkos::subview(track_on_cuts,
  std::pair<mj_lno_t, mj_lno_t>(0, track_on_cuts.size() - 1));
Kokkos::subview takes a half-exclusive range (start, end+1). Also, what if track_on_cuts.size() is zero?
@mhoemmen Thanks. This array always has size 1 or greater, and the last element is used for index tracking, so it is not sorted. I could add a check to skip the sort when the size is 1, since no sorting is necessary then. However, I've temporarily closed the PR because the bug just stopped reproducing for me in develop. I need to investigate this further.
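To make the range discussion above concrete, here is a minimal sketch in plain C++. It uses std::vector as a stand-in for the Kokkos view, and sort_all_but_last is a hypothetical helper name, not anything in Zoltan2. It assumes the last entry is a bookkeeping slot that must stay in place, and it guards the empty and size-1 cases that came up in the review.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of Zoltan2: sorts every entry except the
// last one, which is assumed to be a bookkeeping slot (for example an
// insertion counter) that must stay where it is.
template <typename T>
void sort_all_but_last(std::vector<T>& v) {
  // Nothing to sort for an empty vector or one that holds only the slot.
  // Without this guard, v.end() - 1 would be invalid for an empty vector.
  if (v.size() <= 1) {
    return;
  }
  // [begin, end - 1) is a half-open range, so the last element is excluded.
  // This mirrors passing the pair (0, size - 1) to a half-open subview.
  std::sort(v.begin(), v.end() - 1);
  assert(std::is_sorted(v.begin(), v.end() - 1));
}
```

For an input of {5, 2, 9, 3} the first three entries are sorted and the trailing 3 stays in the last position. Sorting the data entries removes any dependence on the order in which they were inserted, which is the point of the sort in the diff.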
The optimizations don't work properly with OpenMP due to the statics and caused the randomly generated coords to change run to run. Since this code is just for the tests it is not performance critical.
Leave view at size 0 if not used at all. Skip sort call when view is not used.
9d6a145 to 62d4cab
@kddevin This PR fixes an indeterminate result that caused the MultiVector pass and the BasicVectorAdapter pass to give different results on the cuts and fail the comparison part of the test. I've also fixed another issue in the GeometricGenerator random number generation. The statics were not handled in a thread-safe way, and the original coordinate distribution was changing from run to run for OpenMP. I think performance for GeometricGenerator is not very important since it's for tests only, so I've simplified this and just removed the statics.
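As an illustration of the kind of change described above, here is a minimal sketch of a generator with no static state. It is not the GeometricGenerator code: CoordGenerator and generate_coords are hypothetical names, and the choice of std::mt19937 with an explicit seed is an assumption made for the example. All random state lives in the object, so two runs with the same seed produce the same coordinates.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical generator, not the Zoltan2 GeometricGenerator code.
// The engine is a member, so no function-level static is shared
// between threads.
class CoordGenerator {
 public:
  explicit CoordGenerator(std::uint32_t seed) : engine_(seed) {}

  // Returns a roughly uniform value between lo and hi.
  double next(double lo, double hi) {
    assert(lo < hi);
    // std::mt19937 yields 32-bit values, so dividing by 2^32 maps the
    // raw output to [0, 1) without an implementation-defined distribution.
    const double u = static_cast<double>(engine_()) / 4294967296.0;
    return lo + u * (hi - lo);
  }

 private:
  std::mt19937 engine_;
};

// Generates n coordinates from a fresh generator. The result depends
// only on the arguments, never on earlier calls.
std::vector<double> generate_coords(std::uint32_t seed, std::size_t n,
                                    double lo, double hi) {
  CoordGenerator gen(seed);
  std::vector<double> coords;
  coords.reserve(n);
  for (std::size_t i = 0; i < n; ++i) {
    coords.push_back(gen.next(lo, hi));
  }
  return coords;
}
```

In a threaded setting, each thread would own its own CoordGenerator, with a seed derived from something fixed such as the thread or rank index. That costs some setup per call, which matches the trade-off noted above for test-only code.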
Thanks, @MicheldeMessieres.
Status Flag 'Pre-Test Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ kddevin ]!
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING
Test Name: Trilinos_pullrequest_gcc_4.8.4
Test Name: Trilinos_pullrequest_intel_17.0.1
Test Name: Trilinos_pullrequest_gcc_4.9.3_SERIAL
Test Name: Trilinos_pullrequest_gcc_7.2.0
Test Name: Trilinos_pullrequest_cuda_9.2
Test Name: Trilinos_pullrequest_python_2
Test Name: Trilinos_pullrequest_python_3
Pull Request Author: MicheldeMessieres
Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED. Pull Request Auto Testing has PASSED
Test Name: Trilinos_pullrequest_gcc_4.8.4
Test Name: Trilinos_pullrequest_intel_17.0.1
Test Name: Trilinos_pullrequest_gcc_4.9.3_SERIAL
Test Name: Trilinos_pullrequest_gcc_7.2.0
Test Name: Trilinos_pullrequest_cuda_9.2
Test Name: Trilinos_pullrequest_python_2
Test Name: Trilinos_pullrequest_python_3
Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ kddevin ]!
Status Flag 'Pull Request AutoTester' - AutoMerge IS ENABLED, but the Label AT: AUTOMERGE is not set. Either set Label AT: AUTOMERGE or manually merge the PR...
This will resolve indeterminate behavior in the MultiJagged calculation.
This is expected to resolve most of the cases where #6440 failed. However, in one case CDash showed a segfault, and it's not yet clear why that would occur or whether it would be affected by this fix. This will at least get the test running in a consistent manner.
@trilinos/zoltan2
Motivation
Eliminates indeterminate behavior. Will fix most but perhaps not all intermittent failures reported in #6440.
Stakeholder Feedback
Testing