Disable randomly failing Tpetra test (#10847), remove FORCE from all test disables (#10870) #10871
You *NEVER* want to force disable a test. If you force disable it, the developer has no way to manually enable that test in order to fix it.
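To make the point above concrete, here is a minimal CMake sketch of the two forms. It assumes the TriBITS convention of a `<fullTestName>_DISABLE` cache variable; the exact variable name and docstring are illustrative assumptions, not copied from this PR's diff.

```cmake
# Assumed variable name (TriBITS-style <fullTestName>_DISABLE); illustrative only.

# Without FORCE: the cache entry is set only if it does not already exist, so a
# developer can still re-enable the test on the command line with
#   -D TpetraTSQR_SequentialTsqr_contiguousCacheBlocks_MPI_1_DISABLE=OFF
set(TpetraTSQR_SequentialTsqr_contiguousCacheBlocks_MPI_1_DISABLE ON
  CACHE BOOL "Randomly fails on the ascicgpu CUDA-11 PR build (see #10847)")

# With FORCE (the form this PR removes): the cache value is overwritten on every
# configure, including a value the developer passed with -D.
# set(TpetraTSQR_SequentialTsqr_contiguousCacheBlocks_MPI_1_DISABLE ON
#   CACHE BOOL "Randomly fails on the ascicgpu CUDA-11 PR build (see #10847)" FORCE)
```

The difference comes from documented `set(... CACHE ...)` behavior: without `FORCE`, an existing cache entry is left alone, and a `-D` argument creates that entry before the project files are processed.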
… just on ascicgpu PR build (#10847)
FYI: I merged this branch into the topic branch for PR #10808 so we can see how this works in practice.
Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request.
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING

Build Information (test names):
- Trilinos_PR_gcc-8.3.0
- Trilinos_PR_gcc-7.2.0-serial
- Trilinos_PR_gcc-7.2.0-debug
- Trilinos_PR_intel-17.0.1
- Trilinos_PR_clang-10.0.0
- Trilinos_PR_cuda-10.1.243
- Trilinos_PR_python3
- Trilinos_PR_cuda-11.4.2-uvm-off

Pull Request Author: bartlettroscoe
Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED. Pull Request Auto Testing has PASSED.

Build Information (test names):
- Trilinos_PR_gcc-8.3.0
- Trilinos_PR_gcc-7.2.0-serial
- Trilinos_PR_gcc-7.2.0-debug
- Trilinos_PR_intel-17.0.1
- Trilinos_PR_clang-10.0.0
- Trilinos_PR_cuda-10.1.243
- Trilinos_PR_python3
- Trilinos_PR_cuda-11.4.2-uvm-off
Status Flag 'Pre-Merge Inspection' - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur...
FYI: I merged this branch into the branch in PR #10808 so this will get tested the next time that PR build runs. NOTE: Trilinos is not testing updates to these files in PR testing. That is not good!
NOTE: A PR test iteration for PR #10808 on 'ascicgpu050' has started, and the configure output here for the build PR-10808-test-rhel7_sems-cuda-11.4.2-sems-gnu-10.1.0-sems-openmpi-4.0.5_release_static_Volta70_no-asan_complex_no-fpic_mpi_pt_no-rdc_no-uvm_deprecated-on_no-package-enables-407 shows:
So that test will not run in that CUDA-11 build. But looking at the configure output [here](https://trilinos-cdash.sandia.gov/build/378206/configure) for the CUDA build on 'vortex' in the build PR-10808-test-ats2_cuda-10.1.243-gnu-8.3.1-spmpi-rolling_release_static_Volta70_Power9_no-asan_no-complex_no-fpic_mpi_pt_no-rdc_uvm_deprecated-on_no-package-enables-887, we see:
So that is proof that test |
Hello @jhux2, @csiefer2, and @cgcgcg, can one of you please approve this PR so that it can merge? This will stop this test from taking out PR builds through its random failures in the CUDA-11 build on the ascicgpu machines, but keeps the test running in all other builds, including the CUDA build on 'vortex' (see above).
The impossible seems to have occurred. The test TpetraTSQR_SequentialTsqr_contiguousCacheBlocks_MPI_1 was reported to have run and failed in the build PR-10808-test-rhel7_sems-cuda-11.4.2-sems-gnu-10.1.0-sems-openmpi-4.0.5_release_static_Volta70_no-asan_complex_no-fpic_mpi_pt_no-rdc_no-uvm_deprecated-on_no-package-enables-407, but that test is shown as disabled in the configure output:
I will try to reproduce this on the machine and see what happens.
For the Trilinos version from PR #10808, using the files load-env-and-cmake-frag-file.sh and do-configure, I did the following configure and build on 'ascicgpu17' just now:
In other words, there has to be something wrong with the PR tester to be posting test results including the test Looking at the Jenkins logs for this build at: the file: when downloaded does not even show that test:
What in the heck is going on here?
Wait, I missed that the failing test described above is I will update issue #10847 to see what other tests are showing this same error "an illegal instruction was encountered".
…ust on ascicgpu PR build (#10847)
Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED. Pull Request Auto Testing has PASSED.
Status Flag 'Pull Request AutoTester' - User Requested Retest - Label AT: RETEST will be reset after testing.
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING
Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED. Pull Request Auto Testing has PASSED.
All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur...
lgtm
Please use the `FORCE` option since this will be required during incremental rebuilds and is what `cmake -DVAR=VAL` does.
Let's discuss this.
Having the in that if the developer passed in @e10harvey, please document the use cases where you see a problem needing to be solved by using the |
Since this just affects cuda UVM off builds, let's make a note of the FORCE option removal here and see if people start running into problems.
Thanks @e10harvey! I promise that as long as the PR tester and users run |
Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ jhux2 e10harvey ]!
Status Flag 'Pull Request AutoTester' - Pull Request will be Automerged
Merge on Pull Request# 10871: IS A SUCCESS - Pull Request successfully merged
CC: @trilinos/tpetra
Internal Issues
Description
This PR disables the randomly failing tests TpetraTSQR_SequentialTsqr_[non]contiguousCacheBlocks_MPI_1 that are taking out all kinds of PR iterations (see #10847). The PR also removes 'FORCE' from all test disables (see #10870).
How was this tested?
Using the scripts:
load-env-and-cmake-frag-file.sh:
and:
do-configure:
I configured on 'ascicgpu17' with:
and the configure output showed: