Update NOX_Thyra_Heq test to clearly show why it passes or fails #2310


bartlettroscoe
Member

CC: @trilinos/nox, @fryeguy52

This updates the test NOX_Thyra_Heq so that it clearly shows why the test passes or fails. The test currently still fails with the Intel 17.4 compiler, but at least now we can see how it behaves on different platforms.

Before, the test just showed:

************************************************************************
-- Nonlinear Solver Step 16 -- 
||F|| = 6.500e-08  step = 1.000e+00 (Converged!)
************************************************************************

************************************************************************
-- Final Status Test Results --
Converged....OR Combination -> 
  **...........Finite Number Check (Two-Norm F) = Finite
  Converged....AND Combination -> 
    Converged....F-Norm = 6.749e-09 < 1.000e-08
                 (Length-Scaled Two-Norm, Absolute Tolerance)
    Converged....WRMS-Norm = 5.011e-05 < 1
  ??...........Number of Iterations = -1 < 100
************************************************************************
Test failed!

which makes no sense because it says it converged!

However, there was a silent check that the solve must take exactly 18 iterations. With Intel 17.4, it converges in 16 iterations (presumably due to better rounding with this Intel compiler). Yea!

This updates the test to print the real pass/fail criteria, and I updated the criteria to allow the number of iterations to fall between (14, 18). It now prints:

************************************************************************
-- Nonlinear Solver Step 16 -- 
||F|| = 6.500e-08  step = 1.000e+00 (Converged!)
************************************************************************

************************************************************************
-- Final Status Test Results --
Converged....OR Combination -> 
  **...........Finite Number Check (Two-Norm F) = Finite
  Converged....AND Combination -> 
    Converged....F-Norm = 3.250e-09 < 1.000e-08
                 (Length-Scaled Two-Norm, Absolute Tolerance)
    Converged....WRMS-Norm = 2.810e-05 < 1
  ??...........Number of Iterations = -1 < 100
************************************************************************

Check for test pass/fail:
solvStatus = Converged.... == Converged.... : passed
numIterations = 16 == 18 : FAILED
diff->norm() = 1.4614e-07 <= 1e-14 : FAILED

Test failed!

I also improved the output by using Teuchos::VerboseObjectBase::getDefaultOStream() so you don't need to bother checking what process you are on when you print.

This was needed for clear output for the test in NOX Thyra_Heq.C.

I just added a simple usage of this macro.  But we really need better unit
tests for all of these macros.

Before, if any of the three criteria failed, it would just print "Test
failed".  But now it prints why it failed with details.

This test currently fails with Intel compilers (see trilinos#2247) but at least now it
shows you why (which was not clear at all before).

I also made use of the default FancyOStream to avoid logic about which
process you are on when deciding whether to print.  That is the best
way to handle parallel output and better test output control.
@trilinos-autotester
Contributor

Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request.

@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING

Build Information

Test Name: Trilinos_autotester_test

  • Build Num: 442
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
COMPILER_MODULE sems-gcc/4.8.4
JENKINS_BUILD_TYPE Release
JENKINS_COMM_TYPE MPI
JENKINS_DO_COMPLEX OFF
JENKINS_JOB_TYPE Experimental
MPI_MODULE sems-openmpi/1.8.7
PULLREQUESTNUM 2310
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH 2247-fix-nox-thyra-heq-for-intel
TRILINOS_SOURCE_REPO https://github.com/bartlettroscoe/Trilinos
TRILINOS_SOURCE_SHA d3192a1
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 2797a9a

Build Information

Test Name: Trilinos_pullrequest_gcc_4.9.3

  • Build Num: 307
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
COMPILER_MODULE sems-gcc/4.9.3
JENKINS_BUILD_TYPE Release
JENKINS_COMM_TYPE MPI
JENKINS_DO_COMPLEX OFF
JENKINS_JOB_TYPE Experimental
MPI_MODULE sems-openmpi/1.8.7
PULLREQUESTNUM 2310
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH 2247-fix-nox-thyra-heq-for-intel
TRILINOS_SOURCE_REPO https://github.com/bartlettroscoe/Trilinos
TRILINOS_SOURCE_SHA d3192a1
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 2797a9a

Build Information

Test Name: Trilinos_pullrequest_gcc_4.8.4

  • Build Num: 18
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
COMPILER_MODULE sems-gcc/4.8.4
JENKINS_BUILD_TYPE Release
JENKINS_COMM_TYPE MPI
JENKINS_DO_COMPLEX OFF
JENKINS_JOB_TYPE Experimental
MPI_MODULE sems-openmpi/1.8.7
PULLREQUESTNUM 2310
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH 2247-fix-nox-thyra-heq-for-intel
TRILINOS_SOURCE_REPO https://github.com/bartlettroscoe/Trilinos
TRILINOS_SOURCE_SHA d3192a1
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 2797a9a

Using Repos:

Repo: TRILINOS (bartlettroscoe/Trilinos)
  • Branch: 2247-fix-nox-thyra-heq-for-intel
  • SHA: d3192a1
  • Mode: TEST_REPO

Pull Request Author: bartlettroscoe

@trilinos-autotester
Contributor

Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED

Note: Testing will normally be attempted again in approx. 4:00:00. If a change to the Pull Request source branch occurs, the testing will be attempted again.

Pull Request Auto Testing has FAILED

Build Information

Test Name: Trilinos_autotester_test

  • Build Num: 442
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
COMPILER_MODULE sems-gcc/4.8.4
JENKINS_BUILD_TYPE Release
JENKINS_COMM_TYPE MPI
JENKINS_DO_COMPLEX OFF
JENKINS_JOB_TYPE Experimental
MPI_MODULE sems-openmpi/1.8.7
PULLREQUESTNUM 2310
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH 2247-fix-nox-thyra-heq-for-intel
TRILINOS_SOURCE_REPO https://github.com/bartlettroscoe/Trilinos
TRILINOS_SOURCE_SHA d3192a1
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 2797a9a

Build Information

Test Name: Trilinos_pullrequest_gcc_4.9.3

  • Build Num: 307
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
COMPILER_MODULE sems-gcc/4.9.3
JENKINS_BUILD_TYPE Release
JENKINS_COMM_TYPE MPI
JENKINS_DO_COMPLEX OFF
JENKINS_JOB_TYPE Experimental
MPI_MODULE sems-openmpi/1.8.7
PULLREQUESTNUM 2310
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH 2247-fix-nox-thyra-heq-for-intel
TRILINOS_SOURCE_REPO https://github.com/bartlettroscoe/Trilinos
TRILINOS_SOURCE_SHA d3192a1
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 2797a9a

Build Information

Test Name: Trilinos_pullrequest_gcc_4.8.4

  • Build Num: 18
  • Status: FAILED

Jenkins Parameters

Parameter Name Value
COMPILER_MODULE sems-gcc/4.8.4
JENKINS_BUILD_TYPE Release
JENKINS_COMM_TYPE MPI
JENKINS_DO_COMPLEX OFF
JENKINS_JOB_TYPE Experimental
MPI_MODULE sems-openmpi/1.8.7
PULLREQUESTNUM 2310
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH 2247-fix-nox-thyra-heq-for-intel
TRILINOS_SOURCE_REPO https://github.com/bartlettroscoe/Trilinos
TRILINOS_SOURCE_SHA d3192a1
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA 2797a9a

@bartlettroscoe
Member Author

@trilinos/framework, @allevin,

Why did the above build Trilinos_pullrequest_gcc_4.8.4 fail?

I will run checkin-test-sems.sh locally to see what happens.

@jwillenbring
Member

@bmpersc @allevin

This build is not showing up on the dashboard:

The end of the console output in Jenkins simply says:

Error(s) when configuring the project
CMake Error at /scratch/trilinos/workspace/trilinos-folder/Trilinos_pullrequest_gcc_4.8.4/TFW_testing_single_configure_prototype/simple_testing.cmake:159 (message):
Configure failed with error -1

Single configure/build/test failed. The error code was: 255
Build step 'Execute shell' marked build as failure
Finished: FAILURE

@bmpersc
Contributor

bmpersc commented Feb 28, 2018

This build is supposed to submit to the dashboard, but it didn't. The configure step failed, but even without the submission the ctest temporary files should have given some indication. However, they are nowhere to be found in the build tree.

The script failed to submit because the error check and abort for the configure happens before a submit. I will fix it so the next run should submit the failing configure to the dashboard as expected.

@bartlettroscoe
Member Author

I merged this branch locally and tested with:

$ ./checkin-test-sems.sh --do-all

and it fully passed as shown below.

@atoth1, can you please review and merge this branch? Once this is merged to develop, it will result in various machines and compilers running with the extra pass/fail criteria and then we can see what this is doing on various machines.

CHECKIN TEST RESULTS
From: Roscoe A Bartlett [mailto:[email protected]]
Sent: Wednesday, February 28, 2018 4:20 PM
To: Bartlett, Roscoe A <[email protected]>
Subject: READY TO PUSH: Trilinos: crf450.srn.sandia.gov

READY TO PUSH: Trilinos: crf450.srn.sandia.gov

Wed Feb 28 14:20:25 MST 2018

Enabled Packages: NOX, TeuchosCore
Disabled Packages: PyTrilinos,Claps,TriKota
Enabled all Forward Packages

Build test results:
-------------------
0) MPI_RELEASE_DEBUG_SHARED_PT => passed: passed=2494,notpassed=0 (102.46 min)

*** Commits for repo :
  3f66559 Merge remote-tracking branch 'rab-github/2247-fix-nox-thyra-heq-for-intel' into de..
  d3192a1 Update to clealry show why the test passes or fails (#2247)
  0862b7a Add [TEUCHOS_]TEST_COMPARE_CONST() macros (#2247)

0) MPI_RELEASE_DEBUG_SHARED_PT Results:
---------------------------------------

  passed: Trilinos/MPI_RELEASE_DEBUG_SHARED_PT: passed=2494,notpassed=0

  Wed Feb 28 14:20:24 MST 2018

  Enabled Packages: NOX, TeuchosCore
  Disabled Packages: PyTrilinos,Claps,TriKota
  Enabled all Forward Packages
  Hostname: crf450.srn.sandia.gov
  Source Dir: /home/rabartl/Trilinos.base/Trilinos/cmake/tribits/ci_support/../../..
  Build Dir: /home/rabartl/Trilinos.base/BUILDS/CHECKIN/MPI_RELEASE_DEBUG_SHARED_PT

  CMake Cache Varibles:
    -DTrilinos_TRIBITS_DIR:PATH=/home/rabartl/Trilinos.base/Trilinos/cmake/tribits
    -DTrilinos_ENABLE_TESTS:BOOL=ON
    -DTrilinos_TEST_CATEGORIES:STRING=BASIC
    -DTrilinos_ALLOW_NO_PACKAGES:BOOL=OFF
    -DDART_TESTING_TIMEOUT:STRING=300.0
    -DBUILD_SHARED_LIBS=ON
    -DTrilinos_DISABLE_ENABLED_FORWARD_DEP_PACKAGES=ON
    -DTrilinos_ENABLE_SECONDARY_TESTED_CODE:BOOL=OFF
    -DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/MpiReleaseDebugSharedPtSettings.cmake,cmake/std/BasicCiTestingSettings.cmake
    -DZoltan2_OrderingScotch_MPI_4_DISABLE=TRUE
    -DTrilinos_ENABLE_NOX:BOOL=ON
    -DTrilinos_ENABLE_TeuchosCore:BOOL=ON
    -DTrilinos_ENABLE_ALL_OPTIONAL_PACKAGES:BOOL=ON
    -DTrilinos_ENABLE_ALL_FORWARD_DEP_PACKAGES:BOOL=ON
    -DTrilinos_ENABLE_PyTrilinos:BOOL=OFF
    -DTrilinos_ENABLE_Claps:BOOL=OFF
    -DTrilinos_ENABLE_TriKota:BOOL=OFF
  Make Options: -j16
  CTest Options: -j16

  Pull: Passed (0.00 min)
  Configure: Passed (2.66 min)
  Build: Passed (85.90 min)
  Test: Passed (13.89 min)

  100% tests passed, 0 tests failed out of 2494

  Label Time Summary:
  Amesos               =  20.38 sec (14 tests)
  Amesos2              =  10.11 sec (8 tests)
  Anasazi              = 104.69 sec (71 tests)
  AztecOO              =  17.21 sec (17 tests)
  Belos                = 100.99 sec (72 tests)
  Domi                 = 150.63 sec (125 tests)
  Epetra               =  49.39 sec (61 tests)
  EpetraExt            =  13.51 sec (10 tests)
  FEI                  =  45.48 sec (43 tests)
  Galeri               =   5.77 sec (9 tests)
  GlobiPack            =   2.30 sec (6 tests)
  Ifpack               =  60.48 sec (53 tests)
  Ifpack2              =  42.45 sec (35 tests)
  Intrepid             = 198.24 sec (152 tests)
  Intrepid2            =  74.12 sec (144 tests)
  Isorropia            =   8.32 sec (6 tests)
  ML                   =  47.23 sec (34 tests)
  MiniTensor           =   0.77 sec (2 tests)
  MueLu                = 302.65 sec (83 tests)
  NOX                  = 148.29 sec (106 tests)
  OptiPack             =   6.12 sec (5 tests)
  Panzer               = 558.62 sec (154 tests)
  Phalanx              =  12.43 sec (27 tests)
  Pike                 =   4.57 sec (7 tests)
  Piro                 =  26.88 sec (12 tests)
  ROL                  = 746.38 sec (151 tests)
  RTOp                 =  13.80 sec (24 tests)
  Rythmos              = 153.61 sec (83 tests)
  STK                  =  27.49 sec (12 tests)
  Sacado               =  79.04 sec (292 tests)
  Shards               =   1.34 sec (4 tests)
  Stokhos              =  97.48 sec (75 tests)
  Stratimikos          =  36.10 sec (40 tests)
  Teko                 =  73.85 sec (19 tests)
  Tempus               = 2173.81 sec (35 tests)
  Teuchos              =  78.14 sec (137 tests)
  Thyra                =  64.53 sec (81 tests)
  Tpetra               = 161.15 sec (149 tests)
  TrilinosCouplings    =  53.02 sec (24 tests)
  Triutils             =   2.53 sec (2 tests)
  Xpetra               =  43.73 sec (18 tests)
  Zoltan2              = 141.63 sec (100 tests)

  Total Test time (real) = 831.89 sec

  Total time for MPI_RELEASE_DEBUG_SHARED_PT = 102.46 min

@atoth1
Contributor

atoth1 commented Mar 1, 2018

Looks good, but I don't think I have the permissions to perform the merge.

@bartlettroscoe
Member Author

> Looks good, but I don't think I have the permissions to perform the merge.

Okay, I will click "Merge" then. Thanks @atoth1!

Labels
client: ATDM Any issue primarily impacting the ATDM project pkg: NOX