SEACAS tests failing in ATDM builds on mutrino #2815

Closed
fryeguy52 opened this issue May 24, 2018 · 6 comments
Assignees
Labels
client: ATDM Any issue primarily impacting the ATDM project Disabled Tests Issue has been partially addressed by disabling *all* of the failing tests related to the issue PA: Data Services Issues that fall under the Trilinos Data Services Product Area pkg: seacas type: bug The primary issue is a bug in Trilinos code or tests

Comments

@fryeguy52
Contributor

fryeguy52 commented May 24, 2018

CC: @bartlettroscoe, @trilinos/seacas

Next Action Status

Switching from 'salloc' to 'sbatch' eliminated all but one of the failing tests. The remaining failing test, SEACASExodus_exodus_unit_tests, was disabled in commit d12ca4e, merged to 'develop' in PR #3011 on 6/26/2018.

Summary

The following tests are failing in the nightly Jenkins jobs of the ATDM configuration on mutrino:

SEACASAprepro_aprepro_command_line_include_test
SEACASAprepro_aprepro_command_line_vars_test
SEACASAprepro_aprepro_unit_test
SEACASAprepro_lib_aprepro_lib_array_test
SEACASAprepro_lib_aprepro_lib_unit_test
SEACASExodus_exodus_unit_tests

The list of failures can be seen here on CDash

They are failing in both of these builds:
Trilinos-atdm-mutrino-intel-opt-openmp
Trilinos-atdm-mutrino-intel-debug-openmp

It looks like they all have something like this in the error output:

TEST_2

Running: "diff" "-w" "/lscratch1/jenkins/mutrino-slave/workspace/Trilinos-atdm-mutrino-intel-opt-openmp/SRC_AND_BUILD/Trilinos/packages/seacas/libraries/aprepro_lib/test_standard.out" "/lscratch1/jenkins/mutrino-slave/workspace/Trilinos-atdm-mutrino-intel-opt-openmp/SRC_AND_BUILD/BUILD/packages/seacas/libraries/aprepro_lib/test.output"

--------------------------------------------------------------------------------

diff: /lscratch1/jenkins/mutrino-slave/workspace/Trilinos-atdm-mutrino-intel-opt-openmp/SRC_AND_BUILD/BUILD/packages/seacas/libraries/aprepro_lib/test.output: No such file or directory

--------------------------------------------------------------------------------

TEST_2: Return code = 2
TEST_2: Pass criteria = Zero return code [FAILED]
TEST_2: Result = FAILED

Steps to Reproduce

On mutrino, clone Trilinos and run the following commands. They set up the environment automatically.

$ cd <some_build_dir>/

$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh intel-opt-openmp

$ cmake \
  -DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
  -DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_SEACAS=ON \
  $TRILINOS_DIR

$ make -j16

$ salloc -N 1 -p standard -J $JOB_NAME ctest -j16
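
The status note at the top of this issue reports that switching from 'salloc' to 'sbatch' eliminated most of these failures. A minimal sketch of a batch-mode equivalent of the last command is shown below. The helper name `ctest_sbatch_args`, the default job name, the `SUBMIT` switch, and the choice of `--wait`, `--output`, and `--wrap` are assumptions for illustration; the flags actually used by the ATDM driver scripts are not shown in this issue.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of Trilinos): fills the global array ARGS
# with sbatch arguments mirroring the 'salloc' line above.
ctest_sbatch_args() {
  local job_name="$1" jobs="$2"
  ARGS=(
    -N 1 -p standard -J "$job_name"
    --wait                   # block until the batch job finishes
    --output="ctest-%j.out"  # %j expands to the Slurm job id
    --wrap "ctest -j${jobs}" # wrap the command in a generated batch script
  )
}

ctest_sbatch_args "${JOB_NAME:-seacas-tests}" 16

# Assumption: only submit when SUBMIT=1; otherwise print the command for review.
if [[ "${SUBMIT:-0}" == "1" ]]; then
  sbatch "${ARGS[@]}"
else
  printf '%s\n' "sbatch ${ARGS[*]}"
fi
```

Run it from the build directory with `SUBMIT=1` to submit; without it, the script only prints the command it would run.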

Environment

This is set up automatically by the steps to reproduce above.

@fryeguy52 fryeguy52 added type: bug The primary issue is a bug in Trilinos code or tests pkg: seacas client: ATDM Any issue primarily impacting the ATDM project labels May 24, 2018
@mhoemmen
Contributor

@micahahoward

bartlettroscoe added a commit that referenced this issue Jun 15, 2018
Should help some with #2815 and perhaps other issues on 'mutrino'.  Also disables some tests with #2474.  Also relates to TRIL-200.  (The individual commits did not contain these Issue IDs.)
@bartlettroscoe
Member

CC: @gsjaardema, @kddevin (Data Services Product Lead)

FYI: Since the update to use 'sbatch' instead of 'salloc' on 'mutrino' 4 days ago (see PR #2964 and commit ff791f8), only one SEACAS test is still failing: SEACASExodus_exodus_unit_tests, as shown in this query.
This test fails in both the Trilinos-atdm-mutrino-intel-debug-openmp and Trilinos-atdm-mutrino-intel-opt-openmp builds. In both builds, the test output (an example is shown here) starts with:

TEST_0

Running: "/bin/bash" "/lscratch1/jenkins/mutrino-slave/workspace/Trilinos-atdm-mutrino-intel-debug-openmp/SRC_AND_BUILD/BUILD/packages/seacas/libraries/exodus/test/testall"

  Writing output to file "/lscratch1/jenkins/mutrino-slave/workspace/Trilinos-atdm-mutrino-intel-debug-openmp/SRC_AND_BUILD/BUILD/packages/seacas/libraries/exodus/test/exodus_unit_tests.out"

--------------------------------------------------------------------------------

testwt - single precision write test...
Mon Jun 18 11:53:16 2018: [unset]:_pmi_alps_init:alps_get_placement_info returned with error -1
Mon Jun 18 11:53:16 2018: [unset]:_pmi_init:_pmi_alps_init returned -1
Exodus Library Warning/Error: [ex_put_name]
	ERROR: element block id 10 not found in file id 65536
...

and then ends with:

45a891,892
> Mon Jun 18 11:54:26 2018: [unset]:_pmi_alps_init:alps_get_placement_info returned with error -1
> Mon Jun 18 11:54:27 2018: [unset]:_pmi_init:_pmi_alps_init returned -1
46a894,895
> Mon Jun 18 11:54:30 2018: [unset]:_pmi_alps_init:alps_get_placement_info returned with error -1
> Mon Jun 18 11:54:30 2018: [unset]:_pmi_init:_pmi_alps_init returned -1

--------------------------------------------------------------------------------

TEST_1: Return code = 1
TEST_1: Pass criteria = Zero return code [FAILED]
TEST_1: Result = FAILED

================================================================================

OVERALL FINAL RESULT: TEST FAILED (SEACASExodus_exodus_unit_tests)

@gsjaardema, any idea what might be happening here?

Do you mind if we temporarily disable this test so that we can promote this build?

@bartlettroscoe
Member

FYI: Of the 38 builds submitting to CDash that run this test, this appears to be the only platform where it is failing, as shown across all builds yesterday here. Therefore, disabling this test in this one build should not cost much testing (and since the test is failing anyway, it is providing no value to the test suite on this platform).

@gsjaardema
Contributor

The ex_put_name Warning/Error is expected from testwt. The other errors with [unset]:_pmi... are not expected. I am fine with disabling this test on this platform/build.
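
One way to check whether the unexpected `[unset]:_pmi...` lines account for the reported differences is to filter them out of the captured output before comparing it with the reference. The sketch below is a diagnostic aid only, not the fix adopted in this thread (the test was disabled instead). The function name `strip_pmi_noise` is hypothetical, and the pattern is an assumption based on the two message forms visible in the log above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical filter: drop lines matching the two PMI/ALPS message forms
# seen in the log excerpt. '|| true' keeps the pipeline from failing when
# grep outputs nothing.
strip_pmi_noise() {
  grep -v -E '\[unset\]:_pmi_(alps_)?init:' || true
}

# Sample input copied from the log excerpt in this issue.
sample=$'testwt - single precision write test...
Mon Jun 18 11:53:16 2018: [unset]:_pmi_alps_init:alps_get_placement_info returned with error -1
Mon Jun 18 11:53:16 2018: [unset]:_pmi_init:_pmi_alps_init returned -1
Exodus Library Warning/Error: [ex_put_name]'

# Only the testwt banner and the expected ex_put_name warning remain.
printf '%s\n' "$sample" | strip_pmi_noise
```

On a real run, the filtered output could then be passed to `diff -w` against the reference file to see whether anything other than the PMI messages differs.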

@bartlettroscoe
Member

@gsjaardema said:

I am fine with disabling this test on this platform/build.

@fryeguy52, can you please put in a targeted disable for this test in the (new) files:

cmake/std/atdm/mutrino/tweaks/INTEL-RELEASE-OPENMP.cmake
cmake/std/atdm/mutrino/tweaks/INTEL-DEBUG-OPENMP.cmake

as mentioned in:

and:

and from looking at other examples?

NOTE: If that documentation is not sufficient, let's work on improving it.
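
For a local reproduction, a single test can also be switched off at configure time through the TriBITS cache variable `<fullTestName>_DISABLE`. The sketch below only assembles and prints such a configure line, reusing the options from the steps to reproduce. It assumes the full test name is exactly `SEACASExodus_exodus_unit_tests`, as reported in this thread. The contents of the tweaks files changed in commit d12ca4e are not shown here, so this is an approximation of the effect, not a copy of that change.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Full TriBITS test name as reported in this thread.
test_name="SEACASExodus_exodus_unit_tests"

# Configure options from the steps to reproduce, plus the per-test disable.
configure_args=(
  -DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake
  -DTrilinos_ENABLE_TESTS=ON
  -DTrilinos_ENABLE_SEACAS=ON
  "-D${test_name}_DISABLE=ON"
)

# Print the command for review; run it by hand from the build directory
# after sourcing load-env.sh.
echo "cmake ${configure_args[*]} \$TRILINOS_DIR"
```

After reconfiguring this way, the disabled test should no longer appear in the list printed by `ctest -N`.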

bartlettroscoe added a commit to bartlettroscoe/Trilinos that referenced this issue Jun 22, 2018
Strange errors.  Greg said to disable this test for now.
bartlettroscoe added a commit to bartlettroscoe/Trilinos that referenced this issue Jun 23, 2018
Strange errors.  Greg said to disable this test for now.
@bartlettroscoe
Member

FYI: Test SEACASExodus_exodus_unit_tests was disabled in commit d12ca4e, merged to 'develop' in PR #3011 on 6/26/2018.

As evidence of this, you can see that this test was missing in the SEACAS tests on 'mutrino' yesterday in this query.

Marking this with "Disabled Tests" label.

@bartlettroscoe bartlettroscoe added the Disabled Tests Issue has been partially addressed by disabling *all* of the failing tests related to the issue label Jul 13, 2018
@bartlettroscoe bartlettroscoe added the PA: Data Services Issues that fall under the Trilinos Data Services Product Area label Nov 30, 2018