Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Panzer: CurlLaplacianExample-ConvTest randomly failing #271

Closed
rppawlo opened this issue Apr 6, 2016 · 11 comments
Closed

Panzer: CurlLaplacianExample-ConvTest randomly failing #271

rppawlo opened this issue Apr 6, 2016 · 11 comments
Assignees

Comments

@rppawlo
Copy link
Contributor

rppawlo commented Apr 6, 2016

On my workstation, this test occasionally fails. Probably a race condition on the output check.

@rppawlo rppawlo self-assigned this Apr 6, 2016
@rppawlo
Copy link
Contributor Author

rppawlo commented Apr 6, 2016

adding @eric-c-cyr

@rppawlo
Copy link
Contributor Author

rppawlo commented May 12, 2016

Looks like it's a race condition in the parsing of output. Below is the test output:

opening "MPE-ConvTest-05"
Test Failed:
invalid literal for float(): 0.00970975ALL PASSED: EpetraALL PASSED: Epetra


@rppawlo rppawlo added the stage: ready The issue is ready to be worked in a Kanban-like process label Nov 11, 2016
@rppawlo
Copy link
Contributor Author

rppawlo commented Nov 11, 2016

We can't rely on screen output for checking test convergence when multiple MPI processes are writing at the same time. We need to change this to write to a file and have the convergence tester read that file. Since Panzer will be changed from ST to PT in Trilinos soon this needs to be fixed soon.

@eric-c-cyr @jmgate

@jmgate
Copy link
Contributor

jmgate commented Nov 11, 2016

Oh good, I was seeing this test randomly fail too but could never reproduce the error when I tried so I thought I was just losing my mind :)

Forgive my ignorance, but what are ST and PT?

@trilinos/panzer

@mhoemmen
Copy link
Contributor

@rppawlo wrote:

We can't rely on screen output for checking test convergence when multiple MPI processes are writing at the same time. We need to change this to write to a file and have the convergence tester read that file.

You may find Tpetra::Details::gathervPrint handy. It lives in tpetra/core/src/Tpetra_Details_gathervPrint.hpp. It takes a string on each MPI process in a communicator, and outputs the result of each in turn (deterministically) to an output stream only valid on Process 0 in the communicator. It's also memory scalable :-D

@mhoemmen
Copy link
Contributor

mhoemmen commented Nov 11, 2016

@jmgate wrote:

Forgive my ignorance, but what are ST and PT?

PT ("primary tested") is a label for packages that Trilinos builds and tests by default. In the check-in test script, only PT packages get enabled, even if you changed something in a non-PT package. Historically, they may not depend on any TPLs other than MPI (optionally), the BLAS, and LAPACK.

ST ("secondary tested") is a label for packages that claim to require TPLs other than MPI, the BLAS, or LAPACK, but which Trilinos nevertheless wants to test regularly. If you want to test ST packages in the check-in test script, you must use --st-extra-builds or --extra-builds to identify those packages explicitly.

I wrote "claim to" above, because the ST label does not actually check the packages' TPL dependencies. Packages get to decide whether they want to stay ST or even EX ("experimental"). However, a PT package may not depend on TPLs other than those three.

@jmgate
Copy link
Contributor

jmgate commented Nov 11, 2016

Thanks @mhoemmen. @rppawlo, when is the planned transition from ST to PT for Panzer?

@rppawlo
Copy link
Contributor Author

rppawlo commented Nov 11, 2016

not sure of the exact timeline. it is being tracked in #410 @bartlettroscoe has commented on it the past few days so it is being actively worked. @bartlettroscoe - do you know when you will be changing the PT code?

@rppawlo
Copy link
Contributor Author

rppawlo commented Nov 11, 2016

Also - in #410, PT is being expanded to include more TPLs than just MPI, BLAS and LAPACK.

@bartlettroscoe
Copy link
Member

I just need to test on a few platforms and then it is ready to push. I will ask for review before first. I will send out an email.

@rppawlo rppawlo removed the stage: ready The issue is ready to be worked in a Kanban-like process label Nov 19, 2016
rppawlo added a commit that referenced this issue Feb 23, 2017
@rppawlo rppawlo added the stage: in review Primary work is completed and now is just waiting for human review and/or test feedback label Feb 23, 2017
@rppawlo
Copy link
Contributor Author

rppawlo commented Feb 23, 2017

Should be fixed now. Moving to "In Review". Will watch the test over the next few days to see if issues clear up.

rppawlo added a commit that referenced this issue Feb 23, 2017
@rppawlo rppawlo closed this as completed May 3, 2017
@rppawlo rppawlo removed the stage: in review Primary work is completed and now is just waiting for human review and/or test feedback label May 3, 2017
bartlettroscoe added a commit that referenced this issue Jan 19, 2019
Origin repo remote tracking branch: 'github/master'
Origin repo remote repo URL: 'github = [email protected]:TriBITSPub/TriBITS.git'

At commit:

commit de606e5c5adfdff290df02c2e9137539300e9999
Author:  Roscoe A. Bartlett <[email protected]>
Date:    Sat Jan 19 12:10:37 2019 -0700
Summary: Change 'WARNING:' to 'NOTE:' for non-CMake warnings (#84, #271)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants