PGI answers for OM4 do not match across restart #257

Closed
nikizadehgfdl opened this issue Mar 9, 2016 · 4 comments
Comments

@nikizadehgfdl
Contributor

MOM6 tag 2016.03.02 (the issue is probably much older).
When compiled with pgi15 on c3, the experiment OM4_SIS2_low_mixing3 does not reproduce across a restart:
the restart files from a single 2-day run do not match those from a 1+1-day run. This happens whether or not a mask_table is used.

There is no such issue for the intel15 runs on c3.

I have not seen a repeatability problem (answers changing between two identical runs) with pgi15 for this experiment in 3 runs so far, so this is most probably a restart issue (75%).
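
For readers unfamiliar with the test, the 2-day versus 1+1-day comparison can be sketched on a toy model. This is only an illustration in Python, not MOM6 code: the `step` dynamics, the two-variable state, and the helper names are all hypothetical.

```python
import struct

def step(state, forcing):
    """Advance a toy two-variable model by one 'day' (hypothetical dynamics)."""
    t, s = state
    return (0.9 * t + 0.1 * s + forcing, 0.95 * s + 0.05 * t)

def write_restart(state):
    # Store the full state as raw 64-bit floats so no bits are lost.
    return struct.pack("<2d", *state)

def read_restart(blob):
    return struct.unpack("<2d", blob)

def run(state, days, first_day=0):
    for day in range(first_day, first_day + days):
        state = step(state, forcing=0.01 * day)
    return state

initial = (10.0, 35.0)

# One continuous 2-day run.
two_day = write_restart(run(initial, 2))

# A 1-day run, a restart written and read back, then 1 more day.
restart_after_one = write_restart(run(initial, 1))
one_plus_one = write_restart(run(read_restart(restart_after_one), 1, first_day=1))

# The test passes only if the two final restart files are bitwise identical.
reproduces = two_day == one_plus_one
print("reproduces across restart:", reproduces)
```

The comparison is bitwise on purpose: a model that leaves any part of its state out of the restart, or recomputes it differently on start-up, fails this check even if the differences are tiny.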

@nikizadehgfdl
Contributor Author

When I set

#override DEBUG = True

the model aborted with:

FATAL from PE 20: NaN detected: after ePBL Kd_ePBL

stdout:
/lustre/f1/Niki.Zadeh/ulm_201510_mom6_2016.03.02_0/OM4_SIS2_low_mixing3.2016.03.02/ncrc3.pgi15-prod/stdout/run/OM4_SIS2_low_mixing3.2016.03.02_1x0m2d_1152x1o2.o69628
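
The kind of check that produces a message like the one above can be sketched as follows. This is a minimal Python illustration and not the MOM6 implementation: `check_for_nan`, the sample values, and the index reporting are assumptions made for the example.

```python
import math

def check_for_nan(field, tag):
    """Raise if any value in a 2-D list is NaN, naming the field in the message."""
    for j, row in enumerate(field):
        for i, value in enumerate(row):
            if math.isnan(value):
                raise FloatingPointError(f"NaN detected: {tag} at (i={i}, j={j})")

# Hypothetical diffusivity field with one bad point.
kd = [[1.0e-5, 2.0e-5], [float("nan"), 3.0e-5]]

try:
    check_for_nan(kd, "after ePBL Kd_ePBL")
    message = None
except FloatingPointError as err:
    message = str(err)

print(message)
```

Running such a check after each major routine narrows a failure down to the first place a bad value appears, which is what the tag in the error message is for.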

@nikizadehgfdl
Contributor Author

The restart issue also exists on c2 with pgi14.

@nikizadehgfdl
Contributor Author

The problem also exists in the much smaller (32 PEs) test case MOM6_SIS2_bergs_cgrid:
PGI does not reproduce across a restart, whereas Intel does.

Also, I think the issue exists only in prod mode (-O3) and goes away in repro mode (-O2).
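
One general reason answers can depend on optimization level is that floating-point addition is not associative, so a compiler that regroups a sum can change the last bits of the result. The Python sketch below only illustrates that arithmetic fact. It is not a diagnosis of this issue, and whether PGI regroups anything at -O3 here is an assumption.

```python
# The same three numbers summed with two different groupings.
left_to_right = (0.1 + 0.2) + 0.3
regrouped = 0.1 + (0.2 + 0.3)

# The two results differ in the last bit, so a bitwise comparison fails.
same_bits = left_to_right == regrouped
print(left_to_right, regrouped, same_bits)
```

A difference this small is invisible in most diagnostics but is enough to make restart files fail a bitwise comparison.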

adcroft added a commit that referenced this issue Mar 17, 2016
- Kd_ePBL has intent(out) from the PBL routine but was being
  initialized by the calling routine and conditionally on whether
  a diagnostic is active:
  - Kd_ePBL is a mandatory output, no longer conditional on the
    diagnostic;
  - initializing values by the caller is not guaranteed to work when the
    callee has intent(out);
  - ePBL was only setting values for non-land points.
- The conditional initialization has been removed entirely.
- ePBL now sets values on all computational points.
- Addresses issue discussed in #257 where DEBUG=True was detecting
  NaNs in Kd_ePBL.
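
The pattern described in the commit message can be mimicked in Python, with the caveat that Python has no `intent(out)`. In this sketch a NaN-filled list stands in for the undefined contents of an output array on entry. The function names, the mask, and the value written on land points are hypothetical.

```python
import math

# Stand-in for whatever an intent(out) array may hold when the routine starts.
UNDEFINED = float("nan")

def pbl_old(ocean_mask):
    """Sets the output on ocean points only; land points stay undefined."""
    kd = [UNDEFINED] * len(ocean_mask)
    for i, is_ocean in enumerate(ocean_mask):
        if is_ocean:
            kd[i] = 1.0e-4
    return kd

def pbl_fixed(ocean_mask):
    """Sets the output on every computational point (0.0 on land is an assumption)."""
    kd = [UNDEFINED] * len(ocean_mask)
    for i, is_ocean in enumerate(ocean_mask):
        kd[i] = 1.0e-4 if is_ocean else 0.0
    return kd

mask = [True, False, True, True]
old_has_nan = any(math.isnan(v) for v in pbl_old(mask))
fixed_has_nan = any(math.isnan(v) for v in pbl_fixed(mask))
print(old_has_nan, fixed_has_nan)
```

The point of the fix is that the routine producing the output takes full responsibility for it, so the result no longer depends on what the caller did beforehand.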
@adcroft
Collaborator

adcroft commented Mar 17, 2016

The DEBUG=True issue has been fixed in b4f8cdb (NOAA-GFDL@b4f8cdb).

-A.

Dr Alistair Adcroft ([email protected])
Princeton University Tel: (609) 987-5073
NOAA/GFDL, 201 Forrestal Road, Princeton, NJ 08540

gustavo-marques pushed a commit to gustavo-marques/MOM6 that referenced this issue Sep 13, 2023
* Move mct_cap/ to STALE_mct_cap/. mct cap is no longer supported and will soon be removed for good.

* remove mct from CI testing

* Remove mct test from github workflows