Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dev gfdl main candidate 2021 10 04 #1507

Merged
merged 273 commits into from
Oct 20, 2021
Merged

Conversation

marshallward
Copy link
Collaborator

@marshallward marshallward commented Oct 4, 2021

This PR includes the following major changes:

  • Introduction of a generalized vertical reference depth, G%Z_ref.
  • Many bug fixes for sponges, OBCs, FMS2 and other issues. See the list below for details.
  • Improvements to several existing components, including ODA (via SPEAR), ice shelves, and internal waves.
  • CPT contributions, including new diagnostics and changes to MEKE damping
  • Several updates to the documnetation.
  • Profiling been added to the CI and test suite.
  • Restart registration can now be locked after initialization to avoid unexpected changes.
  • Rotational testing can now include input fields from read_data.

Review


Features

Bugfixes

Refactor

Support

Contributors:

gustavo-marques and others added 30 commits May 27, 2021 13:46
This is required for the u,v sponges to be invariant of tiling.
I don't know why, but the problem only showed up for me in a narrow
channel in the Bering domain.
The MOM_control_struct is declared and passed as a pointer to a CS
(usually allocated anonymously) and treated as if it were a pointer,
even though there is currently no real advantage to doing so.

After the FMS update, the deallocation of this CS was causing a
segmentation fault in the PGI compilers.  While the underlying cause was
never determined, it is likely due to some automated deallocation of the
CS contents, whose addressing became scrambled.

This problem can be resolved by moving all of the CS contents to stack,
so that the contents are automatically removed upon exiting whatever
function it was instantiated.  Subsequent calls can reference the local
(or parent) stack contents.
The MOM_control_struct is declared and passed as a pointer to a CS
(usually allocated anonymously) and treated as if it were a pointer,
even though there is currently no real advantage to doing so.

After the FMS update, the deallocation of this CS was causing a
segmentation fault in the PGI compilers.  While the underlying cause was
never determined, it is likely due to some automated deallocation of the
CS contents, whose addressing became scrambled.

This problem can be resolved by moving all of the CS contents to stack,
so that the contents are automatically removed upon exiting whatever
function it was instantiated.  Subsequent calls can reference the local
(or parent) stack contents.
This patch introduces two new testing targets to the verification suite
based on a small configuration based on the `benchmark` regression test.

The profile test is saved a `p0` in `.testing`.  Future tests can be
included if appropriate.

The new targets:

* `make profile`: Run the model and record the FMS timings.

* `make perf`: Run the model through the `perf` tool and record timings
  for the resolvable functions (as symbols).

In both cases, the timings are converted to JSON output files and the
top results are reported to stdout, and readable in GitHub actions
output.  It can also be run locally.

Support Python scripts have been included to do this work.  This will
require a functional Python environment.

Some system and configuration data is logged alongside the timings, but
this is still rather incomplete and needs some further planning.

Times are compared to the target build (usually dev/gfdl).  ANSI
terminal highlighting (i.e. color) is to used to highlight excessive
differences.

Current issues:

- Model configuration

- GitHub timings are still rather unreliable, and should currently only
  be treated as crude estimates.  This should be considered a work in
  progress.

- The GitHub profiling rule still builds the standard configuration,
  evem though it is unused.

- Additional tools are required to push the timings to some database,
  either a local sqlite3 or an external one.
- Without this change, the edges don't reproduce on restart
  due to the h values outside being nonsense.
  Added new diagnostics of the acceleration driven by the wind stress
accelerations as redistributed by a timestep of the vertical viscosity and not
lost to bottom drag within a timestep.  This is also in the diagnostics of the
accelerations due to the vertical viscosity, but the redistribution can be found
from the difference of the two.  Also added a diagnostic of the contribution of
the wind stresses to kinetic energy, and applied an underflow limiter on both
the new acceleration diagnostic and the existing viscous acceleration
diagnostic.  All solutions are bitwise identical, but there are new diagnostics.
  Add checks for inconsistent parameter settings in adiabatic_driver_init() when
ADIABATIC = True, and issue instructive error messages if any are found.  This
PR addresses MOM6 issue #1417.  All answers are bitwise identical, although some
cases where the inconsistent parameter settings were previously ignored may now
issue fatal errors and will not run.
  Added code to lock the restart registry once all registration should have
occurred or if the restart has been read, along with a new public interface,
restart_registry_lock, to allow this lock to be set or unset.  All calls to
register restart fields now check the state of this lock and issue a fatal error
if the registry is locked.  This PR addresses MOM6 issue #1214.

  In the process of adding this restart lock, the new error messages revealed
that some of the restart registration calls related to some types of open
boundary conditions were not happening early enough.  To avoid this, a new
interface, register_DOME_OBC, was added to the DOME_initialization module and is
being called from call_OBC_register, and a number of the OBC-related calls
during the initialization were collected in the same (appropriate) place.  Some
OBC error messages were also corrected.  All answers are bitwise identical, but
there are two new public interfaces and the order of some OBC-related entries in
the MOM_parameter_doc calls changed.
- This addresses the FMS issue $761
NOAA-GFDL/FMS#761

- There is a mpp_broadcast in the FMS2 subroutine
get_unlimited_dimension_name() and this subroutine has to be called by
all pes, so it cannot be inside a if(is_root_pe()) block
  Added a test to avoid attempting to deallocate the geothermal heating field if
it is not allocated, and changed the geo_heat element of geothermal_CS from a
pointer into an allocatable array.  Also clarified the comments describing
several of the elements of geothermal_CS, and added a test to avoid logging the
value of GEOTHERMAL_DRHO_DT_INPLACE when the model is not in layered-mode and
this parameter is unused.  This PR addresses MOM6 issue #1449.  All answers are
bitwise identical in all cases that worked before, but there are fewer entries
in some ALE-mode MOM_parameter_doc files.
@alperaltuntas
Copy link
Collaborator

Approved. All CESM tests pass with expected answer changes due to MEKE updates (6979c29)

@jiandewang
Copy link
Collaborator

works fine in UFS except a new variable "First_direction" is being added in restart file which made cmp comparison between UFS current baseline results and restart files from this updated code failed. Except this new variable, all other common variables remain the same.

@marshallward
Copy link
Collaborator Author

@jiandewang Is the cmp difference an issue for your validation? Or can we consider this approval? I am assuming that you can still run your old experiments with the new restart variable.

@jiandewang
Copy link
Collaborator

@marshallward no this won't be an issue for UFS, all we need to do is re-generating UFS baseline.

@@ -260,7 +267,8 @@ subroutine call_tracer_register(HI, GV, US, param_file, CS, tr_Reg, restart_CS)
if (CS%use_dyed_obc_tracer) CS%use_dyed_obc_tracer = &
register_dyed_obc_tracer(HI, GV, param_file, CS%dyed_obc_tracer_CSp, &
tr_Reg, restart_CS)

if (CS%use_nw2_tracers) CS%use_ideal_age = &
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this line correct? I suspect the delta should be

+   if (CS%use_nw2_tracers) CS%use_nw2_tracers =

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fortunately, all this shoudl do is turn on ideal_age in addition to NW2, without breaking the NW2 tracers. Good catch - thanks. Currently testing a patch...

@abozec
Copy link
Collaborator

abozec commented Oct 8, 2021

COAPS approves this PR

@@ -88,6 +90,7 @@ module MOM_tracer_flow_control
logical :: use_pseudo_salt_tracer = .false. !< If true, use the psuedo_salt tracer package
logical :: use_boundary_impulse_tracer = .false. !< If true, use the boundary impulse tracer package
logical :: use_dyed_obc_tracer = .false. !< If true, use the dyed OBC tracer package
logical :: use_nw2_tracers = .false. !< If true, use the ideal age tracer package
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment on this line should probably be referring to the nw2 (or NeverWorld2) tracer package, not ideal age.

Copy link
Collaborator

@adcroft adcroft Oct 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! This and the comment above both are due to copying the ideal tracer module as a starting point. Good catch - we'll clean it up.

@marshallward
Copy link
Collaborator Author

A documentation update (#1513) was provided by @klindsay28 ; this should not require extra review.

A reproducibility issue was discovered by @adcroft related to topography (#1428). It may be related to the unresolved issue which arose after that PR was merged. Alistair is investigating this, and we will resolve this issue before this PR is accepted to main.

adcroft and others added 4 commits October 11, 2021 21:57
PRs #1428 and #1457 extended the topography clipping to allow flooding
but missed the use case for positive-only depths where the MASKING_DEPTH
parameter alone was in use. There were two bugs:

1. The new code assumed that MINIMUM_DEPTH would be deeper than
   MASKING_DEPTH (which is intuitive). However, the point of MASKING_DEPTH
   was only to specify the determination of the land mask. The new code
   assigned depths the value of MASKING_DEPTH which broke cases that were
   using MASKING_DEPTH as documented and were leaving MINIMUM_DEPTH=0.

2. The values of variable masking_depth were altered and subsequently
   not consistent with the logged parameters. A warning was issued but
   the behavior was nevertheless not as intended.

Changes:
1. Removed the test that masking_depth > min_depth, and warning
2. Adjusted the condition and assigned value when clipping depths.
   This now uses the shallower of min_depth and masking_depth to decide
   when to clip and for the value to use otherwise.
   The expression for the land mask is unaltered.
3. Corrected documentation to retain original purpose of MASKING_DEPTH
4. Added some comments for declaration with units
5. Added some clarifying comments in code

Todo:
- resolve the need for the alternative negative depth pathway associated
  with the 0.5*min_depth expression.
- @klindsay28 spotted two issues for the NW2 tracers
  1. Use of the wrong logical variable
  2. Incorrect comment
- Using the wrong logical meant that the ideal age pacakge was being
  called in addition to the NW2 package, but did not affect the NW2
  tracers themselves.
- Following feedback from @herrwang0, we have removed the possibility
  for a user to try using negative depths without the MASKING_DEPTH
  parameter being set appropriately. This avoids the asymmetric use
  of MINIMUM_DEPTH that was proposed. A FATAL is now issued.
- Corrected a spelling error in a comment.
- Removed an unused "use" that should have been done in previous commit.
Recover topography clipping when not specifying MINIMUM_DEPTH
Copy link
Collaborator

@sanAkel sanAkel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All seems okay with this PR; tested including latest commits.

Correct logical associated with NW2 tracers
@marshallward
Copy link
Collaborator Author

There has been one major update to the PR related to topography initialization, as well as two minor updates, so I think it would be best if everyone review the changes again.

@jiandewang
Copy link
Collaborator

I re-ran UFS with Marshall's latest commit NOAA-GFDL@ecf88c6, no change compared with the original PR code

@kshedstrom
Copy link
Collaborator

I approve this PR.

@sanAkel
Copy link
Collaborator

sanAkel commented Oct 14, 2021

Same here, all okay!

@abozec
Copy link
Collaborator

abozec commented Oct 15, 2021

Still good for COAPS.

@alperaltuntas
Copy link
Collaborator

approved again. all CESM tests passing with answer changes.

@marshallward
Copy link
Collaborator Author

All approved, thanks to everyone for their work.

@marshallward marshallward merged commit 8d80d64 into main Oct 20, 2021
gustavo-marques added a commit to NCAR/MOM6 that referenced this pull request Oct 25, 2021
@adcroft adcroft deleted the dev-gfdl-main-candidate-2021-10-04 branch November 16, 2021 18:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.