-
Notifications
You must be signed in to change notification settings - Fork 259
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Eliminate computational instability in the datm_cdeps_mx025_cfsr test on gaea, orion and wcoss_dell_p3 #642
Conversation
Change-Id: Ie2103eab6d91105f9cd0e9bb069c2d95bef2b6b8
Change-Id: I7d848f4cd91bea74b04d8b43ab1f45a1ace8cda2
Change-Id: I92025fbae25785f84abf7fa1a08bfbca901bffa6
Change-Id: I85510b236d2ef25e09e6f4578ac6196b3a7e2de1
Change-Id: Ie09f7f4cf645bc6661bc49d73fec21e51edfb728
Change-Id: I137a0e798eb2273be61046d081fb379dc0aa0a7a
Change-Id: I9619f724e12e36607ea84a009134e9997d9bddf0
Change-Id: Ide8e962ffc1f1304c042bb79907789442180338f
* revise rt.conf * revise datm_cdeps_mx025_cfsr * revise datm_cdeps_mx025_gefs
@binli2337 @uturuncoglu @aerorahul I am wondering if we can have PR#611 and PR#615 in Bin's PR so that he can test the CDEPS in UFS, Also Ufuk, can you confirm the HAFS is working with PR#611 and PR#615 and the latest CDEPS code? |
@junwang-noaa |
@uturuncoglu Can you confirm if HAFS is working with PR #611 and #615? Once you confirm, we will combine the #611/#615 with this PR and plan to get it committed in the next two weeks. If we did not hear from you, we have to assume you need more time to test PR #611 and #615 within HAFS, we will not put the two PRs in the commit queue planned for the next two weeks. Please let us know. Thanks |
@junwang-noaa let me check. I am still trying to update HAFS and RTs for recent changes. I'll update you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@binli2337 @junwang-noaa I also run RT for HAFS standalone regional FV3 configuration and that also did not pass. So, it seems that the issue related with the FV3 not coupling and since the FV3 version is fixed (i am using same version with feature/hafs_couplehycom_cdeps branch under HAFS) and it was passing without any issue. I am suspecting a build time issue at this point. So, we need to investigate it further. Please let me know if you know anything that could lead to answer change such as change in compiler version etc.
@uturuncoglu When you say FV3 standalone RT did not pass here, did the build fail, the run fail to run, or just the forecast produce different results? If it is the last situation (producing different results), then that might not be a big concern here, since the fetaure/hafs_couplehycom_cdeps is somewhat (~2 months) behind the develop branch. |
@binli2337 , please replace 32BIT=Y with -D32BIT=ON in utest.bld
|
New baseline directories need to be created for the following 4 data atmosphere tests due to shorter forecast time and reduced ocean dynamics time step: New baseline directories also need to be created for the following tests on wcoss_phase3: After new baseline results are created on wcoss_phase3, the following 3 RT tests cannot reproduce the baseline results: |
@MinsukJi-NOAA It is done. Thanks! |
@uturuncoglu @junwang-noaa The following tests failed only on wcoss_phase3. The issue might be related to how netcdf library is installed on different platforms: |
@binli2337 HAFS stand-alone regional fv3 test runs but answer changes. |
@binli2337 I am not sure netcdf change could lead to answer change. I could compare the module files and try to run the case again with old modules to see what happens. |
@binli2337 @junwang-noaa i think having HAFS related RTS in UFS level needs to be priority to catch these issues quickly. Since model is enclosing too fast it is hard to test it with HAFS with this way. |
@uturuncoglu The plan is to merge the updates from this PR to the develop branch today. Please let @junwang-noaa and @DusanJovic-NOAA know if you have any questions or concerns. |
@binli2337 @junwang-noaa @DusanJovic-NOAA @BinLiu-NOAA My main concern is to have answer change in HAFS side without knowing the source of it. At this point, the decision is yours. If you still plain to merge it, that is fine but we need to create baselines in HAFS side again. @BinLiu-NOAA let me know what do you think? We were plaining to sync support/HAFS before Friday after all PRs merged. Are you fine with the possible answer change that is caused with this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few minor cosmetic changes.
I don't see a need to rerun the regression tests.
tests/tests/datm_cdeps_mx025_cfsr
Outdated
export RESTART_N=${FHMAX} | ||
export DT_DYNAM_MOM6='900' | ||
export DT_THERM_MOM6='1800' | ||
export CPL_SLOW=${DT_THERM_MOM6} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you've reset the variable CPL_SLOW
, you also need to reset the variable export coupling_interval_slow_sec=${CPL_SLOW}
in the test to have it take effect in nems.configure.datm.IN
, otherwise it the original (default=3600s) value will be used. This will impact all the mx025 tests where you changed the MOM timesteps.
I just tested this on Cheyenne in the datm_mx025_cfsr
test; the nems.configure for it shows:
runSeq::
@3600
MED med_phases_prep_ocn_avg
MED -> OCN :remapMethod=redist
OCN
@900
MED med_phases_prep_ice
MED -> ICE :remapMethod=redist
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@DeniseWorthen The line "export CPL_SLOW=${DT_THERM_MOM6}" will be removed from the following 4 tests:
datm_mx025_cfsr
datm_mx025_gefs
datm_cdeps_mx025_cfsr
datm_cdeps_mx025_gefs
In above 4 tests,
coupling_interval_slow_sec=3600 seconds
coupling_interval_fast_sec=900 seconds
DT_DYNAM_MOM6=900 seconds (ocean dynamics time step)
DT_THERM_MOM6=1800 seconds (ocean thermodynamic time step)
After the changes, the 4 tests passed RT tests on wcoss_phase3. Thanks for discovering the issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe that is the correct solution. If the slow coupling only occurs every hour, then MOM6 will take two internal thermodynamic timesteps before it gets "updated" forcing from the atm/ice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A better solution will be used in a future PR. An issue will be submitted. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, thanks. I think I mis-understood your original comment then.
add RegressionTests_wcoss_dell_p3.log update 4 datm tests
Requested code changes have been made. See commit b58d781
* Update templates for new ufs weather model * Add option for pressure tendency diagnostic * Remove unnecessary namelist parameters * Modify model_configure * change nfhout to output_fh
PR Checklist
Ths PR is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR. Please consult the ufs-weather-model wiki if you are unsure how to do this.
This PR has been tested using a branch which is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR
An Issue describing the work contained in this PR has been created either in the subcomponent(s) or in the ufs-weather-model. The Issue should be created in the repository that is most relevant to the changes in contained in the PR. The Issue and the dependent sub-component PR
are specified below.
If new or updated input data is required by this PR, it is clearly stated in the text of the PR.
Instructions: All subsequent sections of text should be filled in as appropriate.
The information provided below allows the code managers to understand the changes relevant to this PR, whether those changes are in the ufs-weather-model repository or in a subcomponent repository. Ufs-weather-model code managers will use the information provided to add any applicable labels, assign reviewers and place it in the Commit Queue. Once the PR is in the Commit Queue, it is the PR owner's responsiblity to keep the PR up-to-date with the develop branch of ufs-weather-model.
Description
Eliminate computational instability in the datm_cdeps_mx025_cfsr test on Gaea, Orion and WCOSS_dell_p3. This PR also uses the latest develop branch of CDEPS and reduces the forecast time to 12 hours for the "datm_cdeps_mx025_cfsr" and "datm_cdeps_mx025_gefs" tests, so that the two tests can be done within 30 minutes.
Issue(s) addressed
Testing
How were these changes tested? What compilers / HPCs was it tested with? Are the changes covered by regression tests? (If not, why? Do new tests need to be added?) Have regression tests and unit tests (utests) been run? On which platforms and with which compilers? (Note that unit tests can only be run on tier-1 platforms)