[develop] Update hashes of GSI and rrfs_utils #800
Conversation
@BenjaminBlake-NOAA @christinaholtNOAA I'm willing to assist with testing to make sure that the updated hashes of the GSI and RRFS_UTILS build on the various machines, but should we move forward with updating the GSI at this point? The Intel 2022 run issues are still not addressed (PR #571 will address them), and there are issues with HDF5 versions other than 1.10.6. Should we wait until these issues have been resolved, or move forward? I'm fine with updating the hashes now, but they will likely need to be updated again before the SRW v3.0.0 release.
Updating the hash sooner rather than later will allow us to identify any other potential issues that may arise in the WE2E testing process as we add those workflow capabilities.
@MichaelLueken I agree with @christinaholtNOAA that it would be beneficial to update the GSI sooner if possible - thanks. |
I am working on building the branch on Jet. |
Attempted to build the updated GSI and rrfs_utils on Cheyenne. It ultimately failed because ncdiag/1.0.0 isn't available in the hpc-stack on the machine for either the Intel or GNU compilers. I have notified @natalie-perlin about the missing ncdiag/1.0.0 on the machine and asked her to add it at her earliest convenience. The GSI and rrfs_utils built without issue on Orion and Gaea (using the updated modulefile from PR #799).
@MichaelLueken Sounds good! Once ncdiag is added to Cheyenne, it should work. |
@BenjaminBlake-NOAA I fear I spoke too soon on Gaea. It looks like it is failing during the linking step to build the GSI executable due to undefined references to I'll also reach out to @natalie-perlin to see if this is something that she can correct as well. |
@mark-a-potts - We can't move to spack-stack until the ufs-weather-model has done so. In this case, we would need to add openblas to hpc-stack. Would this need to be for both Hera GNU and Cheyenne GNU, or just Hera GNU?
Anyone know if there are regression tests for rrfs_utils or will we just need to test with some SRW E2E tests? |
There is no testing framework for rrfs_utils. I think it's typically tested in the RRFS_dev1 retros as many of the other executable changes are. |
I think that openblas needs to be added to the modulefile for build_hera_gnu.lua. With that addition, the devbuild.sh script works with gsi on Hera. |
The rrfs_utils PR is ready for review, but I am unsure how to go about testing it. |
The RRFS Utils PR was just merged. The hash is now 5681d1c. @BenjaminBlake-NOAA would you be willing to bump to that version in this PR? |
@christinaholtNOAA Yes I can do that. I will test the build on WCOSS2 and Hera then push the change to my feature branch. |
@mark-a-potts @BenjaminBlake-NOAA On Cheyenne GNU, rrfs_utils is still failing with the following:
The same issue is encountered on Hera GNU:
|
So, it looks like those commands (actually that whole file) got added to rrfs_utils after I put in my PR, so they didn't get caught as an issue with gnu. The solution is to change them from "stop(333)" to "stop 333" in adjust_soiltq/module_bkio_fv3lam_parall.f90. I have verified that this compiles with both Intel and gnu compilers on Hera. I can put in another PR to rrfs_utils to address this issue. |
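The change described above (rewriting `stop(333)` as `stop 333`) could be applied mechanically across a source file. The following is a minimal Python sketch of that idea, not the change made in the rrfs_utl PR. The function name `fix_stop_statements` is hypothetical, and the regular expression is a plain text substitution that does not parse Fortran, so it would also rewrite matches that happen to sit inside comments or string literals.

```python
import re

# Matches a STOP statement whose integer code is wrapped in parentheses,
# e.g. "stop(333)" or "STOP ( 12 )". The keyword is captured so that its
# original letter case is preserved in the output.
_STOP_PAREN = re.compile(r"\b(stop)\s*\(\s*(\d+)\s*\)", re.IGNORECASE)


def fix_stop_statements(source: str) -> str:
    """Return `source` with 'stop(N)' rewritten as 'stop N'.

    Hypothetical helper for illustration only: it works on raw text and
    has no knowledge of Fortran comments, strings, or continuation lines.
    """
    return _STOP_PAREN.sub(lambda m: f"{m.group(1)} {m.group(2)}", source)


if __name__ == "__main__":
    sample = "      if (ierr /= 0) stop(333)\n"
    print(fix_stop_statements(sample), end="")
```

Lines that already use the `stop 333` form are left unchanged, so running the helper twice gives the same result as running it once.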
@mark-a-potts If you have verified that it builds successfully on GNU compilers, please go ahead and create the PR to update the new file. Thanks! |
The new PR is here-- NOAA-GSL/rrfs_utl#31 |
The rrfs_utils PR has been merged, so that should now compile with GNU. I believe that an openblas module will need to be added to the GNU modulefiles (on Hera, at least) in order to get the GSI to compile, however. |
@mark-a-potts @BenjaminBlake-NOAA The new rrfs_utils hash, 6cacff5, built successfully on both Hera and Cheyenne. The GSI also built without issue on both systems. @mark-a-potts Would we see issues related with
Please note that these messages are seen with and without openblas loaded during code compilation. The test can be run using:
in the tests/WE2E directory. |
@MichaelLueken - So, the Intel MKL libraries are linked when building |
@MichaelLueken, when I had built the SRW with "./devbuild.sh --platform=hera --compiler=gnu all" earlier on Hera, I had compile errors in GSI due to missing Lapack/Blas libraries. I'll test that again, though. |
@mark-a-potts I compiled without issue using:
I'll test with |
@christopherwharrop-noaa I don't know about the rrfs_utl repository, but I know that the GSI is set up to always load the MKL libraries. If the MKL libraries aren't found, then the GSI will load LAPACK in its place. It looks like the GSI build will need to change to allow OpenBLAS for GNU and MKL for Intel.
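The selection rule proposed in the comment above (MKL for Intel, OpenBLAS for GNU, with LAPACK as the fallback when MKL is not found) can be written out as a small decision function. This is only a Python sketch of the logic as described in this thread. It is not taken from the GSI build system, and the function name, argument names, and return strings are all hypothetical.

```python
def choose_linear_algebra(compiler: str, mkl_found: bool) -> str:
    """Pick a linear-algebra library for a given compiler family.

    Hypothetical illustration of the rule discussed in this thread:
      * Intel: use MKL, falling back to LAPACK if MKL is not found.
      * GNU:   use OpenBLAS.
    """
    family = compiler.strip().lower()
    if family == "intel":
        return "MKL" if mkl_found else "LAPACK"
    if family == "gnu":
        return "OpenBLAS"
    raise ValueError(f"unsupported compiler family: {compiler!r}")


if __name__ == "__main__":
    for compiler, mkl_found in [("intel", True), ("intel", False), ("gnu", False)]:
        print(compiler, mkl_found, choose_linear_algebra(compiler, mkl_found))
```

Keying the choice on the compiler family, rather than only on whether MKL happens to be present, is what would let a GNU build pick up OpenBLAS deliberately instead of reaching it through a fallback.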
@MichaelLueken @mark-a-potts @christopherwharrop-noaa -
Working on regression tests on Hera/GNU (some pass) at the moment, and on other modulefiles and test builds for the PR: |
@natalie-perlin The SRW App doesn't use the component modulefiles to build the separate components. All components are built using |
@MichaelLueken - |
@@ -27,5 +27,6 @@ load("sigio/2.3.2")
load("w3nco/2.4.1")
load("wrf_io/1.2.0")

load("ncdiag/1.0.0")
Recent updates to GSI require ncdiag/1.1.0 (added to all hpc-stacks on Hera gnu/intel, Jet, Orion, Cheyenne gnu/intel, and Gaea on June 2, 2023).
@BenjaminBlake-NOAA Following the latest modifications to this PR, the GSI and rrfs_utils now compile using both Intel and GNU compilers on Hera and Cheyenne, and the process_obs WE2E test runs using both Intel and GNU executables on Hera.
Thank you very much @mark-a-potts and @natalie-perlin for creating the PRs in the rrfs_utl repo to correct the GNU build issues and for adding OpenBLAS to the HPC-Stack!
Approving this work now.
@BenjaminBlake-NOAA The Jenkins tests failed on: Cheyenne GNU - the The Jenkins tests can't run on Gaea because the SRW App can't be built on the machine at this time. Hera Intel - the Orion - Manually ran the Jenkins tests and all tests successfully passed. Merging this PR now. |
DESCRIPTION OF CHANGES:
This PR updates the hashes of the GSI and rrfs_utils in Externals.cfg to the versions currently used by the real-time RRFS runs. The GSI version is #4afe6ed (committed on March 23) and the rrfs_utils version is #4ec1a33 (committed on April 14). Building the newer version of GSI requires that ncdiag and ncio be loaded. I have made the changes to get it working on WCOSS2 and Hera, but will need assistance on the other supported platforms.
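For illustration, a hash bump of this kind can be scripted, since Externals.cfg is an INI-style file. The Python sketch below uses the standard-library `configparser` module. The section names `GSI` and `rrfs_utl`, the placeholder starting hashes, and the helper name `update_hashes` are assumptions made for the example, and the snippet is not a copy of the repository's actual Externals.cfg. Only the two target hashes come from the description above.

```python
import configparser

# Stand-in for part of an Externals.cfg file. Section names and the
# "0000000" placeholder hashes are assumptions for this example.
EXTERNALS_SNIPPET = """
[GSI]
protocol = git
hash = 0000000
required = True

[rrfs_utl]
protocol = git
hash = 0000000
required = True
"""


def update_hashes(cfg_text: str, new_hashes: dict) -> configparser.ConfigParser:
    """Parse INI text and set the 'hash' option of each named section."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    for section, new_hash in new_hashes.items():
        if not parser.has_section(section):
            raise KeyError(f"no such external: {section}")
        parser.set(section, "hash", new_hash)
    return parser


# Hashes taken from the PR description: GSI 4afe6ed, rrfs_utils 4ec1a33.
updated = update_hashes(
    EXTERNALS_SNIPPET, {"GSI": "4afe6ed", "rrfs_utl": "4ec1a33"}
)

if __name__ == "__main__":
    for name in updated.sections():
        print(name, updated.get(name, "hash"))
```

Raising an error for an unknown section, rather than silently creating it, guards against a misspelled component name going unnoticed.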
Type of change
TESTS CONDUCTED:
DEPENDENCIES:
None.
DOCUMENTATION:
No updates needed for this PR.
ISSUE:
Fixes issue #548
CHECKLIST
LABELS (optional):
A Code Manager needs to add the following labels to this PR:
CONTRIBUTORS (optional):
@christinaholtNOAA has volunteered to test on Jet