
Update modulefile for hera.gnu #147

Closed

climbfuji
Collaborator

The GNU software stack on Hera doesn't work with the ufs-weather-model. When compiled with the openmpi/3.1.4 module the model hangs indefinitely after initialization.

This PR updates the hera.gnu modulefile to use an mpich-3.3.2 build (and subsequent netCDF, ESMF and NCEPLIBS builds) that I created under the shared BMC/gmtb space.

The model runs are relatively slow, presumably because I built mpich-3.3.2 without any tailoring to the Hera Infiniband fabric, but at least the model runs. A C96 regression test finishes in about 10 minutes.

Note that I had to test the updated modulefile with an earlier version of the ufs-weather-model code, because the current code does not compile with GNU (FV3 dycore updates).
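The two behaviours described above (a run that hangs indefinitely after initialization versus one that is slow but finishes) can be told apart automatically with a wall-clock limit. The following is a minimal sketch, not part of this PR: the helper name `run_with_timeout` and the stand-in commands are hypothetical, and a real check would pass the actual job launch command and a limit comfortably above the expected run time.

```python
import subprocess
import sys


def run_with_timeout(cmd, limit_seconds):
    """Run cmd and classify the outcome as 'ok', 'failed', or 'hung'.

    'hung' means the command was still running when the wall-clock
    limit expired; it does not say why the command stalled.
    """
    try:
        result = subprocess.run(cmd, timeout=limit_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before re-raising.
        return "hung"
    return "ok" if result.returncode == 0 else "failed"


if __name__ == "__main__":
    # Stand-in commands (assumptions for illustration only).
    quick = [sys.executable, "-c", "print('done')"]
    stuck = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_timeout(quick, 10))
    print(run_with_timeout(stuck, 1))
```

Note that the timeout only terminates the process started directly; when the command is a parallel job launcher, whether all of its ranks are cleaned up depends on the launcher and scheduler.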

@MinsukJi-NOAA
Contributor

I wonder if mvapich 2/2.3 could make it faster -- it says it is compatible with mpich 3.2 (https://mvapich.cse.ohio-state.edu/downloads/)

@climbfuji
Collaborator Author

> I wonder if mvapich 2/2.3 could make it faster -- it says it is compatible with mpich 3.2 (https://mvapich.cse.ohio-state.edu/downloads/)

In my experience, the differences are marginal. It depends mostly on how you configure the MPI library. I also do not intend to maintain that MPI library build in the long term; this is something the sysadmins should do. Compiling an MPI library with the correct settings for performance requires a little more than compiling a netCDF library. You need to know some of the details of the underlying fabric. It's already tricky to build it with the slurm integration (which I did).

Collaborator

@junwang-noaa left a comment


I assume the fix was tested with the latest model.

@climbfuji
Collaborator Author

> I assume the fix was tested with the latest model.

The change is only for a GNU modulefile that isn't used by the code when compiling on another machine or with Intel on Hera, so regression tests are not impacted at all. As described above in the description of the PR (#147 (comment)), I had to copy the updated modulefile from this commit into a previous version of the code (before the FV3 dycore commit) in order to test compiling and running the regression tests on hera.gnu, because the FV3 dycore commit breaks the GNU build.

@climbfuji
Collaborator Author

This PR was merged as part of #151

@climbfuji closed this Jun 29, 2020
XiaSun-Atmos pushed a commit to XiaSun-Atmos/ufs-weather-model that referenced this pull request Aug 8, 2020
…0/07/21) (ufs-community#147)

- remove `include ./depend` from several GNU makefiles (from @DusanJovic-NOAA)
- correct CCPP version number in several suite definition files
- GFS_typedefs.F90: allow using `iopt_snf == 4` for other microphysics schemes than GFDL MP (required for RRFS, needs NoahMP and Thompson)