Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert modulefiles to .lua format on WCOSS2, acorn, hera, jet, orion, cheyenne #1376

Closed

Conversation

BrianCurtis-NOAA
Copy link
Collaborator

PR Checklist

  • This PR is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR. Please consult the ufs-weather-model wiki if you are unsure how to do this.

  • This PR has been tested using a branch which is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR

  • An Issue describing the work contained in this PR has been created either in the subcomponent(s) or in the ufs-weather-model. The Issue should be created in the repository that is most relevant to the changes in contained in the PR. The Issue and the dependent sub-component PR
    are specified below.

  • Results for one or more of the regression tests change and the reasons for the changes are understood and explained below.

  • New or updated input data is required by this PR. If checked, please work with the code managers to update input data sets on all platforms.

Description

NCO prefers lua modules an UFS got a pass for WCOSS2. This will make WCOSS2, acorn, hera, jet, orion and Cheyenne follow .lua modules.

Issue(s) addressed

Testing

  • hera.intel
  • hera.gnu
  • orion.intel
  • cheyenne.intel
  • cheyenne.gnu
  • gaea.intel
  • jet.intel
  • wcoss2.intel
  • acorn.intel
  • opnReqTest for newly added/changed feature
  • CI

Dependencies

None

@BrianCurtis-NOAA BrianCurtis-NOAA added the No Baseline Change No Baseline Change label Aug 19, 2022
@climbfuji
Copy link
Collaborator

How is this PR supposed to work on Gaea? I see you deleted the ufs_common and ufs_common_debug Tcl/Tk environment modules but didn't make any updates to the Gaea module files.

Note that spack-stack will only support native Tcl/Tk env modules on Gaea in order to remove the nasty hack to use lua modules there (which has caused headaches for many people).

@DavidHuber-NOAA
Copy link
Collaborator

I can update/test the lua formatted module file on S4.

@DavidHuber-NOAA
Copy link
Collaborator

I am getting cp No such file or directory error when running rt.sh on S4. When compile.sh is executed, it attempts to copy

cp ${PATHTR}/modulefiles/ufs_${MACHINE_ID} ${PATHTR}/tests/modules.${BUILD_NAME}

However, the name of the new module files end with .lua, so the cp fails. See lines 132-136.

@BrianCurtis-NOAA
Copy link
Collaborator Author

I can update/test the lua formatted module file on S4.

@DavidHuber-NOAA I think I got all bugs squared away. Please try again.

@BrianCurtis-NOAA
Copy link
Collaborator Author

@DavidHuber-NOAA Now that I read back your comments today. Should I convert s4 to .lua format? I think I made an incorrect assumption that s4 should stay tcl.

@climbfuji
Copy link
Collaborator

@DavidHuber-NOAA Now that I read back your comments today. Should I convert s4 to .lua format? I think I made an incorrect assumption that s4 should stay tcl.

Note for the near future (hopefully!): we now have a spack-stack install on S4, which is using .lua modules. Maybe S4 can be the guinea pig for transitioning in the next several weeks?

@DavidHuber-NOAA
Copy link
Collaborator

@BrianCurtis-NOAA I can change that when I commit the S4 .lua module files, so that's not a problem. That said, there is a similar bug in run_test.sh. When run_test.sh attempts to copy over the modulefile copied by compile.sh to the tests/ directory, it needs a .lua extension for systems supporting lua modulefiles (see line 150).

@DomHeinzeller I would be good with giving spack-stack a whirl when the UFS is ready for it on S4.

@BrianCurtis-NOAA
Copy link
Collaborator Author

@DavidHuber-NOAA I did not see the same issue with run_test.sh on the other machines i tested on, and i'm not saying it doesn't exist on s4, but please update to the latest changes and let me know if they work for you.

@DavidHuber-NOAA
Copy link
Collaborator

@BrianCurtis-NOAA I gave this another go on S4 and received the same error. I also gave it a shot on Orion where I am also seeing the same error.

On Orion, I ran rt.sh as follows: rt.sh -c -e -n control_c48. The compile job executed without problem, but the control_c48 test failed on line 150 of run_test.sh. The last few lines of run_001_control_c48.log contain the following:

+ cp /work/noaa/nesdis-rdo2/dhuber/ufs_lua/tests/fv3_001.exe fv3.exe
+ cp /work/noaa/nesdis-rdo2/dhuber/ufs_lua/tests/modules.fv3_001 modules.fv3
cp: cannot stat '/work/noaa/nesdis-rdo2/dhuber/ufs_lua/tests/modules.fv3_001': No such file or directory
+ '[' 1 -eq 0 ']'
+ write_fail_test
+ [[ false == true ]]
+ echo 'control_c48 001 failed in run_test'
+ exit 1

The name of the modulefile in the tests/ directory is "modules.fv3_001.lua" rather than "modules.fv3_001", which I believe is the source of this error.

I ran rt.sh here: /work/noaa/nesdis-rdo2/dhuber/ufs_lua/tests

@BrianCurtis-NOAA
Copy link
Collaborator Author

@DavidHuber-NOAA Thanks! I don't normally run rt.sh -c -e -n control_c48 i usually run rt.sh -e -l rt.test where i just copy rt.conf to rt.test and cut a bunch of tests out. I will use that and get a fix.

@BrianCurtis-NOAA
Copy link
Collaborator Author

@DavidHuber-NOAA Thanks! I don't normally run rt.sh -c -e -n control_c48 i usually run rt.sh -e -l rt.test where i just copy rt.conf to rt.test and cut a bunch of tests out. I will use that and get a fix.

@DavidHuber-NOAA First off, thanks for helping test on S4. I have implemented a fix that i tested on Gaea/Orion successfully. Please test again on S4 if you can.

@DavidHuber-NOAA
Copy link
Collaborator

@BrianCurtis-NOAA Yes, it's working for me now on S4. Thanks! I'll open a PR into your fork shortly.

Convert S4 modulefiles to Lua
@BrianCurtis-NOAA
Copy link
Collaborator Author

@DavidHuber-NOAA It must be Friday. I had those changes in my Orion test and never pushed them to the repo. Apologies and thanks for the work! I think we'll get this into UFS ASAP.

@jkbk2004
Copy link
Collaborator

jkbk2004 commented Sep 2, 2022

@BrianCurtis-NOAA we can take time. we can combine this PR with #1383 next Tuesday.

@BrianCurtis-NOAA
Copy link
Collaborator Author

Rahul needs complete .lua change. Meaning we'll need to edit compile.sh and run_test.sh to reflect these changes.

@jkbk2004
Copy link
Collaborator

jkbk2004 commented Sep 6, 2022

@BrianCurtis-NOAA so we need to move on to #1383 without combining, right?

jkbk2004 pushed a commit that referenced this pull request Sep 9, 2022
* acorn and common .lua changes, hera.intel (not hera.gnu) .lua changes

* CMake_Platform --> CMAKE_Platform

* jet and wcoss2 .lua changes, cheyenne.gnu/intel lua updates, orion lua updates, typo in jet.lua files, Orion fixes for .lua files

* Set proper number of non-advected tracers in regional_atmaq regression test.

* Add proper convective scavenging coefficients to the regional_ataqm regression test.

* Update ufs_s4.intel.lua, Update ufs_s4.intel_debug.lua
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
No Baseline Change No Baseline Change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update modulefiles to Lua
5 participants