Re-implement updated auto-rts for NCAR fork #101
Automated RT Failure Notification
@grantfirl I thought this PR was ready for review, but after incorporating the latest changes I'm seeing a lot of failures, even after switching to the latest baselines on both Cheyenne and Hera. Are these all potentially related to the HDF5 problems you mentioned in #104? Or should that only have been a problem on Cheyenne?
Cheyenne should be the only one with the problem. The latest baselines aren't even there due to the problem. Hera should be fine. If you run the RTs manually with the latest NCAR main branches on Hera, do you still see problems?
on-behalf-of NCAR @mkavulich
@grantfirl Turns out it was a false alarm: the Hera tests all passed! I am opening this PR for review. Let me know if you have any comments.
DISKNM=$dprefix/
STMP=$dprefix/stmp4
PTMP=$dprefix/stmp2
RTPWD=${RTPWD:-/scratch1/BMC/gmtb/CCPP_regression_testing/NCAR_ufs-weather-model/baselines/main-${BL_DATE}/}
Who has permissions in here? I'd like to be able to clear this out when necessary.
Same comment for all directories that we'll be writing to. Everyone tasked with running RTs should have write permissions in the directories in order to clear space.
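A quick way to act on this is to check each directory for write access before a run. The following is only a sketch under stated assumptions: the helper name `unwritable_dirs` is hypothetical and not part of this PR, and the paths in the demo are temporary stand-ins for the real stmp and baseline directories.

```python
import os
import tempfile


def unwritable_dirs(paths):
    """Return the paths that are missing or not writable by the current user.

    Hypothetical helper for illustration; not part of rt.sh or rt_auto.py.
    """
    blocked = []
    for path in paths:
        # Removing entries from a directory needs write and search (execute)
        # permission on the directory itself.
        if not (os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK)):
            blocked.append(path)
    return blocked


if __name__ == "__main__":
    # Stand-in directories; replace with the paths set in the machine file.
    with tempfile.TemporaryDirectory() as scratch:
        missing = os.path.join(scratch, "does_not_exist")
        print(unwritable_dirs([scratch, missing]))
```

Note that `os.access` only reports permissions for the user running the check, so each person tasked with running RTs would need to run it under their own account.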
This looks OK to me. I guess we can merge this and start using it to work out any bugs and then try a PR to the ufs-community fork? Also, please write some internal documentation for us (and anyone who comes after us) to use this (e.g. how to start the auto RT scripts on the machine, how to stop them, default paths). Also, please make sure that at least you and I have permissions everywhere the NCAR machine files point. One of the continual problems has been making sure that we have enough account storage space and RTs can take a lot!
@mkavulich Do you think that we should get some other approvals before merging?
This PR implements updated automated regression testing for the NCAR fork of ufs-weather-model. It is still a work in progress, but when ready should fully re-implement the regression testing for the NCAR fork on Hera and Cheyenne for Intel and GNU compilers, with some additional features that add flexibility in how and where tests are run on each machine (i.e. replacing hard-coded paths with command-line arguments).
Note: some of these improvements come courtesy of @dustinswales's initial efforts to adapt this system to the NCAR fork.
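The idea of replacing hard-coded paths with command-line arguments can be sketched as below. This is an illustration only, not the code in this PR: the function names and the `--machine-file` flag are hypothetical, and the only details taken from the PR are the `tests/machine/` directory and the rule that the sourced file defaults to the machine name unless a command-line argument overrides it.

```python
import argparse
import os


def machine_file_path(machine, override=None, machine_dir="tests/machine"):
    """Return the machine file to use for a run.

    The file name defaults to the machine name; `override` replaces it.
    """
    return os.path.join(machine_dir, override or machine)


def parse_args(argv):
    """Parse a machine name and an optional machine-file override."""
    parser = argparse.ArgumentParser(description="Machine-file selection sketch")
    parser.add_argument("machine", help="machine name, e.g. hera or cheyenne")
    # Hypothetical flag name; the actual option in rt.sh may differ.
    parser.add_argument("--machine-file", default=None,
                        help="file name to use instead of the machine name")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args(["hera"])
    print(machine_file_path(args.machine, args.machine_file))
```

Keeping the default tied to the machine name means existing invocations keep working, while a tester can point at an alternate file without editing the scripts.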
Change details
In rt.sh
A machine-specific file is sourced from tests/machine/. The default name of the sourced file is the same as the machine name, but can be overwritten by a command-line argument.
In rt_auto.py and other python logic for auto-tests
New yaml config file (README.rt_auto.yaml)
Example rt_auto.yaml: