-
Notifications
You must be signed in to change notification settings - Fork 396
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
EAMxx: simplify branch runs and adding new output streams #7063
Conversation
@@ -217,6 +217,7 @@ class AtmosphereDriver | |||
util::TimeStamp m_run_t0; | |||
util::TimeStamp m_case_t0; | |||
RunType m_run_type; | |||
bool m_branch_run = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about adding a new value to RunType, but it required a few more changes across the library. Maybe that's more appropriate though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've gone back and forth on this in my mind. I think this is probably okay as it is. I don't forsee a reason to use RunType=branch as an option outside of this application. Just to confirm, for the rest of the code RunType
will be Restart
for branch runs right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct.
|
Confirming this works. Cherry-picking these commits into a branch from December allowed a CIME style branch run to work without having to add "Perform Restart: false" to all the I/O yaml files. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one question, otherwise looks good.
@@ -217,6 +217,7 @@ class AtmosphereDriver | |||
util::TimeStamp m_run_t0; | |||
util::TimeStamp m_case_t0; | |||
RunType m_run_type; | |||
bool m_branch_run = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've gone back and forth on this in my mind. I think this is probably okay as it is. I don't forsee a reason to use RunType=branch as an option outside of this application. Just to confirm, for the rest of the code RunType
will be Restart
for branch runs right?
So in essence, you're treating it a sub-category of a Restart run. That's fine, since it makes sense conceptually, but I would unify the lingo to use the one people are used to and documented here https://github.com/E3SM-Project/E3SM/blob/2b545d3f6a458dc539ce6e4120906705387ae8a8/run_e3sm.template.sh#L36C1-L38C33 relevant snippets from the run script
# Run options
readonly MODEL_START_TYPE="initial" # 'initial', 'continue', 'branch', 'hybrid'
readonly START_DATE="0001-01-01"
# Additional options for 'branch' and 'hybrid'
readonly GET_REFCASE=TRUE
#readonly RUN_REFDIR=""
#readonly RUN_REFCASE=""
#readonly RUN_REFDATE="" # same as MODEL_START_DATE for 'branch', can be different for 'hybrid' # Run type
# Start from default of user-specified initial conditions
if [ "${MODEL_START_TYPE,,}" == "initial" ]; then
./xmlchange RUN_TYPE="startup"
./xmlchange CONTINUE_RUN="FALSE"
# Continue existing run
elif [ "${MODEL_START_TYPE,,}" == "continue" ]; then
./xmlchange CONTINUE_RUN="TRUE"
elif [ "${MODEL_START_TYPE,,}" == "branch" ] || [ "${MODEL_START_TYPE,,}" == "hybrid" ]; then
./xmlchange RUN_TYPE=${MODEL_START_TYPE,,}
./xmlchange GET_REFCASE=${GET_REFCASE}
./xmlchange RUN_REFDIR=${RUN_REFDIR}
./xmlchange RUN_REFCASE=${RUN_REFCASE}
./xmlchange RUN_REFDATE=${RUN_REFDATE}
echo 'Warning: $MODEL_START_TYPE = '${MODEL_START_TYPE}
echo '$RUN_REFDIR = '${RUN_REFDIR}
echo '$RUN_REFCASE = '${RUN_REFCASE}
echo '$RUN_REFDATE = '${START_DATE}
else
echo 'ERROR: $MODEL_START_TYPE = '${MODEL_START_TYPE}' is unrecognized. Exiting.'
exit 380
fi
graph TD;
EAMxx-->Initial;
EAMxx-->Restart;
Initial-->startup
Restart-->continue;
Restart-->branch;
Restart-->hybrid;
In other words, I would turn Also note the outermost leaves are what we get from the CIME Additionally, for hybrid and branch, do we explicitly support REFDIR, REFCASE, REFDATE? These options facilitate the "staging" of the necessary files by CIME to make sure the run can continue in a dir, etc. --- my guess is that this should be automatically supported as CIME will take care of it if the user sets these options, but we should test it to be super sure. Copying @rljacob, does the above conceptual diagram sound right to you or do you think it's misleading? from the reference page on run type, esp on diff between hybrid and branch...
https://www2.cesm.ucar.edu/models/cesm1.2/cesm/doc/modelnl/env_run.html#run_start
|
On that note, I wouldn't give people an option in IO on what to do tbh. If a user selects branch/hybrid run, they are responsible for understanding what that entails for IO. So, I would impl all of this like follows:
The user only gets to choose if they want initial, continue, branch, hybrid (maybe we can error out early on hybrid for now), and the IO decisions are bound by that choice, not more optionality ... |
That is the idea, no? After we merge this, the user picks the run type via CIME vars, and IO will decide whether to look for restarts or not based on that. Unless I'm missing something with our restart behavior? |
Explicitly and automatically avoid the restart of any output stream
2460f47
to
6a70f62
Compare
You're introducing this new runtime param: |
That's a power user param. It is for users that want to add a new stream upon restart, without wiping the other streams (i.e., without doing a branch run). It is not needed (in fact, not used) for branch runs. |
That's kind of my point... this option shouldn't exist. If someone wants to modify a simulation this way, then they should branch out. That's one of the reasons branch runs exist! |
That works with me. In general, I'm fine keeping some backdoors for power users. But I'm also fine not having to maintain backdoors that nobody uses... |
No need to adjust it now, just giving my 2c on how I would envision these things from a user perspective. THis PR should be integrated whenever you see fit. My most important comment is this #7063 (comment) (see if you disagree with it or if you can make eamxx as close as possible to how users will expect the code to behave, as they're used to stuff in CIME) |
Simplify workflow for branch runs, and, in general, runs where new output streams are added upon restart. [BFB]
Simplify workflow for branch runs, and, in general, runs where new output streams are added upon restart.
[BFB]
Right now, for branch runs, the user needs to add the entry
Perform Restart: false
entry in theRestart
sublist of their output yaml file. This allows EAMxx to skip the look for the rhist file. However, the user must later remove this entry, so that upon subsequent restarts the stream IS restarted. This is confusing, and may be overlooked.This PR automatically handles this kind of details. In particular:
Perform Restart: false
withskip_restart_if_rhist_not_found: true
. This makes it simpler, since this param does not need to be removed after the 1st "post-restart-run"