Suggested changes to cam.case_setup.py #26

Closed
69 changes: 69 additions & 0 deletions cime_config/cam.case_setup.py
@@ -0,0 +1,69 @@
#! /usr/bin/env python3

"""Copy GEOS-Chem configuration files from source to the case directory.
This script is run from CIME when calling case.setup"""

import logging
import os
import shutil
import sys

_CIMEROOT = os.environ.get("CIMEROOT")
if _CIMEROOT is None:
    raise SystemExit("ERROR: must set CIMEROOT environment variable")
# end if
_LIBDIR = os.path.join(_CIMEROOT, "CIME", "Tools")
sys.path.append(_LIBDIR)
sys.path.insert(0, _CIMEROOT)

#pylint: disable=wrong-import-position
from CIME.case import Case

Is this import, together with the CIMEROOT setup above, all there just to get the case object for the logger and the CAM options?

Author

Yes. Most of CIME works off of a case object and this is the standard way to bootstrap one of those.


logger = logging.getLogger(__name__)

if len(sys.argv) != 3:
    raise SystemExit(f"Incorrect call to {sys.argv[0]}, need CAM root and case root")
# end if
cam_root = sys.argv[1]
case_root = sys.argv[2]

with Case(case_root) as case:
    cam_config = case.get_value('CAM_CONFIG_OPTS')
    # Gather case information (from _build_usernl_files in case_setup.py)
    comp_interface = case.get_value("COMP_INTERFACE")

    if comp_interface == "nuopc":
        ninst = case.get_value("NINST")
    else:  # ninst is not defined yet at this point, so it cannot be tested here
        ninst = case.get_value("NINST_CAM")

Could you explain what ninst is and how it is different for nuopc versus non-nuopc?

Author

CESM has the capability to run an ensemble of instances as a single run. This is most often used as part of a data assimilation (DA) cycle. The ensemble runs, say for 6 hours, then writes restart files and stops while DART updates the state. Next, the model restarts with the updated state. This entire process can run several times as a single job submission (longer runs are handled with automatic job resubmission).

When the multiple instance implementation was first created in MCT, one idea was to be able to have some components run a single instance while others ran several (for instance, a single ocean model responding to an ensemble of atmosphere simulations). However, no one ever figured out how to define this scientifically (coupling is hard) so the idea was officially dropped many years ago. The move to NUOPC just formalizes that situation.

The impact on GEOS-Chem is deciding whether it will support different configurations for different instances of CAM running as a single job. CAM will have atm_in_0001, atm_in_0002, etc., but that has an impact inside CAM, which has to check its status as an instance and look for the correct namelist filename to open. I have no idea whether this is a priority for the community that will be using GEOS-Chem, which is why they should decide whether the (not insignificant) extra effort is worth it. If they decide they do not need full multi-instance support, I suggest that it either be prevented (perhaps in this script) or that it be specified that all instances will use the same set of GEOS-Chem configuration files.
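
If the choice is to prevent multi-instance runs in this script, a guard along the following lines might be enough. This is only a sketch: `require_single_instance` is a hypothetical helper name (it is not part of CIME or CAM), and `ninst` is assumed to be the instance count already read from the case.

```python
def require_single_instance(ninst):
    """Stop with an error if the case requests more than one instance.

    Hypothetical helper, not part of CIME or CAM. ninst is assumed to be
    the instance count already read from the case object.
    """
    if ninst > 1:
        raise SystemExit(
            f"ERROR: GEOS-Chem does not support multi-instance runs (ninst = {ninst})"
        )
    # end if
```

Calling it right after the `with Case(...)` block would stop case.setup before any files are copied.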

@lizziel, Oct 20, 2023

Ah, okay. We decided not to enable multi-instance support, at least for now. However, if we can enable it with the constraint that all GEOS-Chem config files are identical then that would be a first step. There may be use cases for multi-instance with varying CAM settings that are external to the GEOS-Chem files. I see now this is what you implemented, correct?

Author

Not quite. I think you need to remove lines 56-59 (and 61).
Perhaps leave a comment saying that those files will be the same on all instances for a multi-instance case.
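
A rough sketch of what the copy step could look like with the per-instance branch removed, so that every instance shares one set of files. The function name `copy_shared_config` and its parameters are made up for illustration; only `os.path` and `shutil.copy` from the standard library are used.

```python
import os
import shutil

def copy_shared_config(src_dir, case_root, file_names):
    """Copy each file once into case_root, skipping files already present.

    Illustrative helper only. For a multi-instance case, all instances
    would read these same copies. Returns the list of files copied.
    """
    copied = []
    for file_name in file_names:
        source_file = os.path.join(src_dir, file_name)
        if not os.path.exists(source_file):
            raise SystemExit(f"ERROR: Did not find source file, {file_name}")
        # end if
        target_file = os.path.join(case_root, file_name)
        if not os.path.exists(target_file):
            shutil.copy(source_file, target_file)
            copied.append(target_file)
        # end if
    # end for
    return copied
```

Because existing targets are skipped, running case.setup again would not overwrite files the user has edited in the case directory.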

    # end if
# end with

# GEOS-Chem only: copy config files to case
if '-chem geoschem' in cam_config:
    geoschem_config_src = os.path.join(cam_root, 'src', 'chemistry',
                                       'geoschem', 'geoschem_src', 'run', 'CESM')
    if not os.path.isdir(geoschem_config_src):
        raise SystemExit(f"ERROR: Did not find path to GEOS-Chem source code at {geoschem_config_src}")
    # end if
    for fileName in ['species_database.yml', 'geoschem_config.yml', 'HISTORY.rc',
                     'HEMCO_Config.rc', 'HEMCO_Diagn.rc']:
        source_file = os.path.join(geoschem_config_src, fileName)
        if not os.path.exists(source_file):
            raise SystemExit(f"ERROR: Did not find source file, {fileName}")
        # end if
        spaths = os.path.splitext(fileName)
        for inst_num in range(ninst):
            if ninst > 1:
                target_file = os.path.join(case_root, f"{spaths[0]}_{inst_num+1:04d}{spaths[1]}")

I don't quite understand this part. Maybe your explanation of nuopc above will explain. If not, could you detail what is happening here that is special for nuopc?

Author

> I don't quite understand this part. Maybe your explanation of nuopc above will explain. If not, could you detail what is happening here that is special for nuopc?

I think this suggested change is independent of MCT / NUOPC. It is also only relevant if GEOS-CHEM does decide to go ahead with support for multiple instance runs with independent GEOS-CHEM runtime configurations.
What this was intended to do is add an instance qualifier without disturbing the filename extension. So

species_database.yml ==> species_database_0001.yml

and

geoschem_config.yml ==> geoschem_config_0001.yml

and

HISTORY.rc ==> HISTORY_0001.rc

It is just a suggestion for how to create multiple instance files for a multi-instance run. I also have no idea if it works :)
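
For reference, the renaming described above can be sketched in isolation as follows. The helper name `instance_file_name` is made up for this example; `inst_num` is assumed to be the zero-based loop index used in the script, so the qualifier starts at 0001.

```python
import os

def instance_file_name(file_name, inst_num):
    """Insert a 4-digit, 1-based instance qualifier before the extension."""
    root, ext = os.path.splitext(file_name)
    return f"{root}_{inst_num + 1:04d}{ext}"

# Show the mapping for the first instance (inst_num = 0)
for name in ["species_database.yml", "geoschem_config.yml", "HISTORY.rc"]:
    print(name, "==>", instance_file_name(name, 0))
# end for
```

Splitting only the bare file name (rather than the full source path) keeps the result easy to join onto the case directory afterwards.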

            else:
                target_file = os.path.join(case_root, fileName)
            # end if
            if not os.path.exists(target_file):
                logger.info("CAM namelist one-time copy of GEOS-Chem run directory files: source_file %s target_file %s ",
                            source_file, target_file)
                shutil.copy(source_file, target_file)
            # end if
        # end for
    # end for
# end if

Do you mind if I leave out the end comments?

Author

They are part of the CAM Python coding standards, so you would probably have to put them back when you get to the CAM SEs.
(Note: there are also Fortran coding standards.)

Got it.