Managing om3-peripheral tools #262

Open
dougiesquire opened this issue Feb 3, 2025 · 17 comments


@dougiesquire
Collaborator

dougiesquire commented Feb 3, 2025

We have an ever-growing set of tools that support setting-up/configuring/running ACCESS-OM3. As some tools are now starting to depend on others, we possibly need to revisit how they, and their inter-dependencies, are managed and maintained.

This issue is to compile a list of OM3-peripheral tools and discuss how to manage them.

Tools

(please add)

@anton-seaice
Contributor

We won't need esmgrids for OM3 once CICE uses the MOM supergrid (ACCESS-NRI/CICE#8).

I think it's worth splitting the list into Python tools used routinely when running payu (om3-scripts, make_diag_table, expt_manager) vs stuff used on a more sporadic basis for developing new configurations (e.g. initial conditions and grids).

@anton-seaice
Contributor

https://github.com/COSIMA/regional-mom6 is tangentially related / maybe needs considering in this list

@dougiesquire
Collaborator Author

https://github.com/COSIMA/regional-mom6 is tangentially related / maybe needs considering in this list

It's already packaged and available via PyPI and conda-forge. It's the random, interdependent (mostly Python) scripts and modules that keep me awake at night.

@minghangli-uni
Contributor

minghangli-uni commented Feb 14, 2025

COSIMA/om3-scripts#39 might overlap with the profiling features in https://github.com/COSIMA/om3-utils, but it provides a simpler and significantly faster approach to handling profiling files. Could it be added to the above list?

@dougiesquire
Collaborator Author

dougiesquire commented Feb 14, 2025

An initial proposal to start discussions is that we split om3-scripts and om3-utils into three Python packages, all in the ACCESS-NRI GitHub org. Note that here I'm just using placeholders for the names of the packages and modules:

Package 1: tools used for creating and manipulating OM inputs
Includes bits of om3-scripts, om3-utils and other scripts around the place

access-om-tools/
├── scripts
│   └── <"dumping ground" for scripts: includes most scripts from om3-scripts, make_diag_table>
├── utils (includes input file parsers from om3-utils plus stuff from om3-scripts/scripts_common)
└── ...

Package 2: tools used for profiling ACCESS models
Includes profiling modules from om3-utils

access-profiling/
├── esmf
│   └── ...
├── fms
│   └── ...
├── utils
└── ...

Package 3(/4): tool(s) for running suites of experiments with Payu
This one is still pretty uncertain as the tool is still being developed/refactored
However, the initial implementation should be moved out of om3-scripts and into its own package(s)

All these packages should be made available via PyPI/conda and have the usual tests, docs, CI etc

@micaeljtoliveira
Contributor

@dougiesquire Thanks for kick-starting the discussion and for the initial proposal.

Regarding package 1, note that the software transformation team needs to have parsers to all relevant files for all models, not only OM3, as we will be profiling and performing scaling tests for all models. As such, I would prefer to split it further and put all the parsing of any relevant configuration files used in the ACCESS models in its own package. That would maximize re-usability, remove duplicated code, and follow a "separation of concerns" approach.

As for package 3, I would again separate things further. A lot of what these tools do follow similar patterns and they could be written in terms of generic workflows. For example, many workflows that we have look like this: given a list of values, modify a parameter in the configuration files, run the model with payu and finally do something with the output. Such generic workflows could then be used as building blocks for both the experiment manager and the scalability/profiling tools.
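The generic workflow described above (given a list of values, modify a parameter, run the model, do something with the output) can be sketched roughly as follows. This is only an illustration under stated assumptions: `sweep`, `SweepResult`, `set_ncpus`, `fake_run` and `read_walltime` are hypothetical names, the `config.txt` and `walltime.txt` files are stand-ins rather than real Payu files, and the model run is faked so the example is self-contained.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Callable, Iterable


@dataclass
class SweepResult:
    value: Any      # the parameter value used for this run
    run_dir: Path   # directory the run was configured and executed in
    output: Any     # whatever the collect step extracted


def sweep(
    values: Iterable[Any],
    configure: Callable[[Path, Any], None],
    run: Callable[[Path], None],
    collect: Callable[[Path], Any],
    base_dir: Path,
) -> list[SweepResult]:
    """For each value: configure a run directory, run, then collect a result."""
    results = []
    for index, value in enumerate(values):
        run_dir = base_dir / f"run_{index:03d}"
        run_dir.mkdir(parents=True, exist_ok=True)
        configure(run_dir, value)  # modify a parameter in the configuration
        run(run_dir)               # run the model
        results.append(SweepResult(value, run_dir, collect(run_dir)))
    return results


# Hypothetical building blocks for a toy scaling test.
def set_ncpus(run_dir: Path, ncpus: int) -> None:
    (run_dir / "config.txt").write_text(f"ncpus = {ncpus}\n")


def fake_run(run_dir: Path) -> None:
    # Stand-in for the real run step. In practice this might be something like
    # subprocess.run(["payu", "run"], cwd=run_dir, check=True) (an assumption).
    ncpus = int((run_dir / "config.txt").read_text().split("=")[1])
    (run_dir / "walltime.txt").write_text(str(1000 / ncpus))


def read_walltime(run_dir: Path) -> float:
    return float((run_dir / "walltime.txt").read_text())


with tempfile.TemporaryDirectory() as tmp:
    results = sweep([1, 2, 4], set_ncpus, fake_run, read_walltime, Path(tmp))
    walltimes = [r.output for r in results]
```

Because the three steps are passed in as plain callables, the same `sweep` loop could in principle serve both an experiment manager (different `configure` steps) and a scaling/profiling tool (different `collect` steps).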

All these packages should be made available via PyPI/conda and have the usual tests, docs, CI etc

Yes, absolutely.

@micaeljtoliveira
Contributor

COSIMA/om3-scripts#39 this might overlap with the profiling features in https://github.com/COSIMA/om3-utils, but it provides a simpler and significantly faster approach to handling profiling files.

The ESMF text-based profiling output is just another type of profiling data and can easily be added to the other existing profiling data parsers that are in om3-utils (or whatever new package those things will be moved to).

@aekiss
Contributor

aekiss commented Feb 14, 2025

run_summary and nmltab also include parsers, but they are more end-user-centric so I don't see an advantage to bundling them up together with everything else that happens to have a parser.

I suppose it could work if their parsing code could be factored out into a general parser package that then becomes a dependency of run_summary and nmltab. But that seems a bit like make-work unless there's a real prospect of code reuse.

@dougiesquire
Collaborator Author

dougiesquire commented Feb 18, 2025

We had a meeting about this. My summary as follows, but obviously please feel free to edit/comment if I've got anything wrong.

There is agreement that many small packages with well-defined scope are far better than fewer larger packages. Package scope should take into account functionality, but also who the users are likely to be. We spitballed at least 6 packages/repos:

| Proposed name | Scope | Users |
| --- | --- | --- |
| access-parsers | Tools for reading/writing model configuration inputs (referred to as "parsers" in the meeting) | ACCESS model devs, software transformation team, Payu, experiment manager tool... |
| access-om-scripts | Dumping ground for general OM scripts for generating input files | ACCESS-OM devs |
| access-profiling | Tools for profiling ACCESS models | ACCESS model devs, software transformation team |
| make_diag_table | Tool for writing MOM diag_table from a yaml source | ACCESS model devs, research community |
| Experiment manager: generator | Tool to generate suites of Payu configurations (there may be multiple such tools) | ACCESS model devs, research community |
| Experiment manager: runner | Tool to run suites of Payu configurations | ACCESS model devs, research community |

First cab off the rank is "Package 1" since that will contain code used by a number of other packages. In the first instance, Package 1 will just include the existing parsers (for lack of a better term) in om3-utils. Does anyone have a good suggestion for a name for this package? How does access-input-parsers sit with people (yeah, I don't like it either)?
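To make the idea of a shared parsers package concrete, here is a minimal sketch of what a common read/write interface could look like. It is not the om3-utils API: `ConfigParser`, `KeyValueParser` and `set_parameter` are hypothetical names, and the toy "key = value" format is an assumption chosen only to keep the example short.

```python
from typing import Protocol


class ConfigParser(Protocol):
    """Hypothetical common interface: text in, dict out, and back again."""

    def read(self, text: str) -> dict: ...

    def write(self, config: dict) -> str: ...


class KeyValueParser:
    """Toy parser for 'key = value' lines; '#' starts a comment."""

    def read(self, text: str) -> dict:
        config = {}
        for line in text.splitlines():
            line = line.split("#", 1)[0].strip()  # drop comments and padding
            if not line:
                continue
            key, _, value = line.partition("=")
            config[key.strip()] = value.strip()
        return config

    def write(self, config: dict) -> str:
        return "".join(f"{key} = {value}\n" for key, value in config.items())


def set_parameter(parser: ConfigParser, text: str, key: str, value: str) -> str:
    """Change one parameter without knowing anything about the file format."""
    config = parser.read(text)
    config[key] = value
    return parser.write(config)


original = "dt = 1800  # timestep\nlayout = 4,4\n"
updated = set_parameter(KeyValueParser(), original, "dt", "900")
```

Tools such as an experiment manager or a profiling harness would then depend only on the `read`/`write` pair, so a parser for another file format could be swapped in without changing them. Note that this toy parser discards comments on write; a real one would probably need to preserve them.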

@dougiesquire
Collaborator Author

How does access-input-parsers sit with people (yeah, I don't like it either)?

@micaeljtoliveira, @manodeep, @chrisb13, thoughts/opinions?

@micaeljtoliveira
Contributor

How does access-input-parsers sit with people (yeah, I don't like it either)?

@dougiesquire I don't like it either, but not sure I can propose something better.

Reading the description of the proposed packages again, I realise we don't really have a place to put parsers that handle the text-based output of the models. Not sure how much we could/would extract from the run logs, but they do contain potentially useful information. Maybe one could simply bundle such parsers with the input ones and then just call the package access-parsers. That's a name I like a bit more than access-input-parsers.

@dougiesquire
Collaborator Author

Fine with me. Any objections to access-parsers from anyone?

@manodeep

I like access-parsers better as a name.

One separate consideration comes to mind - would it be worthwhile to add some extra word to repo-names to highlight which repos are primarily internal facing (and that APIs/directory layout might change at the drop of a hat)? For example, adding “dev” in the repo name or something to that effect

@anton-seaice
Contributor

Are the parsers specific to nuopc/cmeps based models? e.g. access3-parsers

@micaeljtoliveira
Contributor

Are the parsers specific to nuopc/cmeps based models? e.g. access3-parsers

No, we will also include OASIS-based models, like ESM1.6 and OM2.

@chrisb13
Contributor

Thanks for taking this forward @dougiesquire.

access-parsers sounds ok (suitably flexible but limited scope) to me. Once abstracted, might be good to get input from @micaeljtoliveira @manodeep @minghangli-uni @aekiss that the parsers themselves will meet everyone's needs?

Regarding the content of this earlier post: what is the thinking behind make_diag_table needing to be its own package? And would it work across ESM, OM3, CM etc? I'm wondering if we may eventually want to have similar helpers for other components (e.g. CICE). A problem for later in any case.

@dougiesquire
Collaborator Author

Thanks all. I'll set up the initial access-parsers package in the ACCESS-NRI org when I can find some time to do so.

What is the thinking for make_diag_table needing to be its own package? And would it work across ESM, OM3, CM etc?

I thought that's what was agreed in the meeting. Making it its own package fits with our preference for many small packages with well-defined scope.


7 participants