Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Second rules update in v1 branch, update names, update write_standard_name_table.py to allow for subsections #87

Open
wants to merge 19 commits into
base: release/v1
Choose a base branch
from

Conversation

mkavulich
Copy link
Collaborator

This is the second round of proposed rules changes based on our ongoing discussion to make both rules and names as consistent as possible. Major updates include:

Standard Names

  • New "base_names" section, with subsections
  • Convert instances of "surface_X" to "X_at_surface"
  • Convert some instances of "air_X" or "X_of_air"
  • Explicitly state what mass mixing ratios are with respect to in long_name
  • Convert "mixing_ratio" to "water_vapor_mixing_ratio_wrt_moist_air"
  • Convert "surface_albedo" to simply "albedo", "surface_roughness_length" to simply "roughness_length"

Rules

  • Update construction template to include [non-instant time] and [non-current time], swap [at level] and [in medium]
  • More detailed description of transformations and how they can work on multiple base names
  • Add a few more transformations

write_standard_name_table.py

  • Updated to allow sub-sections in the standard names list

You can see these changes in the form of a Google doc for better visualization here: https://docs.google.com/document/d/19ysUCWDhv53W8fQbElW7opr_1Pm7ck95QRUyKM_qy4E/edit?tab=t.0

Note: with this update, we have 347 standard names that are fully compliant with the rules we have set out. A big portion of the remainder are "flag"-type variables, indices, etc, which are not fully accounted for in the rules yet.

long_name="Molecular oxygen, O₂">
</standard_name>
<standard_name name="ozone"
long_name="Ozone, O₃">
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will these non-ASCII characters work? (also line 246)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

long_name should be able to handle any unicode characters since it is not used in any FORTRAN code context. Even if long names were eventually used in comments or even character variables, that would be fine according to the FORTRAN 90 standard*, so long as it's a character set supported by the OS. I think anything in UTF-32 should be allowed for long names, since Python can handle those characters and it's essentially universally supported among OS's.

This does bring up the fact that we haven't defined a character set for the standard names; I'll add that to the to-do list. So long as we make anything that may possibly be used as a fortran variable/subroutine/other code strictly within the allowed Fortran character set, we should be good.

  • The relevant section on page 18 is here:

Additional characters may be representable in the processor, but may appear only in comments (3.3.1.1,
15 3.3.2.1), character constants (4.4.4), input/output records (9.1.1), and character string edit descriptors
16 (10.2.1).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we talk about this at the next standard names meeting, please? My immediate reaction is to stick with the basic ASCII characters and nothing else, but maybe I can be convinced otherwise ;-) At the very least, we need to run a test with CCPP (SCM, UFS, ...) if the parsers (prebuild and capgen) can handle those characters.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From today's discussion, we will restrict standard names (including long names) to ASCII character set.

@@ -2311,7 +2831,8 @@
<standard_name name="upward_virtual_potential_temperature_flux">
<type kind="kind_phys" units="K m s-1">real</type>
</standard_name>
<standard_name name="surface_upward_specific_humidity_flux_for_mellor_yamada_janjic_surface_layer_scheme">
<standard_name name="upward_flux_of_water_vapor_mixing_ratio_wrt_moist_air_at_surface_for_mellor_yamada_janjic_surface_layer_scheme"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably define MYJ as the abbreviation to use in the standard name, and the full name in the long name? Like PBL and GWD in my open PR?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good idea, but I will add it to my next round of changes to avoid scope creep. There are probably more abbreviations to introduce beyond this, like MYNN.

<type kind="kind_phys" units="Pa">real</type>
</standard_name>
<standard_name name="control_for_surface_layer_evaporation">
<type kind="kind_phys" units="1">real</type>
</standard_name>
<standard_name name="surface_specific_humidity_for_myj_schemes">
<standard_name name="water_vapor_mixing_ratio_wrt_moist_air_at_surface_for_myj_schemes"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, MYJ should be uppercase in the standard name, and spelled out in the long name

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rule 17 says names are insensitive, does it matter here?
If so, maybe we should add something and uppercase the abbreviations in the Acronyms, Abbreviations, and Aliases section? Or is it just for initials of names?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That (rule 17) is a good point I completely forgot about. I like consistency and would prefer either lowercase or uppercase. We can talk about it at one of the next meetings, if the majority thinks that case insensitive is fine, then I can live with that, too.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will endorse the "case insensitivity" rule.

<type kind="kind_phys" units="fraction">real</type>
</standard_name>
<standard_name name="lwe_surface_snow_from_coupled_process">
<type kind="kind_phys" units="m">real</type>
</standard_name>
<standard_name name="surface_upward_latent_heat_flux_from_coupled_process">
<standard_name name="upward_latent_heat_flux_at_surface_from_coupled_process">
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the difference between for_coupling and from_coupled_process (not for vs from, but coupling vs coupled process)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good point: In general I think the use of "coupling" and "coupled" in variable names is ambiguous and problematic. Probably needs a deeper dive after this PR.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

Copy link
Collaborator Author

@mkavulich mkavulich Feb 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will use "for coupling" suffix, and include a detailed description of what that means (pending further potential discussion).

Also potentially need new suffixes depending on CESM use case, dependent on group names? Maybe drop "coupling" all together in favor of "timing"/"order" suffixes

Copy link
Collaborator

@gold2718 gold2718 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great (and great big) piece of work!
A few requested changes and a lot of questions along with a few suggestions.
Also, should all long names begin with an capitalized word? Right now, there is a mix.

for std_name in sec:
if std_name.tag == 'section':
parse_section(snl, std_name, level + '#')
continue
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why use continue instead of a regular else clause?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems much simpler to reduce indentation for the rest of the logic (which is already fairly large)

also be considered standard names on their own. See the\n
full list of standard names for further details.\n">
<standard_name name="absolute_vorticity"
long_name="absolute_vorticity">
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The long names here and below are supposed to be descriptive and do not need the underscores. If you are just going to copy the standard name, please omit the long name.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intention is to include long names for all the variables in this section, but I hadn't gotten around to it for this round of commits. I can remove them for now if it's a big deal but I just wanted to note this is not intended to be the "final" version.

@@ -2431,7 +2962,8 @@
<standard_name name="weight_for_potential_temperature_at_top_of_viscous_sublayer">
<type kind="kind_phys" units="1">real</type>
</standard_name>
<standard_name name="weight_for_specific_humidity_at_top_of_viscous_sublayer">
<standard_name name="weight_for_water_vapor_mixing_ratio_wrt_moist_air_at_top_of_viscous_sublayer"
long_name="Weight for specific humidity (water vapor mass mixing ratio with respect to moist air) at the top of the viscous sublayer">
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this "weight" as in "weighting factor"? If so, could "weighting factor" be used in the long name? If not, why are the units "1"?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like "weight" is used to mean "weighting factor" in all current contexts. I'm not sure if referencing physical "weight" would ever be useful in standard names, but maybe it's worthy of discussion?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed at today's meeting that "weight" is a synonym for "scaling factor" in all existing uses, so we will convert these to "scaling factor".

<standard_name name="direct_uv_and_vis_shortwave_flux"
long_name="direct_uv_and_vis_shortwave_flux">
</standard_name>
<standard_name name="direct_visible_albedo"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason there are some uses of visible instead of vis?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing out inconsistency, I will add visible --> vis to my next round of changes.

Copy link
Collaborator Author

@mkavulich mkavulich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gold2718 @climbfuji Thanks for your thorough reviews. I've addressed most of your comments/suggestions, and those that I haven't I replied with a follow-up comment or question. Let me know if anything else needs resolving.

Copy link
Collaborator

@nusbaume nusbaume left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @mkavulich! I had a couple of change requests, but nothing that requires a full re-review.

comment="These names are used as bases for other names, but may\n
also be considered standard names on their own. See the\n
full list of standard names for further details.\n">
<standard_name name="absolute_vorticity"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since these can be considered formal standard names should we include the units here as well? (I'm happy to hold this off for a future PR if that would be easier).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants