Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Raise an error if no value can be found for a namelist variable #3988

Merged
merged 6 commits into from
Jun 28, 2021

Conversation

billsacks
Copy link
Member

Previously, if no value was found for a namelist variable, the code simply put an empty string in as the value, without raising any errors. This PR adds error checking.

Fixing this issue revealed some other latent issues, which were being masked by this one. So this PR also fixes those other latent issues. See #3984 for extensive details.

Test suite:

  • Manual testing of the situation described in Missing default namelist values should raise an error but do not #3984
  • scripts_regression_tests on cheyenne
  • cime_developer tests on cheyenne, with baseline comparisons
  • CESM's aux_cime_baseline tests on cheyenne (mct & nuopc), with baseline comparisons
  • CESM's prealpha tests on cheyenne and izumi (mct & nuopc), with baseline comparisons

Test baseline: cesm2_3_alpha03a
Test namelist changes: Adds pio_netcdf_format for ESP, which was previously undefined; other than that, namelists are identical to baseline
Test status: bit for bit

Fixes #3984

User interface changes?: N

Update gh-pages html (Y/N)?: N

billsacks added 6 commits June 3, 2021 20:52
The check here
<https://github.com/ESMCI/cime/blob/8ec59c02d0042b9e1e134ffbeba9fccb1f31a061/scripts/lib/CIME/nmlgen.py#L646>
was not being triggered when it should, because None values were being
converted to empty strings. This meant that, if no default value could
be found, instead of aborting, the code was writing out 'variable = ' to the
namelist file (rather than 'variable = value' for some value).

The changes in this commit distinguish between two possible causes of
None values coming from entry_id's get_value_match:
1. No match found: in this case, we want to keep the value as None so
   that it triggers a later error check (in contrast to what was being
   done before).
2. A match is found, but self.text(mnode) returns None: I'm not sure
   when (if ever) this is triggered, but I did not want to change the
   logic for this situation, so I have introduced an optional argument,
   replacement_for_none, which maintains the old logic (replacing None
   with an empty string) in this case.

The changes in namelist_definition:get_default_value are somewhat
unrelated: The original motivation was seeing that self.get_value_match
can now return None; but as I looked closely, I saw that there was a
double-call to self._split_defaults_text(value) that looks like a bug,
so I have removed that double-call. I don't see any calls to this
get_default_value routine from within cime, but it's possible that
components use it in their buildnml.

Resolves ESMCI#3984 - see that issue for more details.
With the changes in the last commit to fix ESMCI#3984, this missing
default now led to an error. Previously, ESP's pio_netcdf_format was
being set to an empty string. I'm not positive that the way I've set it
is correct, but it seems more correct than before.
(1) If a variable is specified in the user_nl file (the "infile" in
    nmlgen), then set have_value = True to avoid triggering an error: as
    long as a value is specified in the user_nl file, then it's okay if
    no default value exists. See
    ESMCI#3984 (comment) for
    detailed thoughts.

(2) Change Namelist.get_variable_value to return a copy rather than a
    reference to the value list. See
    ESMCI#3984 (comment) for
    detailed thoughts.
@billsacks
Copy link
Member Author

I have done extensive testing of this, but I would be happy to have a critical eye on this, given the subtleties in the namelist generation code.

@jgfouca I found a few places where defaults weren't being set correctly in CESM, which was ignored before this but picked up after this PR. So it could be worth running some E3SM testing before merging this, or at least keeping an eye out for issues after this is merged – both to catch pre-existing problems with default values not being available, and also in case these changes inadvertently change any namelist defaults (they shouldn't, and don't appear to for CESM, but there's a chance that I missed some subtlety in the logic). If you run into problems, I'm happy to help resolve them.

Comment on lines +200 to +204
# NOTE(wjs, 2021-06-04) In the following call, replacement_for_none='' may not
# actually be needed, but I'm setting it to maintain some old logic, to be safe.
value = super(NamelistDefinition, self).get_value_match(vid.lower(),attributes=all_attributes, exact_match=exact_match,
entry_node=entry_node)
if value is None:
value = ''
else:
entry_node=entry_node,
replacement_for_none='')
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I say in the comment, I don't know if this replacement_for_none is actually needed; I added it to be safe to maintain existing logic. @jedwards4b @jgfouca do you know off-hand if a None value can be generated from the text value retrieved by entry_id's get_value_match? If you'd like, I can do some testing of whether this is ever triggered in CESM testing and remove it if it isn't.

Also: I don't love that the interface of get_value_match now differs from the entry_id version (requiring #pylint: disable=arguments-differ), but this seemed better to me than other implementations I could think of. I'm open to suggestions if you see a better way to do this.

@@ -445,4 +453,4 @@ def get_default_value(self, item, attribute=None):
all_attributes.update(attribute)

value = self.get_value_match(item.lower(), all_attributes, True)
return self._split_defaults_text(value)
return value
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original motivation for this change was seeing that self.get_value_match can now return None; but as I looked closely, I saw that there was a double-call to self._split_defaults_text(value) (at the end of self.get_value_match and here) that looks like a bug, so I have removed that double-call. I don't see any calls to this get_default_value routine from within cime, but it's possible that components use it in their buildnml.

@@ -172,6 +172,7 @@
<value component="glc">$GLC_PIO_NETCDF_FORMAT</value>
<value component="wav">$WAV_PIO_NETCDF_FORMAT</value>
<value component="iac">$IAC_PIO_NETCDF_FORMAT</value>
<value component="esp">$ESP_PIO_NETCDF_FORMAT</value>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is the right setting: other esp pio values seem to be set to some "undefined" value. But this seemed more right than the earlier behavior, where this was left as blank in the namelist file.

@jgfouca
Copy link
Contributor

jgfouca commented Jun 9, 2021

Thanks, @billsacks . I will keep this in mind the next time I update CIME for E3SM.

jedwards4b added a commit to ESCOMP/CMEPS that referenced this pull request Jun 10, 2021
Fix namelist variables that weren't being set
### Description of changes

With my upcoming fix for ESMCI/cime#3984, I
was getting some failures due to namelist variables that weren't
properly defined.

(1) mediator_read_restart: since this is set in the code rather than via
    the normal mechanism, I think it's appropriate to have
    skip_default_entry="true" here.

(2) ESP's pio_netcdf_format: I'm not positive that the way I've set it
    is correct, but it seems more correct than before.

### Specific notes

Contributors other than yourself, if any: none

CMEPS Issues Fixed (include github issue #): none

Are changes expected to change answers?
 - [x] bit for bit
 - [ ] different at roundoff level
 - [ ] more substantial

Any User Interface Changes (namelist or namelist defaults changes)?
 - [ ] Yes
 - [x] No

Testing performed if application target is CESM:(either UFS-S2S or CESM testing is required):
- [ ] (recommended) CIME_DRIVER=nuopc scripts_regression_tests.py
   - machines:
   - details (e.g. failed tests):
- [ ] (recommended) CESM testlist_drv.xml
   - machines and compilers:
   - details (e.g. failed tests):
- [x] (optional) CESM prealpha test
   - cheyenne & izumi (all compilers)
   - details (e.g. failed tests): only test failures are ones that seem to have failed in the baseline, except for a couple of tests that looked like system issues. NLCOMPs identical except for the expected change in pio_netcdf_format for ESP
- [x] (other) please described in detail: CESM's aux_cime_baselines
   - cheyenne (all compilers for which tests are defined)
   - details (e.g. failed tests): all pass; NLCOMPs identical except for the expected change in pio_netcdf_format for ESP

Testing performed if application target is UFS-coupled:
- [ ] (recommended) UFS-coupled testing
   - description:
   - details (e.g. failed tests):

Testing performed if application target is UFS-HAFS:
- [ ] (recommended) UFS-HAFS testing
   - description:
   - details (e.g. failed tests):

Hashes used for testing:
- [x] CESM:
  - repository to check out: https://github.com/ESCOMP/CESM.git
  - cesm2_3_alpha03a, with the cime branch in ESMCI/cime#3988
- [ ] UFS-coupled, then umbrella repostiory to check out and associated hash:
  - repository to check out:
  - branch:
  - hash:
- [ ] UFS-HAFS, then umbrella repostiory to check out and associated hash:
  - repository to check out:
  - branch:
  - hash:
@jedwards4b jedwards4b merged commit de588aa into ESMCI:master Jun 28, 2021
@billsacks billsacks deleted the error_for_missing_nml_default branch June 29, 2021 20:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Missing default namelist values should raise an error but do not
3 participants