Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use glob function to allow wild character in template #2746

Merged
merged 4 commits into from
Apr 16, 2024

Conversation

metdyn
Copy link
Contributor

@metdyn metdyn commented Apr 15, 2024

Types of change(s)

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Trivial change (affects only documentation or cleanup)

Checklist

  • Tested this change with a run of GEOSgcm
  • Ran the Unit Tests (make tests)

Description

In the current sampler code (interpolation for each time interval along model run), reading a filename with template sometimes requires us to rename the observation file name to concord with the template. For example,

template = MYD04_L2.A%y4%D3.%h2%n2.hdf
obs file = MYD04_L2.A2019262.2305.061.2019263152708.hdf
requiring user to either link the file (or rename it)
MOD04_L2.A2019212.2330.hdf -> MOD04_L2.A2019212.2330.061.2019213072220.hdf

This is simplified, if wild character is allowed

template = MYD04_L2.A%y4%D3.%h2%n2*.hdf

@amdasilva suggested using glob function to enable * character in template.
This PR is a code implementation.

Example input:

PLATFORM.airs_aqua::
    file_name_template:  airs_aqua.%y4%m2%d2T%h2%n2*.nc4
PLATFORM.amsr2_gcom-w1::
    file_name_template: amsr2_gcom-w1.%y4%m2%d2T%h2%n2*.nc4

Sampler code output:

        sampler: DEBUG: non-exist: filename fid j  :          0
        sampler: DEBUG: non-exist: missing filename: /Users/yyu11/ModelData/JEDI/ioda/Y2019/M08/D01/H00_06/airs_aqua.20190731T2100*.nc4
        sampler: DEBUG: non-exist: filename fid j  :          1
        sampler: DEBUG: non-exist: missing filename: /Users/yyu11/ModelData/JEDI/ioda/Y2019/M08/D01/H00_06/airs_aqua.20190801T0300*.nc4
        sampler: DEBUG: exist: filename fid j      :          0
        sampler: DEBUG: exist: true filename       : /Users/yyu11/ModelData/JEDI/ioda/Y2019/M08/D01/H00_06/amsr2_gcom-w1.20190731T210000Z.nc4
        sampler: DEBUG: exist: filename fid j      :          1
        sampler: DEBUG: exist: true filename       : /Users/yyu11/ModelData/JEDI/ioda/Y2019/M08/D01/H00_06/amsr2_gcom-w1.20190801T030000Z.nc4
        sampler: DEBUG: nobs from input file:        12584

Related Issue

@metdyn metdyn requested review from a team as code owners April 15, 2024 15:36
@metdyn metdyn added the 0 Diff Trivial The changes in this pull request are trivially zero-diff (documentation, build failure, &c.) label Apr 15, 2024
@mathomp4
Copy link
Member

mathomp4 commented Apr 15, 2024

Would it be possible to make unit test(s) for this? Only because I'd want to make sure that Intel and GNU both work with this. (Since Fortran + Strings = 🤷🏼 sometimes as @tclune and @darianboggs can attest to!)

@metdyn
Copy link
Contributor Author

metdyn commented Apr 15, 2024

Would it be possible to make unit test(s) for this? Only because I'd want to make sure that Intel and GNU both work with this. (Since Fortran + Strings = 🤷🏼 sometimes as @tclune and @darianboggs can attest to!)
Thanks for reminding, I have
1) gcc-gfortran/12.3.0 2) openmpi/5.0.0rc12 3) Baselibs/7.17.1 on mac,
Then under build dir, I run ctest and I get the following, which does not seem to be related to this PR. Do you get the same output for ctest?

97% tests passed, 2 tests failed out of 66

Label Time Summary:
ESSENTIAL = 118.08 secproc (56 tests)
EXTDATA1G_BIG_TESTS = 1.69 sec
proc (1 test)
EXTDATA2G_BIG_TESTS = 1.70 secproc (1 test)
PERFORMANCE = 2.95 sec
proc (6 tests)

Total Test time (real) = 125.43 sec

The following tests FAILED:
25 - ExtData1G_case12 (Failed)
49 - ExtData2G_case12 (Failed)

@mathomp4
Copy link
Member

Would it be possible to make unit test(s) for this? Only because I'd want to make sure that Intel and GNU both work with this. (Since Fortran + Strings = 🤷🏼 sometimes as @tclune and @darianboggs can attest to!)
Thanks for reminding, I have
1) gcc-gfortran/12.3.0 2) openmpi/5.0.0rc12 3) Baselibs/7.17.1 on mac,
Then under build dir, I run ctest and I get the following, which does not seem to be related to this PR. Do you get the same output for ctest?

97% tests passed, 2 tests failed out of 66

Label Time Summary: ESSENTIAL = 118.08 sec_proc (56 tests) EXTDATA1G_BIG_TESTS = 1.69 sec_proc (1 test) EXTDATA2G_BIG_TESTS = 1.70 sec_proc (1 test) PERFORMANCE = 2.95 sec_proc (6 tests)

Total Test time (real) = 125.43 sec

The following tests FAILED: 25 - ExtData1G_case12 (Failed) 49 - ExtData2G_case12 (Failed)

Ah. You'd want to run:

ctest -L 'ESSENTIAL'

which just runs the essential cases. This is what happens if you run make tests we just run those.

The reason Case 12 fails is that Case 12 requires 216 cores so...big! (I wonder if there's a way to have ctest avoid that if there aren't enough cores available...)

@metdyn
Copy link
Contributor Author

metdyn commented Apr 15, 2024

@mathomp4 , that is really helpful!
ctest -L 'ESSENTIAL'
100% tests passed, 0 tests failed out of 56.

I have checked the box in the CheckList Run the unit test (make tests)

@mathomp4
Copy link
Member

Did you add a unit test for your code? Or is that sort of...out of reach? Maybe the model is the unit test!

base/MAPL_ObsUtil.F90 Outdated Show resolved Hide resolved
base/MAPL_ObsUtil.F90 Outdated Show resolved Hide resolved
base/MAPL_ObsUtil.F90 Outdated Show resolved Hide resolved
base/MAPL_ObsUtil.c Outdated Show resolved Hide resolved
Copy link
Collaborator

@tclune tclune left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See inline comments. Mostly small code improvements, but one substantial question that needs to be answered.

@tclune
Copy link
Collaborator

tclune commented Apr 15, 2024

Did you add a unit test for your code? Or is that sort of...out of reach? Maybe the model is the unit test!

Ideally unit tests won't access the filesystem, so a bit tricky here. Let's not let this get in the way of approval. But the follow up will be to work with @darianboggs to add some sort of basic sanity check. I guess create a directory with some empty files with various names and then see what glob returns?

base/MAPL_ObsUtil.F90 Outdated Show resolved Hide resolved
@metdyn
Copy link
Contributor Author

metdyn commented Apr 16, 2024

@mathomp4, thanks for the suggestion. The full sampler code running with all observation locations is one of the tests (I have executed a part of it).

@tclune tclune self-requested a review April 16, 2024 15:20
base/MAPL_ObsUtil.F90 Outdated Show resolved Hide resolved
@bena-nasa bena-nasa merged commit 4483883 into develop Apr 16, 2024
34 checks passed
@tclune tclune deleted the feature/ygyu/glob branch April 16, 2024 16:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0 Diff Trivial The changes in this pull request are trivially zero-diff (documentation, build failure, &c.)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants