Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handling of Fx variables as regular variables for CMIP5 and CMIP6 WITH CMOR checks #1037

Conversation

valeriupredoi
Copy link
Contributor

should address comments and concerns from #901 and standardize fx files finding for CMIP6

@zklaus
Copy link

zklaus commented Apr 29, 2019

I suggest to rename fxdir to mip and fx_var to short_name. Then we can use anchors to get the paths directly from the data configuration.

@valeriupredoi
Copy link
Contributor Author

was doing it while you were writing your comment, dude 😁

@valeriupredoi valeriupredoi requested a review from bouweandela May 1, 2019 11:20
@valeriupredoi
Copy link
Contributor Author

OK so I've implemented the fx variable retrieval as a regular CMOR variable with its files and all other properties that a regular variable has; this implies that fx var files will have to be listed in the variables section of the diagnostic with the default preprocessor eg:

    variables:
      tas:
        preprocessor: pp_rad
        mip: Amon
        grid: gr
        fx_files: ['areacello']
      areacello:
        fxvar: True
        grid: gn
        mip: Ofx

Note that:

  • the mandatory identification of the variable to be an fx variable is done via the fxvar: True flag
  • grids need to be specified since lo and behold, they sometimes differ from the other variables' ones (if it's ocean then they are 'gn' and if all other vars are ocean then there is no need to specify grid in the fx var)

With this, the preproc output dir will contain, on top of all the other diag vars, CMOR-checked files for the needed fx vars, which is good. What's bad is that some of these vars don't have CMOR table entries (like areacello) and will need custom CMOR files; they may also need fix files, dunno yet.

One thing I couldn't manage for the life of me is to assign these preproc-ed (CMOR checked) fx files to the variable['fx_files'] instead of (currently) just retrieving the files from the original location (which works now, but it's not what I want) - @bouweandela I needs your help here, man! What's the best strategy to allow for this since the preprocessor steps are independent and done per variable?

@zklaus
Copy link

zklaus commented May 2, 2019

OK so I've implemented the fx variable retrieval as a regular CMOR variable with its files and all other properties that a regular variable has; this implies that fx var files will have to be listed in the variables section of the diagnostic with the default preprocessor eg:

    variables:
      tas:
        preprocessor: pp_rad
        mip: Amon
        grid: gr
        fx_files: ['areacello']
      areacello:
        fxvar: True
        grid: gn
        mip: Ofx

Note that:

the mandatory identification of the variable to be an fx variable is done via the fxvar: True flag

Maybe for consistencies sake that should be fx_var.

grids need to be specified since lo and behold, they sometimes differ from the other variables' ones (if it's ocean then they are 'gn' and if all other vars are ocean then there is no need to specify grid in the fx var)

The grid of a fx variable is always the same as the pertinent data variable. The example above is a bit strange in that it gives the cell areas for the ocean grid (areacell*o*) together with some atmospheric data. If you look at the corresponding areacell*a*, you will surely find the grid you expect.

Apart from that, areacella/o along with a few others are a bit special (see my last comment in the issue
https://github.com/ESMValGroup/ESMValTool/issues/901#issuecomment-487598899). Actually, allow me to quote myself:

Additionally, some of these are actually cell_measures, so the most elegant way, imho, to deal with them would be to add the fx files to the list of input filenames before they are loaded as cubes into iris.
Then, iris will automatically sort them out and make them available via cube.cell_measure().
Grepping the tables, this seems to apply for

  • areacella from fx
  • areacellg from IfxAnt and IfxGre
  • areacello from Ofx
  • volcello from either Ofx or (Omon, Oyr, Odec).
  • areacellr from fx

Also, if they are used in a .nc file, its global attributes will list them as external_variables in accordance with the relevant section of the CF conventions. So maybe we should be able to use that information automatically?

With this, the preproc output dir will contain, on top of all the other diag vars, CMOR-checked files for the needed fx vars, which is good. What's bad is that some of these vars don't have CMOR table entries (like areacello) and will need custom CMOR files; they may also need fix files, dunno yet.

Hm, areacello is in Ofx. Which ones don't have entries?

@valeriupredoi
Copy link
Contributor Author

hey K-man @zklaus good stuff, cheers! also my previous comment is getting obsolete now since I am gradually plugging in all the changes needed to make things work properly. Stay tuned 📻

@valeriupredoi
Copy link
Contributor Author

ok guys, so here it is: now we can do the following:

  • specify orography fx variables for both CMIP5 and CMIP6 in the recipe with as little as eg
    variables:
      tas:
        preprocessor: simple_preproc
        mip: Amon
      sftlf:
        fxvar: True
      sftof:
        fxvar: True
  • the code will retrieve the data from the DB, apply cmor checks and fixes, and will run the needed preprocessor steps that use the fx files with the cmor checked and fixed output files;
  • tasking is now done with priority ie if fx variables are present in a diagnostic, those will be processed first (preprocessor=default, basically just find data, load, fix and cmor check/fix; of course they can be run through a more complex preproc but I fail to see why) then the other variables will be run; this we we are certain that the output fx files (now stored in the preproc/ dir like any variable) will be there when needed by any given preprocessing or diag step; there is a bit of a wait on the main meat of preproc but these files are small and they are fast at preproccing;
  • I tested both with CMIP5 and 6 and all went well for me, there is a downside though - in cases like mask_landsea one needs to actually provide the fx variables in the recipe now (sftlf, stflof etc) becasue I could not work a way to add them to the list of variables without actually decalring them in the recipe - maybe @bouweandela has an idea?

Also @bouweandela - with this ine, the rather bulky implementation in #1009 is now obsolete; you will see now that _recipe.py does what its job is - just sets up a workflow of tasks and executes nothing internally 🍺

Another thing - now that the fx variables/files are available as regular variable files - how do we get about to including the cell measures in the actual variable? I gots no idea

@valeriupredoi valeriupredoi changed the title Fx files discovery for CMIP6 Handling of Fx variables as regular variables for CMIP5 and CMIP6 May 2, 2019
@valeriupredoi
Copy link
Contributor Author

Pending approval

@zklaus
Copy link

zklaus commented May 3, 2019

Good stuff, @valeriupredoi !

I wonder: Would it be possible/reasonable to forgo the fxvar in favor of the normal mip? Afterall, we know that fx files are exactly those that have fx in the name of their table. Or, even more explicit, we can give a list of tables of fixed files.

@valeriupredoi
Copy link
Contributor Author

OK guys a few (and final changes): as mentioned above the fx variables can now be standard diagnostic variables; changes of the last commit: it is now enough to specify a list of fx variables eg fx_files: ['sftlf', 'sftof'] in the variable and the code will pick up the fx files, CMOR check and fix them and then apply whatever mask is needed in the preprocessor; NOTE that if one only specifies the mask in the preproc and no fx settings in variable OR separate fx variables as stand alone diagnostic variables, the masking will fail. I need to update the doeumentation but after people test and also give me coding feedback.

@zklaus in essence this PR does exactly what you are saying - you can now just list whatever fx variables you need to include as diagnostic variables; I may drop the fxflag completely from the recipe, yes - the fx var needs an internal identification flag though because the DRS is different from a regular variable, and also it runs a very stripped-down preprocessor profile (eg no time selection, no regridding etc)

@zklaus
Copy link

zklaus commented May 7, 2019

Nice! In what way is the drs different? looking at your config-developer.yml, the dir part seems to be identical and the file part only differs in that the general files have _*.nc and the fx files have *.nc in the end. If that really is the only difference we might consider simply removing the extra _?

In this context, note the Section "File name template" (p. 14) in the relevant standards document.

@valeriupredoi valeriupredoi changed the title Handling of Fx variables as regular variables for CMIP5 and CMIP6 Handling of Fx variables as regular variables for CMIP5 and CMIP6 WITH CMOR checks May 12, 2019
@valeriupredoi
Copy link
Contributor Author

OK guys, this is finally done and tested and I swear it didn't take 2 weeks 😁

@bouweandela
Copy link
Member

Please re-open this pull request in the ESMValTool Core repository. You can do so by pushing this branch to that repository, using the following instructions.

First add the repository with

git remote add esmvalcore [email protected]:esmvalgroup/esmvalcore

or

git remote add esmvalcore https://github.com/esmvalgroup/esmvalcore

if you're not using ssh to connect to GitHub.

Next upload the branch with

git push esmvalcore your_branch

and open a pull request here

@valeriupredoi
Copy link
Contributor Author

ported to ESMValGroup/ESMValCore#19

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants