Diagnostics Downsampling by a factor 2 for MOM6 #869

nikizadehgfdl · 2018-10-25T15:34:27Z

@adcroft @Hallberg-NOAA @StephenGriffies This pull request is solely to ease the review process. Please review the code change (mainly one file), and in particular the main algorithm starting at line 3644 of MOM_diag_mediator.F90.

This code update brings in a preliminary capability for diagnostics decimation. Current limitations include (but may not be limited to):

- Commensurate layouts only. The layout should be chosen so that the coarsened cells do not cross processor boundaries, i.e., NIGLOBAL/layout_x and NJGLOBAL/layout_y should both be divisible by the decimation level dl (if such a diagnostic is present in the diag_table). It is hard to lift this limitation, and doing so may not be desirable, since the required halo updates would slow down the model and defeat the purpose of this project.
- Only dl=2 is supported. It is easy to extend this to dl=4, but for a general dl we need to design an array of grids so that we have access to, e.g., G(dl=6)%isc, etc.
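
The commensurate-layout condition can be checked before a run. The following is a minimal sketch in Python, not part of the MOM6 code: the function name `is_commensurate` is hypothetical, the grid sizes are illustrative, and it assumes the global grid is split evenly across the processor layout.

```python
def is_commensurate(niglobal, njglobal, layout_x, layout_y, dl=2):
    """Return True if every PE's compute domain holds whole dl x dl blocks.

    Assumption: the global grid is divided evenly among the PEs, so each PE
    owns niglobal/layout_x by njglobal/layout_y cells.
    """
    # An uneven split is treated as not commensurate in this sketch.
    if niglobal % layout_x or njglobal % layout_y:
        return False
    ni_per_pe = niglobal // layout_x
    nj_per_pe = njglobal // layout_y
    return ni_per_pe % dl == 0 and nj_per_pe % dl == 0


# Illustrative sizes only.
print(is_commensurate(1440, 1080, 36, 30))  # 40 x 36 cells per PE
print(is_commensurate(1440, 1080, 32, 30))  # 45 x 36 cells per PE
```

With dl=2 the first layout passes and the second fails, because 45 cells per PE in x cannot be grouped into pairs without crossing a processor boundary.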

The good news is that this update seems to be working, reducing the size of history files significantly and without an obvious performance hit.
You can preview and explore some of the results via this notebook:
https://github.com/nikizadehgfdl/grid_generation/blob/master/jupynotebooks/DiagnosticsDataDecimation.ipynb

- To produce the full diagnostics for the 1/8 degree model, the size of the output files needs to be reduced. This could be done by "averaging" over a few neighboring grid cells and outputting the resulting fields on the reduced domain. That is what we call decimation, and it is the purpose of this project branch.

- Prototype zaps all diagnostics by a factor of 2
  - Works only for the native grid diagnostics
  - _z diagnostics complain about the local mask array index

- Next: make diag decimation optional at diag_table level

- This update allows the user to request level-2 decimated diagnostics in the diag_table, as the following example shows:

      OMp5
      1900 1 1 0 0 0
      "ocean_hour", 0, "days", 1, "days", "time"
      "ocean_model", "tos", "tos", "ocean_hour", "all", "mean", "none",2
      "ocean_model", "thetao", "thetao", "ocean_hour", "all", "mean", "none",2
      "ocean_model", "umo", "umo", "ocean_hour", "all", "mean", "none",2
      "ocean_model", "vmo", "vmo", "ocean_hour", "all", "mean", "none",2
      "ocean_model", "volcello", "volcello", "ocean_hour", "all", "mean", "none",2 # Cell measure for 3d data
      "ocean_hour_d2", 0, "days", 1, "days", "time"
      "ocean_model_d2", "tos", "tos", "ocean_hour_d2", "all", "mean", "none",2
      "ocean_model_d2", "thetao", "thetao", "ocean_hour_d2", "all", "mean", "none",2
      "ocean_model_d2", "umo", "umo", "ocean_hour_d2", "all", "mean", "none",2
      "ocean_model_d2", "vmo", "vmo", "ocean_hour_d2", "all", "mean", "none",2
      "ocean_model_d2", "volcello", "volcello", "ocean_hour_d2", "all", "mean", "none",2 # Cell measure for 3d data

- At the moment it works only for "Native" grid diagnostics and level 2 decimation (bination?)
- It has to be extended to non-native diagnostics, e.g.,

      "ocean_model_z_d2", "tos", "tos", "ocean_hour_z_d2", "all", "mean", "none",2

- It has to be extended to arbitrary levels of decimation, e.g.,

      "ocean_model_z_d4", "tos", "tos", "ocean_hour_z_d4", "all", "mean", "none",2
      "ocean_model_z_d2", "tos", "tos", "ocean_hour_z_d2", "all", "mean", "none",2

- Also, note that this prototype only works for smart choices of layouts where "combined" cells are on the same pe. We need a major design revision to extend this to arbitrary layouts, which would need halo updates and halo handling.

- This update allows using non-native and decimated diagnostics as well as their combinations. E.g., it works for a diag_table as shown below.
- I have to
  - validate with a full set of diagnostics
  - validate that individual diagnostics make sense
  - study the memory footprint to make sure the decimate routines have no leak (due to extensive use of Fortran pointers)
- Also, we have to work on averaging rather than sub-sampling of the fields, as is done in this prototype

      OM5p5
      1900 1 1 0 0 0
      "ocean_hour", 0, "days", 1, "days", "time"
      "ocean_model", "tos", "tos", "ocean_hour", "all", "mean", "none",2
      "ocean_model", "thetao", "thetao", "ocean_hour", "all", "mean", "none",2
      "ocean_model", "umo", "umo", "ocean_hour", "all", "mean", "none",2
      "ocean_model", "vmo", "vmo", "ocean_hour", "all", "mean", "none",2
      "ocean_model", "volcello", "volcello", "ocean_hour", "all", "mean", "none",2 # Cell measure for 3d data
      "ocean_hour_d2", 0, "days", 1, "days", "time"
      "ocean_model_d2", "tos", "tos", "ocean_hour_d2", "all", "mean", "none",2
      "ocean_model_d2", "thetao", "thetao", "ocean_hour_d2", "all", "mean", "none",2
      "ocean_model_d2", "umo", "umo", "ocean_hour_d2", "all", "mean", "none",2
      "ocean_model_d2", "vmo", "vmo", "ocean_hour_d2", "all", "mean", "none",2
      "ocean_model_d2", "volcello", "volcello", "ocean_hour_d2", "all", "mean", "none",2 # Cell measure for 3d data
      "ocean_hour_z", 0, "days", 1, "days", "time"
      "ocean_model_z", "thetao", "thetao", "ocean_hour_z", "all", "mean", "none",2
      "ocean_model_z", "umo", "umo", "ocean_hour_z", "all", "mean", "none",2
      "ocean_model_z", "vmo", "vmo", "ocean_hour_z", "all", "mean", "none",2
      "ocean_model_z", "volcello", "volcello", "ocean_hour_z", "all", "mean", "none",2 # Cell measure for 3d data
      "ocean_hour_z_d2", 0, "days", 1, "days", "time"
      "ocean_model_z_d2", "thetao", "thetao", "ocean_hour_z_d2", "all", "mean", "none",2
      "ocean_model_z_d2", "umo", "umo", "ocean_hour_z_d2", "all", "mean", "none",2
      "ocean_model_z_d2", "vmo", "vmo", "ocean_hour_z_d2", "all", "mean", "none",2
      "ocean_model_z_d2", "volcello", "volcello", "ocean_hour_z_d2", "all", "mean", "none",2 # Cell measure for 3d data

- The design of the decimating subroutines with pointer manipulations was bad and caused a memory leak. Using "allocatable" arrays instead is not as elegant, but it avoids memory leaks at the cost of a few lines of code for allocating temporary arrays outside the decimating subroutines. Fortran automatically deallocates unsaved local "allocatable"s when their scope ends (unlike pointers).

- This update introduces aggregation methods, so that we can point average the fields rather than subsampling. This can be extended to fancier methods such as area or volume averaging.
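
The difference between the two approaches can be seen on a tiny field. This is a minimal sketch in Python with hypothetical function names, not the Fortran in this PR; it assumes that subsampling keeps the first fine cell of each dl x dl block and that averaging is an unweighted mean over the block.

```python
def subsample(field, dl=2):
    """Keep one fine cell per dl x dl block (assumed: the block's first cell)."""
    return [row[::dl] for row in field[::dl]]


def block_mean(field, dl=2):
    """Unweighted mean of each dl x dl block of fine cells."""
    nj, ni = len(field), len(field[0])
    out = []
    for J in range(0, nj, dl):
        out.append([
            sum(field[J + j][I + i] for j in range(dl) for i in range(dl)) / (dl * dl)
            for I in range(0, ni, dl)
        ])
    return out


fine = [[1.0, 3.0, 5.0, 7.0],
        [1.0, 3.0, 5.0, 7.0]]
print(subsample(fine))   # [[1.0, 5.0]]
print(block_mean(fine))  # [[2.0, 6.0]]
```

Subsampling discards three of every four values, while the block mean uses all of them, which is why the averaged field is the better representation of the fine-grid data.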

- The masks for non-native decimated diags were not set right
- Some cleanup of the code to consolidate new calls
- Note that locmask => NULL() should be in the body of subroutines, not in the declaration section. If it is in the declaration section, it is set to null only on the first entry (it is automatically "save"ed), and on subsequent entries it is whatever it was the last time.

- All decimated axes need to have the non-decimated mask3d fields initialized correctly. The non-decimated masks are being used in the decimation algorithm for the diagnostics fields

- According to Alistair, the decimation method could be deduced solely from axes%x_cell_method, axes%y_cell_method and probably the area_cell_method at the time of send_data
- This is the summary of the algorithm:

      f(Id,Jd) = \sum_{i,j} f(Id+i,Jd+j) * weight(Id+i,Jd+j) / [ \sum_{i,j} weight(Id+i,Jd+j) ]

  Id, Jd are the decimated (coarse grid) indices, which run over the coarsened compute grid; i and j run from 0 to dl-1 (dl being the decimation level); if and jf run over the original fine compute grid.

  | x_cell_method | y_cell_method | area_cell_method | weight(if,jf) | example |
  |---|---|---|---|---|
  | mean | mean | mean | `A(if,jf)*h(if,jf)` | theta |
  | point | mean | mean | `dy(if,jf)*h(if,jf)` | u |
  | mean | point | mean | `dx(if,jf)*h(if,jf)` | v |
  | mean | mean | sum | `A(if,jf)` | `h*theta` |
  | sum | sum | sum | `1` | volcello |
  | point | sum | sum | `1` | umo |
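
The weight selection and the weighted average for a single coarse cell can be sketched as follows. This is a hedged illustration in Python, not the Fortran in MOM_diag_mediator.F90: the names `WEIGHTS` and `coarse_value` are hypothetical, the dictionary is simply a transcription of the cell-method combinations listed above, and returning `None` when all weights vanish is an assumption about how an empty block might be flagged.

```python
# Weight per fine cell, keyed by (x_cell_method, y_cell_method, area_cell_method).
# A is the cell area, dx and dy the cell widths, h the layer thickness.
WEIGHTS = {
    ("mean", "mean", "mean"): lambda A, dx, dy, h: A * h,    # e.g. theta
    ("point", "mean", "mean"): lambda A, dx, dy, h: dy * h,  # e.g. u
    ("mean", "point", "mean"): lambda A, dx, dy, h: dx * h,  # e.g. v
    ("mean", "mean", "sum"): lambda A, dx, dy, h: A,         # e.g. h*theta
    ("sum", "sum", "sum"): lambda A, dx, dy, h: 1.0,         # e.g. volcello
    ("point", "sum", "sum"): lambda A, dx, dy, h: 1.0,       # e.g. umo
}


def coarse_value(values, weights):
    """Weighted average of the fine cells that make up one coarse cell."""
    total = sum(weights)
    if total == 0.0:
        return None  # assumption: no valid fine cell, so no coarse value
    return sum(v * w for v, w in zip(values, weights)) / total


# One 2 x 2 block of theta, flattened; the last cell has vanished thickness.
theta = [10.0, 20.0, 30.0, 40.0]
area = [1.0, 1.0, 1.0, 1.0]
h = [1.0, 1.0, 2.0, 0.0]
weight_of = WEIGHTS[("mean", "mean", "mean")]
w = [weight_of(A, 0.0, 0.0, hk) for A, hk in zip(area, h)]
print(coarse_value(theta, w))  # 22.5
```

The vanished cell contributes nothing to either sum, so the coarse value is (10 + 20 + 60) / 4 = 22.5 rather than the unweighted mean of 25.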

- This commit extends the proposed decimation algorithm to cover all the diagnostics present in the OM4_025 diag_table. There may be more cases that need to be coded up later.

- Beware! Currently only commensurate layouts are supported, i.e., the decimated subgrid cells should all be contained in the same core (pe). For this to happen, the layout of the model runs should be chosen so that NIGLOBAL/layout_x and NJGLOBAL/layout_y are both divisible by dl (the decimation level) if a _dl diagnostic is present in the diag_table
- This is a major limitation of the current implementation. But the extension to arbitrary layouts would require cross-processor communications (halo updates), which may slow down the model considerably and defeat the purpose of decimation.

ashao · 2018-10-25T16:22:59Z

Thanks @nikizadehgfdl for implementing this exciting new capability.

I'll try to look at this a little more carefully later, but here are some of my initial comments based on a brief scan of the code. (I'll also include these at the appropriate points in the PR as comments.)

- It looks like the explicit specification of method could be replaced, because the information in the x, y, v cell methods is sufficient to determine which 'method' of decimation/aggregation to use.
- There still is the problem of continental boundaries, to which I have no good solution.
- Reduce the amount of duplicated code and aid future development by putting the calculation of the decimated grid value into a function. Maybe decimate_field_3d could also use decimate_field_2d?
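
One way to read the last suggestion is sketched below in Python. The function names are hypothetical stand-ins, not the routines in this PR: a single helper computes one coarse value, the 2D routine calls it for every block, and the 3D routine calls the 2D routine for every layer. It assumes layers are the outermost index and that a block with zero total weight (e.g. all land) is flagged with `None`.

```python
def block_value(f, w, J, I, dl):
    """Weighted mean over the dl x dl fine cells whose first cell is (J, I)."""
    num = den = 0.0
    for j in range(J, J + dl):
        for i in range(I, I + dl):
            num += f[j][i] * w[j][i]
            den += w[j][i]
    return num / den if den > 0.0 else None  # assumption: None marks empty blocks


def decimate_2d(f, w, dl=2):
    """Downsample one horizontal slice using the shared block helper."""
    return [[block_value(f, w, J, I, dl) for I in range(0, len(f[0]), dl)]
            for J in range(0, len(f), dl)]


def decimate_3d(f3d, w3d, dl=2):
    """Apply the 2D routine layer by layer (layers are the outermost index here)."""
    return [decimate_2d(fk, wk, dl) for fk, wk in zip(f3d, w3d)]


layer = [[1.0, 3.0], [5.0, 7.0]]
ones = [[1.0, 1.0], [1.0, 1.0]]
land = [[0.0, 0.0], [0.0, 0.0]]
print(decimate_3d([layer, layer], [ones, land]))  # [[[4.0]], [[None]]]
```

Keeping the per-block arithmetic in one place means a new aggregation method only has to be added once, and the 3D path inherits it automatically.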

balaji-gfdl · 2018-10-25T19:15:59Z

Can I just state for the record that decimation specifically means removal of a tenth (i.e., *0.9)? It comes from a form of punishment in Roman armies, where they would randomly kill one out of every 10 soldiers.

I am (quite seriously) suggesting we use an accurate technical term, rather than an inaccurate metaphorical term: the correct term for what we are doing is "downsampling", which is precise and unambiguous.

adcroft

What does zap2 abbreviate?

src/core/MOM_grid.F90

adcroft · 2018-10-25T20:09:36Z

src/framework/MOM_domains.F90

@@ -1566,6 +1569,28 @@ subroutine MOM_domains_init(MOM_dom, param_file, symmetric, static_memory, &
    endif
  endif

+  global_indices(1) = 1 ; global_indices(2) = int(MOM_dom%niglobal/2)
+  global_indices(3) = 1 ; global_indices(4) = int(MOM_dom%njglobal/2)
+  xhalo_zap2 = int(MOM_dom%nihalo/2)


I recommend a halo of 1 (or 0?) since we're not doing wide-stencil computations. It should not change anything.

Unfortunately this does not work. I have to choose MOM_dom%nihalo/dl, otherwise the checks on the downsampled field size complain that it does not match the expected size of original_dszi/dl.
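
The size constraint can be illustrated with a little arithmetic. This is a minimal sketch in Python with hypothetical names, assuming the data domain along one axis is the compute domain plus a halo on each side and that the check compares the coarse data size against the fine data size divided by dl.

```python
def data_size(n_compute, halo):
    """Size of the data domain along one axis: compute cells plus a halo on each side."""
    return n_compute + 2 * halo


def downsampled_halo_ok(n_compute, halo, halo_d, dl=2):
    """True if the coarse data size equals the fine data size divided by dl."""
    fine = data_size(n_compute, halo)
    coarse = data_size(n_compute // dl, halo_d)
    return fine % dl == 0 and coarse == fine // dl


# Illustrative numbers: 40 compute cells with a halo of 4 on the fine grid.
print(downsampled_halo_ok(40, 4, 4 // 2))  # True:  20 + 2*2 == 48 // 2
print(downsampled_halo_ok(40, 4, 1))       # False: 20 + 2*1 != 48 // 2
```

Under these assumptions only a coarse halo of nihalo/dl keeps the two sizes consistent, which matches the behaviour described in the reply.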

adcroft · 2018-10-25T20:13:13Z