
Implicit Explicit Vertical Advection (IEVA) #1373

Merged
merged 36 commits into from
Mar 22, 2021
Conversation

louiswicker
Contributor

@louiswicker louiswicker commented Jan 20, 2021

TYPE: new feature

KEYWORDS: IEVA, vertical advection

SOURCE: Louis Wicker (NOAA/NSSL)

DESCRIPTION OF CHANGES:

For grids with large aspect ratios (dx/dz >> 1) that permit explicit convection, the large time step is limited by the
strongest updraft that occurs during the integration. This often results in a time step 20-30% smaller than otherwise
possible, or requires the use of w-filtering, such as latent-heat tendency limiting. Regions of large vertical velocity
are also often very small relative to the domain. The Implicit-Explicit Vertical Advection (IEVA) scheme has been
implemented (see Wicker, L. J., and W. C. Skamarock, 2020: An Implicit–Explicit Vertical Transport Scheme for
Convection-Allowing Models. Mon. Wea. Rev., 148, 3893–3910). It permits a larger time step by partitioning the vertical
transport into an explicit piece, which uses the normal vertical advection schemes already present in WRF, and an
implicit piece, which uses implicit transport (unconditionally stable). The combined scheme permits a larger time step
than could previously be used and reduces the need for w-filtering.
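
To make the partitioning concrete, below is a minimal, self-contained sketch of the splitting idea only, not the actual WRF implementation; the names (ww, wwE, wwI, cr_max) and the layer values are illustrative assumptions.

```
! Hedged sketch: split a vertical velocity profile into an explicit piece, capped at a
! chosen explicit Courant limit, and an implicit remainder for the unconditionally
! stable (tridiagonal) solve. Illustrative values only; not module_ieva_em.F code.
PROGRAM ieva_split_sketch
  IMPLICIT NONE
  INTEGER, PARAMETER :: nk = 5
  REAL, PARAMETER    :: dt = 24.0, cr_max = 1.0
  REAL :: ww(nk), wwE(nk), wwI(nk), dz(nk), w_cap
  INTEGER :: k

  dz = (/ 250., 300., 400., 500., 600. /)    ! illustrative layer depths (m)
  ww = (/ 2.0, 8.0, 25.0, 12.0, 3.0 /)       ! illustrative vertical velocities (m/s)

  DO k = 1, nk
     w_cap  = cr_max * dz(k) / dt            ! largest w the explicit scheme can handle
     wwE(k) = SIGN( MIN( ABS(ww(k)), w_cap ), ww(k) )   ! explicit piece (normal vertical advection)
     wwI(k) = ww(k) - wwE(k)                 ! implicit piece (unconditionally stable solve)
     PRINT '(A,I3,3F10.3)', ' k, ww, wwE, wwI =', k, ww(k), wwE(k), wwI(k)
  END DO
END PROGRAM ieva_split_sketch
```

The explicit part is transported with the standard WRF vertical schemes, while the remainder is handled by the implicit solve, which is what allows the overall time step to exceed the explicit vertical CFL limit.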

The scheme will be useful for CONUS-scale CAM (convection-allowing model) simulations (dx ~ 2-3 km) when the
number of vertical levels is > 50. Time steps can increase to as large as 25 s, depending on the problem. The Wicker
and Skamarock paper demonstrates IEVA's advantages on the 27 April 2011 Alabama tornado outbreak by comparing
it to the operational CAM (the High-Resolution Rapid Refresh, HRRR) configuration. The results show that the HRRR
simulation is stable only up to dt = 20 s, and only with latent-heat limiting on. Using the IEVA scheme, dt can be
increased to 24 s and no latent-heat limiting is needed. Overall integration efficiency increases ~15%, and the IEVA
solutions are closer to a benchmark run using a smaller dt (12 s) than the HRRR simulation is.

LIST OF MODIFIED FILES:
M Registry/Registry.EM_COMMON
M dyn_em/Makefile
M dyn_em/module_advect_em.F
M dyn_em/module_big_step_utilities_em.F
M dyn_em/module_em.F
A dyn_em/module_ieva_em.F
M dyn_em/solve_em.F
M run/README.namelist
A test/em_real/namelist.input.IEVA.4km

TESTS CONDUCTED:

  1. The code in this pull request has been tested repeatedly on the 27 April case for a 24-hour simulation with the parameters set as in Wicker and Skamarock (2020), as well as with a 10-step difference test of the em_quarter_ss case using a single node and 4 nodes. The differences in the results are ~ 10^-5.

  2. Jenkins tests all pass.

RELEASE NOTES: The Implicit-Explicit Vertical Advection (IEVA) scheme has been implemented (see Wicker, L. J., and W. C. Skamarock, 2020: An Implicit–Explicit Vertical Transport Scheme for Convection-Allowing Models. Mon. Wea. Rev., 148, 3893–3910). It permits a larger time step by partitioning the vertical transport into an explicit piece, which uses the normal vertical advection schemes already present in WRF, and an implicit piece, which uses implicit transport (unconditionally stable). The combined scheme permits a larger time step than could previously be used and reduces the need for w-filtering. The scheme will be useful for CONUS-scale CAM (convection-allowing model) simulations (dx ~ 2-3 km) when the number of vertical levels is > 50. In these cases, time steps can increase to as large as 25 s, depending on the problem. Overall integration efficiency increases ~15%, and the IEVA solutions are closer to a benchmark run using a smaller time step.

…k-scalar-tend subroutine (u_old, v_old) to pass through and use in ww_split. Also fixed loop bounds in the *_implicit routines.
…ed so that the IEVA parameter info prints out only when ieva == TRUE, so never if it's off, and only on the 3rd RK step if it's on.
@louiswicker louiswicker requested a review from a team as a code owner January 20, 2021 21:55
@louiswicker
Contributor Author

louiswicker commented Jan 21, 2021 via email

@louiswicker
Contributor Author

louiswicker commented Jan 21, 2021 via email

@louiswicker
Contributor Author

louiswicker commented Jan 21, 2021 via email

…and superfluous argument in advect_scalar_weno
@weiwangncar
Collaborator

@louiswicker @davegill The code base is master. We should change that to develop first.

@louiswicker
Contributor Author

louiswicker commented Jan 21, 2021 via email

@davegill davegill changed the base branch from master to develop January 21, 2021 18:46
…and removed the superfluous c1/c2 arrays from several routines.
@davegill
Contributor

Lou,
It looks like you picked up a version of the code and then copied that into a newer repo. Because of that, there are some changes to the "program feeder" that have gone in during the past few months that your PR overwrites. Basically, all of the missing C1, C2 args need to be restored. For the rest of the IEVA mods, you are not on the hook to fix the program feeder. If you have already done so, no trouble; however, it is not your responsibility.

@weiwangncar
Collaborator

@Plantain Do you have gfortran on your system? If you do, compiling the code with 'configure -D' and running it on your earliest failed case would be helpful.

@weiwangncar
Collaborator

@louiswicker I am looking at your advect_*_implicit routines, and you have these variables declared with memory sizes (ims, ime, etc.): at, bt, ct, rt, bb, btmp. You pass these arrays to tridiag2d with memory sizes, but the arrays are only defined in loops with tile-sized dimensions, i.e., its, ite, and so on. I wonder what happens to the return array (bb) in the 'ghost' zone? In fact, the loop inside tridiag2d is also tiled, so it seems some parts of the memory-sized arrays are never defined. I could be wrong, but I just want to bring it up so that others can comment too.
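
As a hedged, self-contained illustration of that concern (array names, extents, and the solver stand-in are made up; this is not the module_ieva_em.F code): a work array declared with memory-sized extents but filled only over the tile leaves its 'ghost' zone undefined, and a tiled solver never defines the ghost zone of the returned array either.

```
PROGRAM tile_vs_memory_demo
  IMPLICIT NONE
  ! "Memory" (padded) extents vs. "tile" (computed) extents -- illustrative values only
  INTEGER, PARAMETER :: ims=0, ime=11, kms=1, kme=10
  INTEGER, PARAMETER :: its=1, ite=10, kts=1, kte=10
  REAL :: at(ims:ime,kms:kme), bb(ims:ime,kms:kme)
  INTEGER :: i, k

  at = -9999.0                       ! stand-in for "never defined" memory
  bb = -9999.0
  DO k = kts, kte
     DO i = its, ite                 ! only the tile portion of at is ever filled
        at(i,k) = 1.0
     END DO
  END DO

  CALL fake_tridiag(at, bb, ims, ime, kms, kme, its, ite, kts, kte)

  PRINT *, 'bb at a ghost point (ims,kts) = ', bb(ims,kts)   ! still the sentinel value

CONTAINS
  ! Stand-in for tridiag2d: memory-sized dummy arrays, but tile-sized loops, so the
  ! ghost zone of the returned array bb is never defined by the solver either.
  SUBROUTINE fake_tridiag(a, b, ims, ime, kms, kme, its, ite, kts, kte)
    INTEGER, INTENT(IN)    :: ims, ime, kms, kme, its, ite, kts, kte
    REAL,    INTENT(IN)    :: a(ims:ime,kms:kme)
    REAL,    INTENT(INOUT) :: b(ims:ime,kms:kme)
    INTEGER :: i, k
    DO k = kts, kte
       DO i = its, ite
          b(i,k) = a(i,k)
       END DO
    END DO
  END SUBROUTINE fake_tridiag
END PROGRAM tile_vs_memory_demo
```

Whether that undefined ghost data ever matters downstream is exactly the question being raised here.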

@louiswicker
Contributor Author

louiswicker commented Mar 13, 2021 via email

@Plantain

Plantain commented Mar 13, 2021 via email

The variables wwE and wwI were initialized twice when ieva = FALSE.
Move the initialization into the ieva = TRUE loop (with appropriate
vertical loop bounds).

 Changes to be committed:
	modified:   dyn_em/module_ieva_em.F
The memory-sized local variables wwE and wwI were allocated on the stack
for rk_tendency and for each of the multiple calls to rk_scalar_tend.
These are each called on every RK time step. The mods allocate the space
in the solver once per model time step, and since the variables are "i1",
they are not retained for a nested domain.

 Changes to be committed:
	modified:   Registry/Registry.EM_COMMON
	modified:   dyn_em/module_em.F
	modified:   dyn_em/solve_em.F
@davegill
Contributor

@louiswicker @weiwangncar @dudhia
Folks,
My last two commits are for performance of the ieva=off code.

[Figure: Screen Shot 2021-03-21 at 10 27 41 PM — four-panel timing comparison]

This is a 6-h, 3-km simulation near Houston / New Orleans. What is plotted is the percent change in the non-radiation time-step cost compared to the hash just before Lou's changes.

Top left: Lou's code before my mods
Top right: after the introduction of the new way to initialize wwE and wwI
Bottom left: (top right +) promoting wwE and wwI to "i1" in the Registry
Bottom right: (other mods +) complete replacement of the big_step_utilities code with the original code (not included in this proposal at this time)

With a combination of these two mods (the wwE and wwI init, and the promotion to "i1"), the IEVA code is about 1.5% slower than the original when IEVA = FALSE. I find this acceptable.

@davegill
Contributor

@louiswicker @weiwangncar @dudhia
Folks,

  1. This passes the reg tests
Please find result of the WRF regression test cases in the attachment. This build is for Commit ID: cc3c08d3c183fea5562f0062572900534553ffb9, requested by: louiswicker for PR: https://github.com/wrf-model/WRF/pull/1373. For any query please send e-mail to David Gill.

    Test Type              | Expected | Received | Failed
    = = = = = = = = = = = = = = = = = = = = = = = = = = =
    Number of Tests        |    19    |    18    |
    Number of Builds       |    48    |    46    |
    Number of Simulations  |   163    |   161    |    0
    Number of Comparisons  |   103    |   102    |    0

    Failed Simulations are: 
    None
    Which comparisons are not bit-for-bit: 
    None
  2. When the option is turned off, there is < 2% timing impact (from the figures in the comments above). If there is an interest in removing the ENTIRETY of big_step_utilities, the time to do that is now.

I am OK with this PR, as-is.

@Plantain

Plantain commented Mar 22, 2021 via email

@davegill
Contributor

davegill commented Mar 22, 2021

@Plantain @louiswicker @weiwangncar
Wei is going to put in the "tile-sized" mods for the calls to the tri-diagonal solver.

I removed the u/v part of the CFL print (it is just commented out, so it is still easily available). Here's what it looks like:

d01 2017-05-01_00:56:00  cfl:            2  points exceeded W_CRITICAL_CFL in domain d01 at time 2017-05-01_00:56:00 hours
d01 2017-05-01_00:56:00 Max   W: (i,j,k)=  254   64   20  W:   14.97  W-CFL:    1.26  dETA:    0.03

Note that the lower case string "cfl" is now part of the output, for backward compatibility searching.

@davegill
Contributor

@weiwangncar @dudhia @louiswicker
Folks,
Just a quick status update on a requested sanity check:
Yes, the results with optimized code for a 421x421 case are identical with ieva off, comparing the code before Lou's mods and with Lou's mods. This is a 3-km case, and I used a 30 s dt. For a 1-hr simulation (120 model steps), the results are identical.

@davegill davegill self-requested a review March 22, 2021 21:31
@davegill davegill merged commit 4412521 into wrf-model:develop Mar 22, 2021
@weiwangncar
Collaborator

@louiswicker You probably noticed that Dave has merged your code onto the develop branch. This is mostly based on the improved performance when the IEVA code is not used. When IEVA is used, we know there is at least a problem with OMP-compiled code. When compiled without optimization (configure -d), an OMP run should produce results identical to the MPI and serial runs. Currently, the MPI and serial code do produce the same results.

I added some prints into several routines (WW_SPLIT, CALC_NEW_MUT, ADVECT_[u,v,ph,w]_IMPLICIT), and this is what I found (I used Ming's 2017050100 case, reduced the domain size to 161x161x60, but with increased vertical levels in order to get some CFL violations, and did a restart run at hour 1):

  1. The first time WW_SPLIT is called, the results for wwI and wwE are identical between the OMP and serial runs.
  2. I checked one (i,j) point in the new advect routines, and the tendencies appear to be the same.
  3. But by the time WW_SPLIT is called from rk_scalar_tend, the vertical motion as well as MUT have become different between the OMP and serial runs. I haven't found where it diverges.
  4. The differences happen at the boundaries of the OMP decomposition.

@davegill
Contributor

@louiswicker @weiwangncar
Wei,
There are locations in module_ieva_em.F that have loop indices such as:

! Loop limits:  need to compute wwE & wwI one halo ring out for staggered variables.

   ktf     = kte
   i_start = its-1
   i_end   = MIN(ite,ide-1)+1
   j_start = jts-1
   j_end   = MIN(jte,jde-1)+1

For serial and for MPI, the its-1 is not an issue, as it either puts something into a halo region that is not used again (MPI) or writes a throw-away point (serial). For OpenMP, however, this loop overwrites a previous tile's results. Of course, the same applies to j_start.
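
A minimal OpenMP sketch of that overlap (the array name, tile bounds, and values are made up for illustration; this is not WRF code): neighbouring tiles each write one column beyond their own bounds, so both threads store into the same shared columns and the surviving value depends on thread ordering, unlike MPI, where each rank writes into its own private halo copy.

```
PROGRAM tile_overlap_demo
  IMPLICIT NONE
  INTEGER, PARAMETER :: ide = 160, ntiles = 2
  REAL    :: ww(0:ide+1)
  INTEGER :: tile, i, its, ite

  ww = 0.0
!$OMP PARALLEL DO PRIVATE(i, its, ite) SHARED(ww)
  DO tile = 1, ntiles
     its = (tile-1)*ide/ntiles + 1          ! tile 1: 1..80, tile 2: 81..160
     ite = tile*ide/ntiles
     DO i = its-1, ite+1                    ! one column beyond the tile on each side
        ww(i) = REAL(tile)                  ! columns 80 and 81 are written by BOTH tiles
     END DO
  END DO
!$OMP END PARALLEL DO
  PRINT *, 'ww(80), ww(81) = ', ww(80), ww(81)   ! result depends on which thread ran last
END PROGRAM tile_overlap_demo
```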

@weiwangncar
Collaborator

@davegill When I print, I do get two sets of prints, from both tiles: one set is wrong and the other is correct, relative to the serial run. So I'm not sure which one goes on to the next step.
This kind of index definition can be seen in other subroutines (e.g. module_advect_em.F), but it is possible that the arrays the model uses there are state variables. That's what is confusing here: some of the arrays used in IEVA are local ones.

@weiwangncar
Collaborator

@davegill I changed those indices in two subroutines in module_ieva_em.F: ww_split and calc_mut_new. The results are still wrong. It is always the indices...

@louiswicker
Contributor Author

Wei & Dave,

thanks for still working on this a bit. I am not sure what else to say.

I think the best way to deal with this is for me to move the splitting of ww out of the module_em.F, and up to the solver level and then do a halo communication. What do you think about that strategy? I won't be able to get back to this for a while, but when I get a chance, I will download the develop branch and work from there.

I am sorry we cannot get this to work!

Lou

@louiswicker
Contributor Author

The code runs perfectly for the warm-bubble idealized test with n=1 vs n=4 processors. Recompiling for the full-domain tests tonight with optimization to check stability. I think we are (dare I say it) there!

Thanks to all of you!

davegill added a commit that referenced this pull request Mar 27, 2021
TYPE: bug fix

KEYWORDS: IEVA, OpenMP

SOURCE: internal

DESCRIPTION OF CHANGES:
Problem:
With IEVA activated, differences appear between OpenMP results with OMP_NUM_THREADS > 1 and any of the following:
1. serial results
2. MPI results
3. OpenMP results with a single thread

Solution:
Working backwards, the computation in WRF (pre-IEVA code) computed the full MU field only over the mass-point tile:
```
DO j = jts, jte-1
DO i = its, ite-1
```
We extend the computation one grid cell to the left and right:
```
DO j = jts-1, jte-1
DO i = its-1, ite-1
```
Since WRF previously did not use those values, it is not a problem to have additional rows and columns of valid data inside the halo region.

This is a follow-on PR to 4412521 #1373 "Implicit Explicit Vertical Advection (IEVA)".

LIST OF MODIFIED FILES: 
M dyn_em/module_big_step_utilities_em.F

TESTS CONDUCTED: 
1. I used a simple Jan 2000 case, with 60 levels, 30-km resolution, and a 20*dx time step. This caused calls to `advect_u_implicit` and `advect_v_implicit` in the first time step. Without the mods, the code generated different results depending on the number of OpenMP threads. With the mods, the results are bit-for-bit for OpenMP with the standard y-only decomposition and with a manual x-only decomposition. 

Below is a figure of the differences of the V field after the first time step (before the modification). This plot is the difference of the same executable using two different OMP_NUM_THREADS values. After the mod, the results are bit-for-bit.
<img width="1152" alt="Screen Shot 2021-03-25 at 4 05 09 PM" src="https://user-images.githubusercontent.com/12666234/112549911-291e8e80-8d84-11eb-8b03-1e1ea50ef731.png">

Before the mods, during the first time step, the following diffs were apparent along the OpenMP boundaries:
```
Diffing np=1/wrfout_d01_2000-01-24_12:10:00 np=6/wrfout_d01_2000-01-24_12:10:00
 Next Time 2000-01-24_12:10:00
     Field   Ndifs    Dims       RMS (1)            RMS (2)     DIGITS    RMSE     pntwise max
         U     49384    3   0.2158843112E+02   0.2158843113E+02   9   0.5344E-05   0.3589E-05
         V     61738    3   0.1834835473E+02   0.1834835712E+02   6   0.1045E-03   0.2183E-03
         W    139132    3   0.4977466348E-01   0.4977466098E-01   7   0.3382E-05   0.4809E-03
        PH     66955    3   0.2327166773E+04   0.2327166753E+04   8   0.1078E-02   0.7572E-05
         T      4838    3   0.7925254902E+02   0.7925254902E+02  12   0.9349E-05   0.2484E-05
       THM      4812    3   0.7921679023E+02   0.7921679023E+02  12   0.9289E-05   0.2484E-05
        MU      1286    2   0.1460135950E+04   0.1460135956E+04   8   0.1203E-02   0.5148E-05
         P      6737    3   0.6512715435E+03   0.6512716390E+03   6   0.2086E-01   0.8162E-03
    QVAPOR     26582    3   0.2913825518E-02   0.2913825518E-02   9   0.4536E-09   0.5671E-05
    QCLOUD       429    3   0.6474288021E-05   0.6474289263E-05   6   0.3257E-09   0.3024E-03
      QICE       715    3   0.4136477606E-05   0.4136463263E-05   5   0.1303E-09   0.1757E-03
     QNICE       676    3   0.4164261806E+06   0.4164261805E+06   9   0.1341E+00   0.1125E-05
    RAINNC        94    2   0.3158246772E-02   0.3158239178E-02   5   0.9447E-07   0.1558E-03
    SNOWNC        94    2   0.3158246772E-02   0.3158239178E-02   5   0.9447E-07   0.1558E-03
        SR         1    2   0.3353836226E+00   0.3353836226E+00   9   0.9006E-09   0.5960E-07
```

2. Wei successfully tested a separate case with 1x16 and 16x1 OpenMP decompositions, which showed diffs (not bit-for-bit) without the mods.
3. Jenkins tests are all PASS.
davegill added a commit that referenced this pull request Mar 29, 2021
TYPE: bug fix

KEYWORDS: IEVA, TLADJ, solve

SOURCE: internal

DESCRIPTION OF CHANGES:
Problem:
After the IEVA mods (commit 4412521, #1373 "Implicit Explicit Vertical Advection (IEVA)"), which changed the calls to 
rk_tendency and rk_scalar_tend, the WRFPlus code no longer compiled.

Solution:
The new arguments added to the calls to rk_tendency and rk_scalar_tend have been added inside the solve routine
for WRFPlus.

LIST OF MODIFIED FILES:
wrftladj/solve_em_ad.F

TESTS CONDUCTED: 
1. Without mods, there are compiler errors from missing args. After the mods:
```
> ls -ls main/*.exe
94944 -rwxr-xr-x 1 gill p66770001 97217616 Mar 29 10:11 main/wrfplus.exe
```
2. The WRFDA regtest is OK.
3. Jenkins is all PASS.
davegill added a commit that referenced this pull request May 10, 2021
TYPE: bug fix

KEYWORDS: IEVA, cfl

SOURCE: internal

DESCRIPTION OF CHANGES:
This is a clean-up PR to 4412521 #1373 "Implicit Explicit Vertical Advection (IEVA)". We are resetting the
default critical value that activates the w_damping option to its previous setting.

Problem:
The new namelist option `w_crit_cfl` replaces `w_beta`, which used to be set in module_model_constants.F with a value of 1.0. Before this PR, the default value of `w_crit_cfl` was set to 1.2 in the Registry. If one did not use the new namelist option to manually reset `w_crit_cfl` to 1.0, w_damping would behave differently from previous releases.

Solution:
1. In consultation with the developer, the value for `w_crit_cfl` is now set to 1.0 in the Registry file. This gives similar and expected behavior for when w_damping kicks in (a minimal namelist sketch follows after this list).
2. Also, a bit of column aligning in the neighborhood of this change was done to make the Registry a bit more tidy. "Try and leave this world a little better than you found it.", Robert Stephenson Smyth Baden-Powell.
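
For reference, a minimal namelist fragment is sketched below. Only `w_crit_cfl` is discussed in this PR; the `&dynamics` placement and the `zadvect_implicit` switch for turning IEVA on are assumptions here, so consult test/em_real/namelist.input.IEVA.4km for the settings actually shipped with the IEVA PR.

```
&dynamics
 zadvect_implicit = 1,     ! assumed name of the IEVA on/off switch (illustrative)
 w_crit_cfl       = 1.0,   ! critical W Courant number; default restored to 1.0 by this PR
/
```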

LIST OF MODIFIED FILES: 
modified:   Registry/Registry.EM_COMMON

TESTS CONDUCTED: 
1. There are no problems to test, just resetting the critical value for activating w_damping (from 1.2 to 1.0).
2. Let us all hope that Jenkins tests are all PASS.