We are looking for guidance on optimizing run-time speed on WCOSS2 for a couple of different node count configurations for a large regional domain. One is for a small node count (~13 nodes) used in making 1 hour forecasts within the EnKF system (and writing a restart file at the 1 h time). The other is for a large node count (~110 nodes) used in making 60 hour forecasts with hourly history output (and eventually 15 minute history output for the first ~18 hours), and (eventually) 6 h restart writes.
Input files and a run script have been collected under /lfs/h2/emc/lam/noscrub/Matthew.Pyle/rrfs_optimization_update/ on Cactus and Dogwood.
The job_card.sh_6hfore_fullnode script currently is set up to run a 6 h forecast using a 61 node configuration, but can switch to 76, 97, or 110 node configurations easily. If the initialization time could be reduced (it is several minutes for large node counts), and the integration speed could be improved, it would be a great help.
The job_card.sh_1hfore_fullnode script runs a 1 h forecast, writing out a restart file at the end. This general configuration is for the EnKF system, which will run either as 30 members concurrently (each member using 12-15 nodes) or as two batches of 15 members (each member using 25-30 nodes). In either scenario, we would ideally like all of these forecasts to complete within ~10 minutes.
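For reference, the node and wall-time arithmetic behind the two EnKF layouts can be sketched as below. This is only a rough illustration: the function name `ensemble_footprint` is made up for this example, and it assumes the two batches would run back to back, so each batch gets about half of the ~10 minute window.

```python
def ensemble_footprint(members, nodes_per_member, batches, window_min=10.0):
    """Estimate concurrent node usage and per-batch time budget.

    nodes_per_member is a (low, high) range. Assumes batches run one
    after another and split the window evenly (an assumption, not a
    measured constraint).
    """
    members_per_batch = members // batches
    low, high = nodes_per_member
    return {
        "concurrent_nodes": (members_per_batch * low, members_per_batch * high),
        "minutes_per_batch": window_min / batches,
    }


# Layout 1: all 30 members at once, 12-15 nodes each.
one_batch = ensemble_footprint(30, (12, 15), batches=1)

# Layout 2: two batches of 15 members, 25-30 nodes each.
two_batches = ensemble_footprint(30, (25, 30), batches=2)

print(one_batch)
print(two_batches)
```

Under these assumptions both layouts occupy a similar number of nodes at once (roughly 360-450 versus 375-450), but the two-batch layout leaves each forecast only about 5 minutes of wall time, initialization and restart write included.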
How much I/O does each member do, particularly input? What are the file names, how large are they, and how fast do they need to be read? There is evidence that the WCOSS2 filesystem itself is being crushed by this input when ensembles are started, and if so, the I/O needs a redesign. I am late to this investigation and am aware of FMS work to mitigate it.