Commit

Merge pull request #708 from oahull0112/gh-pages
Add description of stall library to Kestrel performance recs
yandthj authored Nov 15, 2024
2 parents d3456b6 + 42ef163 commit bba6c92
Showing 1 changed file with 15 additions and 1 deletion.
16 changes: 15 additions & 1 deletion docs/Documentation/Systems/Kestrel/Running/performancerecs.md
@@ -28,4 +28,18 @@ export MPICH_COLL_OPT_OFF=mpi_allreduce
These environment variables turn off some collective optimizations that we have seen can cause slowdowns. For more information on these environment variables, visit HPE's documentation site [here](https://cpe.ext.hpe.com/docs/mpt/mpich/intro_mpi_ucx.html).

4. For hybrid MPI/OpenMP codes, try requesting more threads per task than you tend to request on Eagle. This may yield performance improvements.


### MPI Stall Library
For calculations requesting roughly 10 or more nodes, you can use the Cray MPICH stall library. This library can help reduce slowdowns caused by congestion in MPI communication, a possible performance bottleneck on Kestrel at that scale. To use the library, first make sure your code was compiled within one of the `PrgEnv-gnu`, `PrgEnv-cray`, or `PrgEnv-intel` programming environments. Then add the following lines to your sbatch submit script:
```
stall_path=/nopt/nrel/apps/cray-mpich-stall
export LD_LIBRARY_PATH=$stall_path/libs_mpich_nrel_{PRGENV-NAME}:$LD_LIBRARY_PATH
export MPICH_OFI_CQ_STALL=1
```
Here, `{PRGENV-NAME}` must be replaced with one of `cray`, `intel`, or `gnu`, matching the programming environment used to compile your code. For example, if you compiled your code within the default `PrgEnv-gnu` environment, you would add the following lines:
```
stall_path=/nopt/nrel/apps/cray-mpich-stall
export LD_LIBRARY_PATH=$stall_path/libs_mpich_nrel_gnu:$LD_LIBRARY_PATH
export MPICH_OFI_CQ_STALL=1
```
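If the same submit script is reused for builds from different programming environments, the directory selection above can be wrapped in a small helper. The following is only a sketch: the function name `stall_lib_dir` and the `prgenv` variable are hypothetical, and it assumes the three directory suffixes listed above are the only valid ones. It also avoids leaving a trailing colon in `LD_LIBRARY_PATH` when that variable starts out empty.

```shell
#!/bin/bash
# Path to the stall library, as given in the instructions above.
stall_path=/nopt/nrel/apps/cray-mpich-stall

# Hypothetical helper: print the library directory for a PrgEnv suffix,
# or fail if the suffix is not one of cray, intel, or gnu.
stall_lib_dir() {
    case "$1" in
        cray|intel|gnu) echo "$stall_path/libs_mpich_nrel_$1" ;;
        *) echo "unknown programming environment: $1" >&2; return 1 ;;
    esac
}

# Assumption: set this by hand to match the PrgEnv-* module used at compile time.
prgenv=gnu

if lib_dir=$(stall_lib_dir "$prgenv"); then
    # Prepend the stall library; only add ":" if LD_LIBRARY_PATH is already set.
    export LD_LIBRARY_PATH=$lib_dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
    export MPICH_OFI_CQ_STALL=1
else
    exit 1
fi
```

Failing early on an unrecognized suffix is a deliberate choice here: a mistyped directory name would otherwise be silently ignored by the dynamic linker, and the job would run with regular MPI.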
The default "stall" of the MPI tasks is 12 microseconds. We recommend trying the default before adjusting the stall time manually. To make the stall longer or shorter, use `export MPICH_OFI_CQ_STALL_USECS=[time in microseconds]`. For example, for 6 microseconds, use `export MPICH_OFI_CQ_STALL_USECS=6`. A stall time of 0 is equivalent to "regular" MPI. As stall time increases, the amount of congestion decreases, up to a calculation-dependent "optimal" stall time. If you need assistance in using this stall library, please email [email protected].
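Because the best stall time depends on the calculation, one way to look for it is to time a short, representative run at a few values. The sketch below is illustrative only: the function name `scan_stall_times` and the candidate values are assumptions (0 as the "regular" MPI baseline, 12 as the default, 6 and 24 as arbitrary neighbors), and timing uses whole seconds from `date`, so it is only meaningful for runs lasting well over a few seconds.

```shell
#!/bin/bash
# Hypothetical helper: run the given launch command once per candidate
# stall time and print the elapsed wall-clock time for each run.
scan_stall_times() {
    local usecs start end
    for usecs in 0 6 12 24; do
        export MPICH_OFI_CQ_STALL_USECS=$usecs
        start=$(date +%s)
        "$@"    # the launch command passed as arguments
        end=$(date +%s)
        echo "stall=${usecs}us elapsed=$((end - start))s"
    done
}
```

Inside an sbatch script, after the stall library exports, this could be called as `scan_stall_times srun ./my_app`, where `./my_app` stands in for your own executable.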
