Skip to content

Commit

Permalink
doc: add tuning parameters for MPI_THREAD_MULTIPLE
Browse files Browse the repository at this point in the history
These are the currently supported tuning parameters for
MPI_THREAD_MULTIPLE.
  • Loading branch information
sagarth committed Apr 1, 2022
1 parent 81a1db8 commit 58f272b
Showing 1 changed file with 100 additions and 0 deletions.
100 changes: 100 additions & 0 deletions doc/mpich/TUNING_PARAMETERS
Original file line number Diff line number Diff line change
Expand Up @@ -12,3 +12,103 @@ Notice that performance of the application depends on how ranks are mapped to
the cores of the processor. For information on how to specify process-core
binding with mpirun, information is provided here:
https://wiki.mpich.org/mpich/index.php/Using_the_Hydra_Process_Manager#Process-core_Binding.

This following section of this document provides information about threading
optimizations (Section 1).

**1. Multi-threading**

MPI provides four different threading modes: MPI_THREAD_SINGLE,
MPI_THREAD_FUNNELED, MPI_THREAD_SERIALIZED, and MPI_THREAD_MULTIPLE.
MPI_THREAD_MULTIPLE is the most advanced threading mode and the focus of this
Section. With this mode, an application runs with several threads, and any
thread can make calls to MPI at any time. Thus, the MPI library needs to protect
access to its internal data structures appropriately. It also tries to exploit
concurrency by taking advantage of the availability of multiple hardware
contexts in the network card, which we call VCIs (Virtual Communication
Interfaces).

MPICH provides two threading synchronization strategies. One strategy is to use
a `global` coarse-grained lock to protect every MPI call that accesses shared
state inside the MPI runtime. With this model, if two threads call concurrently
the MPI runtime, the runtime will serialize the calls. Another strategy uses a
fine-grained lock to protect the access to each network context or VCI. With
this approach, threads trying to access the same VCI get serialized, but threads
accessing different VCIs can execute in parallel. With the fine-grained locking
model, MPICH supports a `direct` mode where the MPI runtime itself grabs a per
VCI lock and a `lockless` mode where MPI does not grab a lock and relies on the
network provider to provide fine grain synchronization (MPICH uses the
FI_THREAD_SAFE mode from OFI to support the `lockless` mode).

A threading synchronization strategy can be selected at the configuration time
as:
--enable-thread-cs=global
to select coarse-grained locking, and
--enable-thread-cs=per-vci
to select fine-grained locking.

Under fine-grained locking, a specific locking mode at configuration time as:
--enable-ch4-mt=<mode>
mode can be one of direct, lockless and runtime.

If configuration is done with --enable-ch4-mt=runtime, user can select a
threading mode at runtime using the environment MPIR_CVAR_CH4_MT_MODEL
(described below in Section 1.1).

**1.1 CVARs (Environment Variables)**

MPICH users can use the CVAR MPIR_CVAR_CH4_MT_MODEL to select at runtime among
the threading synchronization models, `global`, `direct`, or `lockless`. If the
CVAR is not set, the default model is `direct`.

e.g., export MPIR_CVAR_CH4_MT_MODEL=direct

Users can specify the number of VCIs to use by setting the environment variable
`MPIR_CVAR_CH4_NUM_VCIS=<number_vcis>`, where the value `<num_vcis>` should be
less or equal than the number of network contexts provided by the OFI provider.
This number can be retrieved from the OFI provider. Running the `fi_info -v`
command should return the number of transmit contexts (max_ep_tx_ctx) and
receive contexts (max_ep_rx_ctx) it supports. MPICH uses the minimum of
max_ep_tx_ctx and max_ep_rx_ctx as the maximum number of VCIs it can support. If
the system has multiple NICS, this value is the `minimum across all the NICS`.

e.g., export MPIR_CVAR_CH4_NUM_VCIS=16

When multiple VCIs are used, MPICH tries to make per-vci progress to avoid lock
contention inside the progress engine. However, it also progresses all VCIs,
i.e., make global progress, once in a while to avoid potential deadlocks. Such
global progress is turned on by default but can be turned off by setting
`MPIR_CVAR_CH4_GLOBAL_PROGRESS=0`, which can improve performance under the use
of multiple VCIs. However, caution should be taken as it can result in a deadlock.

**1.2 Communicator Info Hints**

When selecting the fine-grain modes (`direct` or `lockless`), messages sent
using different communicators can be sent/received using different VCIs.
However, messages using the same communicator may have to be sent/received
through the same VCI, to guarantee the ordering semantics imposed by the MPI
standard. However, messages sent through the same communicator can be
sent/received through different VCIs if the MPI runtime knows that wildcards
will not be used on that communicator. Thus, MPICH implements the MPI 4.0
per-communicator info hints and recommend users to use them to indicate the
MPI runtime when wildcards are not used, specially when using `direct` or
`lockless` threading modes with multiple VCIs if the threads use the same
communicator.
Here are the standard hints from the MPI 4.0 standard that are supported
(we recommend using both info hints if the application is not using wildcards):

* `mpi_assert_no_any_source`: Set this info hint if the application does
not use the wildcard MPI_ANY_SOURCE on the given communicator.
* `mpi_assert_no_any_tag`: Set this hint if the application does not use
MPI_ANY_TAG on the given communciator.

MPICH also supports a non-standard per-communicator info hint that provides the
user with more control of a communicator-VCI mapping:
* `vci`: Specific sender and receiver VCI can be selected for each communicator
using this hint. Applications can utilize this feature to control how
messages are spread across the VCIs.
For instance, programmers can write the applications so that pairs of threads
communicate by using different communicator, where each communicator can be
mapped to a different VCI through the `vci` info hint. Valid values for this
info hints are 0 to (total vcis - 1). This hint is also utilized by MPICH to
select a progress thread to process reduction collective operation.

0 comments on commit 58f272b

Please sign in to comment.