Merge branch 'master' into fix_data_array_destruct
jjhursey authored Nov 3, 2021
2 parents 64aa1ae + b19233d commit fac7970
Showing 5 changed files with 156 additions and 35 deletions.
57 changes: 57 additions & 0 deletions App_Use_Cases.tex
Expand Up @@ -448,4 +448,61 @@ \subsubsection{Coordinating at Runtime with Multiple Event Handlers}
\refconst{PMIX_EVENT_PARTIAL_ACTION_TAKEN} \\
\refconst{PMIX_EVENT_ACTION_DEFERRED} \\

\section{MPI Sessions}
\label{app:uc-MPI-sessions}

\subsection{Use Case Summary}
MPI Sessions addresses a number of limitations of the current MPI programming model. Among the immediate problems MPI Sessions is intended to address are the following:

\begin{itemize}
\item MPI cannot be initialized within an MPI process from different application components without a priori knowledge or coordination.
\item MPI cannot be initialized more than once, and MPI cannot be reinitialized after \code{MPI_Finalize} has been called.
\end{itemize}

With MPI Sessions, an application no longer needs to explicitly call \code{MPI_Init} to make use of MPI, but rather can use a Session to initialize only the MPI resources required for specific communication needs. Unless the MPI process explicitly calls \code{MPI_Init}, there is also no explicit \code{MPI_COMM_WORLD} communicator. Sessions can be created and destroyed multiple times in an MPI process.

\subsection{Use Case Details}

\begingroup
\begin{figure*}
\begin{center}
\includegraphics[width=.5\textwidth,keepaspectratio]{figs/mpi-sessions1}
\end{center}
\caption{MPI Communicator from MPI Session Handle using PMIx}
\label{fig:mpi_s1}
\end{figure*}
\endgroup

A PMIx Process Set (PSET) is a label, either provided by the user or assigned
by the host environment, that is associated with a given set of application
processes. Processes can belong to multiple process sets at a time. A PMIx
process set is typically defined at the time of application execution, e.g., on a
command line: \code{prun -n 4 --pset ocean myoceanapp : -n 3 --pset ice myiceapp}

PMIx PSETs are used for query functions (\code{MPI_SESSION_GET_NUM_PSETS}, \code{MPI_SESSION_GET_NTH_PSET}) and to create an \code{MPI_GROUP} from a process set name.

In Open MPI's MPI Sessions prototype, PMIx groups are used during creation of an \code{MPI_COMM} from an \code{MPI_GROUP}. The PMIx group constructor returns a 64-bit PMIx Group Context Identifier (PGCID) that is guaranteed to be unique for the duration of an allocation (in the case of a batch managed environment). This PGCID could be used as a direct replacement for the existing unique identifiers for communicators in MPI (e.g., Communicator Identifiers (CIDs) in Open MPI), but doing so may have performance implications.

There is an important distinction between process sets and process groups. Process set identifiers are set by the host environment, and PMIx currently provides no APIs by which an application can change process set membership. In contrast, PMIx process groups can only be defined dynamically by the application.
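
As a rough illustration of the query pattern described above, the following C sketch models the two process sets from the \code{prun} example as a static table. All names in it (\code{pset_t}, \code{num_psets}, \code{nth_pset}, \code{pset_contains}) are hypothetical and are not part of PMIx or MPI. The rank assignment (ranks 0--3 in \code{ocean}, ranks 4--6 in \code{ice}) is likewise an assumption about how the launcher numbers the processes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory model of a process set: a label plus member ranks.
 * This is NOT a PMIx data structure; it only illustrates the concept. */
typedef struct {
    const char *name;
    const unsigned *ranks;
    size_t nranks;
} pset_t;

/* Assumed rank layout for:
 *   prun -n 4 --pset ocean myoceanapp : -n 3 --pset ice myiceapp */
static const unsigned ocean_ranks[] = {0, 1, 2, 3};
static const unsigned ice_ranks[] = {4, 5, 6};

static const pset_t psets[] = {
    {"ocean", ocean_ranks, sizeof(ocean_ranks) / sizeof(ocean_ranks[0])},
    {"ice", ice_ranks, sizeof(ice_ranks) / sizeof(ice_ranks[0])},
};

/* Number of process sets known to this (hypothetical) host environment. */
size_t num_psets(void)
{
    return sizeof(psets) / sizeof(psets[0]);
}

/* Name of the n-th process set, or NULL if n is out of range. */
const char *nth_pset(size_t n)
{
    return n < num_psets() ? psets[n].name : NULL;
}

/* True if `rank` is a member of the process set called `name`. */
bool pset_contains(const char *name, unsigned rank)
{
    assert(name != NULL);
    for (size_t i = 0; i < num_psets(); i++) {
        if (strcmp(psets[i].name, name) != 0) {
            continue;
        }
        for (size_t j = 0; j < psets[i].nranks; j++) {
            if (psets[i].ranks[j] == rank) {
                return true;
            }
        }
    }
    return false;
}
```

The table is read-only, mirroring the point that an application cannot change process set membership. A caller would iterate from \code{0} to \code{num_psets() - 1}, fetch each name with \code{nth_pset}, and then build a group from the chosen name.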

\littleheader{Related Interfaces}

{\large \refapi{PMIx_Get}}
\pasteSignature{PMIx_Get}

{\large \refapi{PMIx_Group_construct}}
\pasteSignature{PMIx_Group_construct}

\littleheader{Related Attributes}

\pasteAttributeItem{PMIX_PSET_NAMES}
\pasteAttributeItem{PMIX_QUERY_NUM_GROUPS}
\pasteAttributeItem{PMIX_QUERY_GROUP_NAMES}
\pasteAttributeItem{PMIX_QUERY_GROUP_MEMBERSHIP}

\littleheader{Related Constants}

\refconst{PMIX_SUCCESS}
\refconst{PMIX_ERR_NOT_SUPPORTED}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
33 changes: 33 additions & 0 deletions Chap_Introduction.tex
Expand Up @@ -200,3 +200,36 @@ \subsection{Attributes in PMIx}
}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{PMIx Roles}

The role of a \ac{PMIx} process in the \ac{PMIx} universe falls into one of three categories based on how it operates in the \ac{PMIx} environment, namely as a \emph{client}, \emph{server}, or \emph{tool}.
As a result, there are three corresponding groupings of \acp{API}, each with its own initialization and finalization functions.
If a process initializes as either a \emph{server} or a \emph{tool}, that process may also access all of the \emph{client} \acp{API}.

A process operating as a \refterm{client} is connected to the \ac{PMIx} server instance within an \ac{RM} when the client calls the client \ac{PMIx} initialization routine.
The \refterm{client} is typically started directly or indirectly (for example, by an intermediate script) by that \ac{RM}.
Additionally, a \refterm{client} may be started directly by the user and then connect to an \ac{RM}, which is typically referred to as a \declareterm{singleton} launch.
A process operating as a \declareterm{server} is responsible for starting client processes and coordinating with other server and tool processes in the same \ac{PMIx} universe.
Often processes operating as a \emph{server} are part of the \acf{RM} infrastructure.
A process operating as a \declareterm{tool} is started independently (e.g., via fork/exec) or by the \ac{RM} and will connect to a \ac{PMIx} \emph{server} to interact with the processes in the \ac{PMIx} universe.
An example of a \emph{tool} process is a parallel debugger that will connect to the server to assist with attaching to a set of client processes.

\ac{PMIx} serves as a conduit between processes acting in these three different roles.
As such, an \ac{API} is often described by how it interacts with processes operating in other roles in the \ac{PMIx} universe.

\adviceimplstart
A \ac{PMIx} implementation may support all or a subset of the \ac{API} role groupings defined in the standard.
A common nomenclature is defined here to aid in identifying levels of conformance of an implementation.

Note that it would not make sense for an implementation to exclude the \emph{client} interfaces, since they are also used by the \emph{server} and \emph{tool} roles.
Therefore the \emph{client} interfaces represent the minimal set of required functionality for \ac{PMIx} compliance.

A \ac{PMIx} implementation that supports only the \emph{client} \acp{API} is said to be \emph{client-role \ac{PMIx} standard compliant}.
Similarly, a \ac{PMIx} implementation that only supports the \emph{client} and \emph{tool} \acp{API} is said to be \emph{client-role and tool-role \ac{PMIx} standard compliant}.
Finally, a \ac{PMIx} implementation that only supports the \emph{client} and \emph{server} \acp{API} is said to be \emph{client-role and server-role \ac{PMIx} standard compliant}.

A \ac{PMIx} implementation that supports all three sets of the \ac{API} role groupings is said to be \emph{client-role, server-role, and tool-role \ac{PMIx} standard compliant}.
These \emph{client-role, server-role, and tool-role \ac{PMIx} standard compliant} implementations have the advantage of being able to support a broad set of \ac{PMIx} consumers in the different roles.
\adviceimplend
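
The conformance nomenclature above can be summarized as a small lookup. The following C sketch is illustrative only: the flag names (\code{ROLE_CLIENT}, \code{ROLE_SERVER}, \code{ROLE_TOOL}) and the function \code{compliance_label} are hypothetical and are not defined by the \ac{PMIx} Standard. It assumes that an implementation lacking the \emph{client} interfaces has no conformance label at all.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit flags for the three API role groupings. */
enum role_flags {
    ROLE_CLIENT = 1 << 0,
    ROLE_SERVER = 1 << 1,
    ROLE_TOOL   = 1 << 2
};

/* Map a set of supported role groupings to the conformance label used in
 * the text. Returns NULL when the client APIs are missing, because they
 * are the minimal required set. */
const char *compliance_label(unsigned roles)
{
    /* Reject bits that do not correspond to a known role. */
    assert((roles & ~(unsigned)(ROLE_CLIENT | ROLE_SERVER | ROLE_TOOL)) == 0);

    if (!(roles & ROLE_CLIENT)) {
        return NULL;
    }
    switch (roles & (ROLE_SERVER | ROLE_TOOL)) {
    case 0:
        return "client-role PMIx standard compliant";
    case ROLE_TOOL:
        return "client-role and tool-role PMIx standard compliant";
    case ROLE_SERVER:
        return "client-role and server-role PMIx standard compliant";
    default:
        return "client-role, server-role, and tool-role PMIx standard compliant";
    }
}
```

For example, \code{compliance_label(ROLE_CLIENT | ROLE_TOOL)} yields the client-role and tool-role label, while \code{compliance_label(ROLE_SERVER)} yields \code{NULL}.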

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
71 changes: 48 additions & 23 deletions Chap_Revisions.tex
Expand Up @@ -1199,12 +1199,22 @@ \section{Version 4.1: TBD}
The v4.1 update includes clarifications and corrections from the v4.0 document:

\begin{compactitemize}
\item Remove some stale language in \refsection{chap:api_event:notify}{Chapter 9.1}.
\item Provisional Items:
\begin{compactitemize}
\item Storage \chapterref{chap:api_storage}
\end{compactitemize}
\end{compactitemize}

\subsection{Added Functions (Provisional)}

\begin{compactitemize}
\item \refapi{PMIx_Data_load}
\item \refapi{PMIx_Data_unload}
\item \refapi{PMIx_Data_compress}
\item \refapi{PMIx_Data_decompress}
\end{compactitemize}

\subsection{Added Data Structures (Provisional)}

\begin{compactitemize}
Expand All @@ -1214,33 +1224,48 @@ \subsection{Added Data Structures (Provisional)}
\item \refstruct{pmix_storage_access_type_t}
\end{compactitemize}

\subsection{Added Macros (Provisional)}

\begin{compactitemize}
\item \refmacro{PMIX_NSPACE_INVALID}
\item \refmacro{PMIX_RANK_IS_VALID}
\item \refmacro{PMIX_PROCID_INVALID}
\item \refmacro{PMIX_PROCID_XFER}
\end{compactitemize}

\subsection{Added Constants (Provisional)}

\begin{compactitemize}
\item \refconst{PMIX_PROC_NSPACE}
\end{compactitemize}

\littleheader{Storage constants}
\refconst{PMIX_STORAGE_MEDIUM_UNKNOWN} \\
\refconst{PMIX_STORAGE_MEDIUM_TAPE} \\
\refconst{PMIX_STORAGE_MEDIUM_HDD} \\
\refconst{PMIX_STORAGE_MEDIUM_SSD} \\
\refconst{PMIX_STORAGE_MEDIUM_NVME} \\
\refconst{PMIX_STORAGE_MEDIUM_PMEM} \\
\refconst{PMIX_STORAGE_MEDIUM_RAM} \\
\refconst{PMIX_STORAGE_ACCESSIBILITY_NODE} \\
\refconst{PMIX_STORAGE_ACCESSIBILITY_SESSION} \\
\refconst{PMIX_STORAGE_ACCESSIBILITY_JOB} \\
\refconst{PMIX_STORAGE_ACCESSIBILITY_RACK} \\
\refconst{PMIX_STORAGE_ACCESSIBILITY_CLUSTER} \\
\refconst{PMIX_STORAGE_ACCESSIBILITY_REMOTE} \\
\refconst{PMIX_STORAGE_PERSISTENCE_TEMPORARY} \\
\refconst{PMIX_STORAGE_PERSISTENCE_NODE} \\
\refconst{PMIX_STORAGE_PERSISTENCE_SESSION} \\
\refconst{PMIX_STORAGE_PERSISTENCE_JOB} \\
\refconst{PMIX_STORAGE_PERSISTENCE_SCRATCH} \\
\refconst{PMIX_STORAGE_PERSISTENCE_PROJECT} \\
\refconst{PMIX_STORAGE_PERSISTENCE_ARCHIVE} \\
\refconst{PMIX_STORAGE_ACCESS_RD} \\
\refconst{PMIX_STORAGE_ACCESS_WR} \\
\refconst{PMIX_STORAGE_ACCESS_RDWR}

\begin{compactitemize}
\item \refconst{PMIX_STORAGE_MEDIUM_UNKNOWN}
\item \refconst{PMIX_STORAGE_MEDIUM_TAPE}
\item \refconst{PMIX_STORAGE_MEDIUM_HDD}
\item \refconst{PMIX_STORAGE_MEDIUM_SSD}
\item \refconst{PMIX_STORAGE_MEDIUM_NVME}
\item \refconst{PMIX_STORAGE_MEDIUM_PMEM}
\item \refconst{PMIX_STORAGE_MEDIUM_RAM}
\item \refconst{PMIX_STORAGE_ACCESSIBILITY_NODE}
\item \refconst{PMIX_STORAGE_ACCESSIBILITY_SESSION}
\item \refconst{PMIX_STORAGE_ACCESSIBILITY_JOB}
\item \refconst{PMIX_STORAGE_ACCESSIBILITY_RACK}
\item \refconst{PMIX_STORAGE_ACCESSIBILITY_CLUSTER}
\item \refconst{PMIX_STORAGE_ACCESSIBILITY_REMOTE}
\item \refconst{PMIX_STORAGE_PERSISTENCE_TEMPORARY}
\item \refconst{PMIX_STORAGE_PERSISTENCE_NODE}
\item \refconst{PMIX_STORAGE_PERSISTENCE_SESSION}
\item \refconst{PMIX_STORAGE_PERSISTENCE_JOB}
\item \refconst{PMIX_STORAGE_PERSISTENCE_SCRATCH}
\item \refconst{PMIX_STORAGE_PERSISTENCE_PROJECT}
\item \refconst{PMIX_STORAGE_PERSISTENCE_ARCHIVE}
\item \refconst{PMIX_STORAGE_ACCESS_RD}
\item \refconst{PMIX_STORAGE_ACCESS_WR}
\item \refconst{PMIX_STORAGE_ACCESS_RDWR}
\end{compactitemize}

\subsection{Added Attributes (Provisional)}

Expand Down
30 changes: 18 additions & 12 deletions Chap_Terms.tex
Expand Up @@ -7,7 +7,7 @@ \chapter{PMIx Terms and Conventions}
In this chapter we describe some common terms and conventions used throughout
this document. The \ac{PMIx} Standard has adopted the widespread use of
key-value \textit{attributes} to add flexibility to the functionality expressed
in the existing \acp{API}. Accordingly, the \ac{ASC} has chosen to require that
in the \acp{API}. Accordingly, the \ac{ASC} has chosen to require that
the definition of each standard \ac{API} include the passing of an array of
attributes. These provide a means of customizing the behavior of the \ac{API}
as future needs emerge without having to alter or create new variants of it. In
Expand All @@ -20,15 +20,22 @@ \chapter{PMIx Terms and Conventions}
The following terminology is used throughout this document:

\begin{itemize}
\item \declareterm{session} refers to a pool of resources with a unique identifier (a.k.a., the \emph{session ID}) assigned by the \ac{WLM} that has been reserved for one or more users. Historically, \ac{HPC} sessions have consisted of a static allocation of resources - e.g., a block of nodes assigned to a user in response to a specific request and managed as a unified collection. However, this is changing in response to the growing use of dynamic programming models that require on-the-fly allocation and release of system resources. Accordingly, the term \emph{session} in this document refers to a potentially dynamic entity, perhaps comprised of resources accumulated as a result of multiple allocation requests that are managed as a single unit by the \ac{WLM}.

\item \declareterm{job} refers to a set of one or more \emph{applications} executed as a single invocation by the user within a session with a unique identifier (a.k.a, the \emph{job ID}) assigned by the \ac{RM} or launcher. For example, the command line ``\textit{mpiexec -n 1 app1 : -n 2 app2}'' generates a single \ac{MPMD} job containing two applications. A user may execute multiple \emph{jobs} within a given session, either sequentially or in parallel.
\item \declareterm{session}
refers to a set of resources assigned by the \ac{WLM} that has been
reserved for one or more users.
A session is identified by a \emph{session ID} that is
unique within the scope of the governing \ac{WLM}.
Historically, \ac{HPC} sessions have consisted of a static allocation of resources - i.e., a block of resources assigned to a user in response to a specific request and managed as a unified collection. However, this is changing in response to the growing use of dynamic programming models that require on-the-fly allocation and release of system resources. Accordingly, the term \emph{session} in this document refers to a potentially dynamic entity, perhaps comprised of resources accumulated as a result of multiple allocation requests that are managed as a single unit by the \ac{WLM}.

\item \declareterm{namespace} refers to a character string value assigned by the \ac{RM} or launcher (e.g., \code{mpiexec}) to a \textit{job}. All \textit{applications} executed as part of that \textit{job} share the same \emph{namespace}. The \emph{namespace} assigned to each \emph{job} must be unique within the scope of the governing \ac{RM} and often is implemented as a string representation of a numerical job ID. The \emph{namespace} and \emph{job} terms will be used interchangeably throughout the document.
\item \declareterm{job} refers to a set of one or more \emph{applications} executed as a single invocation by the user within a session with a unique identifier, the \emph{job ID}, assigned by the \ac{RM} or launcher. For example, the command line ``\textit{mpiexec -n 1 app1 : -n 2 app2}'' generates a single \ac{MPMD} job containing two applications. A user may execute multiple \emph{jobs} within a given session, either sequentially or concurrently.

\item \declareterm{application} refers to a single executable (binary, script, etc.) member of a \emph{job}.
\item \declareterm{namespace} refers to a character string value assigned by the \ac{RM} to a \textit{job}. All \textit{applications} executed as part of that \textit{job} share the same \emph{namespace}. The \emph{namespace} assigned to each \emph{job} must be unique within the scope of the governing \ac{RM} and often is implemented as a string representation of the numerical \emph{job ID}. The \emph{namespace} and \emph{job} terms will be used interchangeably throughout the document.

\item \declareterm{process} refers to an operating system process, also commonly referred to as a \emph{heavyweight} process. A process is often comprised of multiple \emph{lightweight threads}, commonly known as simply \declaretermAlt{threads}{thread}.
\item \declareterm{application} represents a set of identical, but not necessarily unique,
execution contexts within a \emph{job}.

\item \declareterm{process} is assumed for ease of presentation to be an operating system process, also commonly referred to as a \emph{heavyweight} process. A process is often comprised of multiple \emph{lightweight threads}, commonly known as simply \declaretermAlt{threads}{thread}. However, it is not the intent of the \ac{PMIx} Standard to restrict the term process to a particular concept or implementation.

\item \declaretermAlt{client}{clients} refers to a process that was registered with the \ac{PMIx} server prior to being started, and connects to that \ac{PMIx} server via \refapi{PMIx_Init} using its assigned namespace and rank with the information required to connect to that server being provided to the process at time of start of execution.

Expand All @@ -38,7 +45,7 @@ \chapter{PMIx Terms and Conventions}

\item \declaretermAlt{peer}{peers} refers to another process within the same \refterm{job}.

\item \declaretermAlt{workflow}{workflows} refers to an orchestrated execution plan frequently involving multiple \emph{jobs} carried out under the control of a \emph{workflow manager} process. An example workflow might first execute a computational job to generate the flow of liquid through a complex cavity, followed by a visualization job that takes the output of the first job as its input to produce an image output.
\item \declaretermAlt{workflow}{workflows} refers to an orchestrated execution plan typically involving multiple \emph{jobs} carried out under the control of a \emph{workflow manager}. An example workflow might first execute a computational job to generate the flow of liquid through a complex cavity, followed by a visualization job that takes the output of the first job as its input to produce an image output.

\item \declareterm{scheduler} refers to the component of the \ac{SMS} responsible for scheduling of resource allocations. This is also generally referred to as the \emph{system workflow manager} - for the purposes of this document, the \emph{WLM} acronym will be used interchangeably to refer to the scheduler.

Expand Down Expand Up @@ -66,7 +73,6 @@ \chapter{PMIx Terms and Conventions}

\end{itemize}


The following sections provide an overview of the conventions used throughout the \ac{PMIx} Standard document.

%%%%%%%%%%%
Expand Down Expand Up @@ -128,8 +134,9 @@ \section{Naming Conventions}
The \ac{PMIx} standard has adopted the following conventions:

\begin{itemize}
\item \ac{PMIx} constants and attributes are prefixed with \textbf{\code{PMIX}}.
\item Structures and type definitions are prefixed with \code{pmix}.
\item \ac{PMIx} constants and attributes are prefixed with \textbf{\code{"PMIX_"}}.
\item Structures and type definitions are prefixed with \code{"pmix_"}.
\item The string representations of attributes are prefixed with \code{"pmix"}.
\item Underscores are used to separate words in a function or variable name.
\item Lowercase letters are used in \ac{PMIx} client \acp{API} except for the \ac{PMIx} prefix (noted below) and the first letter of the word following it.
For example, \refapi{PMIx_Get_version}.
Expand All @@ -138,9 +145,8 @@ \section{Naming Conventions}
\item The \code{pmix_} prefix is used to denote function pointer and type definitions.
\end{itemize}

Users should not use the \textbf{\code{"PMIX"}}, \textbf{\code{"PMIx"}}, or \textbf{\code{"pmix"}} prefixes in their applications or libraries so as to avoid symbol conflicts with current and later versions of the \ac{PMIx} Standard.
Users shall not use the \textbf{\code{"PMIX_"}}, \textbf{\code{"PMIx_"}}, or \textbf{\code{"pmix_"}} prefixes for symbols in their code so as to avoid symbol conflicts with \ac{PMIx} implementations.
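
A project could enforce the prefix rule above mechanically, for instance in a lint step over its exported symbols. The following C sketch shows one way to do that check. The function name \code{uses_reserved_pmix_prefix} is hypothetical and is not part of any \ac{PMIx} implementation; only the three prefixes are taken from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Prefixes reserved for PMIx symbols, as listed in the naming conventions. */
static const char *const reserved_prefixes[] = {"PMIX_", "PMIx_", "pmix_"};

/* Hypothetical helper: returns true if `symbol` begins with one of the
 * reserved prefixes and should therefore not be used in user code. */
bool uses_reserved_pmix_prefix(const char *symbol)
{
    assert(symbol != NULL);
    size_t count = sizeof(reserved_prefixes) / sizeof(reserved_prefixes[0]);
    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(reserved_prefixes[i]);
        /* Case-sensitive comparison of the leading `len` characters only. */
        if (strncmp(symbol, reserved_prefixes[i], len) == 0) {
            return true;
        }
    }
    return false;
}
```

The comparison is case-sensitive and anchored at the start of the name, so a symbol such as \code{myapp_pmix_wrapper} is accepted while \code{pmix_my_type_t} is flagged.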

%%%%%%%%%%%
\section{Procedure Conventions}

While the current \acp{API} are based on the C programming language, it is not the intent of the \ac{PMIx} Standard to preclude the use of other languages.
Expand Down
Binary file added figs/mpi-sessions1.png