Issue 175 chapter 5 8 rebase #387

Closed
wants to merge 22 commits into from
54c4d64
more information/clarify around when to PMIx_Get return values have t…
dsolt Nov 9, 2020
e2e261c
Add text that was written prior to V4.0
dsolt Nov 9, 2020
7a44d9d
Major reorg start
dsolt Nov 9, 2020
f6e76ba
Add an introduction
dsolt Dec 17, 2020
e01ea71
Add new chapters
dsolt Feb 1, 2021
5087a34
After working group review and fix up the labels and references
dsolt Feb 2, 2021
02a21cc
Fix up labels
dsolt Mar 19, 2021
5623b54
Apply all changes to Chap_API_Sync_Access.tex that have been made to …
dsolt Mar 19, 2021
e323c59
Updates with new clear distinction between reserved and non-reserved …
dsolt Mar 22, 2021
4e4a269
Fix some labels and another rebase issue
dsolt Mar 23, 2021
22b6aff
Update Reserved keys text in chapter 6
dsolt May 24, 2021
9285349
Get rid of duplicated period and add a TODO note
dsolt Nov 5, 2021
4c57e3d
Add commit to chapter that was renamed and rebase couldn't find
dsolt Nov 5, 2021
4667701
PMIX_LOCAL_PROCS moved back to the node-level attributes since it doe…
dsolt Nov 8, 2021
d0b3f18
First pass at re-organizing query chapter
dsolt Nov 9, 2021
09b9536
Clarify language around retrieval rules for non-reserved keys
jjhursey Apr 5, 2021
c6c9ddb
Update new chapter with return-codes changes
dsolt Dec 10, 2021
9764d17
More feedback on Sync chapter
dsolt Dec 24, 2021
90bc960
explain that some keys are used as keys and attributes. Add PMIX_NSP…
dsolt Dec 24, 2021
15848a5
Remove accidental placeholder
dsolt Jan 14, 2022
2472a1e
spacing
dsolt Jan 24, 2022
418c3e8
Fix up 2 places where we added additional text to present a list resu…
dsolt Jan 24, 2022
12 changes: 6 additions & 6 deletions Chap_API_Fabric.tex
@@ -32,7 +32,7 @@ \chapter{Fabric Support Definitions}

\end{enumerate}

Information on fabric coordinates, endpoints, and device distances are provided as \emph{reserved keys} as detailed in Chapter \ref{chap:api_rsvd_keys} - i.e., they are to be available at client start of execution and are subject to the retrieval rules of Section \ref{chap:rkeys:retrules}. Examples for retrieving fabric-related information include retrieval of:
Information on fabric coordinates, endpoints, and device distances are provided as \emph{reserved keys} as detailed in Chapter \ref{chap:api_rsvd_keys} - i.e., they are to be available at client start of execution and are subject to the retrieval rules of Section \ref{chap:api_rsvd_keys:retrules}. Examples for retrieving fabric-related information include retrieval of:

\begin{itemize}
\item An array of information on fabric devices for a node by passing \refattr{PMIX_FABRIC_DEVICES} as the key to \refapi{PMIx_Get} along with the \refattr{PMIX_HOSTNAME} of the node as a directive
@@ -202,7 +202,7 @@ \subsection{Fabric Coordinate Structure}
Note that the \refstruct{pmix_coord_t} structure does not imply nor mandate any requirement on how the coordinate data is to be stored within the \ac{PMIx} library. Implementers are free to store the coordinate in whatever format they choose.
\adviceimplend

A fabric coordinate is associated with a given fabric device and must be unique within a given view. Fabric devices are associated with the operating system which hosts them - thus, fabric coordinates are logically grouped within the \emph{node} realm (as described in Section \ref{api:struct:attributes:retrieval}) and can be retrieved per the rules detailed in Section \ref{chap:res:nrealm}.
A fabric coordinate is associated with a given fabric device and must be unique within a given view. Fabric devices are associated with the operating system which hosts them - thus, fabric coordinates are logically grouped within the \emph{node} realm (as described in Section \ref{api:struct:attributes:retrieval}) and can be retrieved per the rules detailed in Section \ref{chap:api_rsvd_keys:nrealm}.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Fabric coordinate support macros}
@@ -291,7 +291,7 @@ \subsection{Fabric Geometry Structure}
Note that the \refstruct{pmix_coord_t} structure does not imply nor mandate any requirement on how the coordinate data is to be stored within the \ac{PMIx} library. Implementers are free to store the coordinate in whatever format they choose.
\adviceimplend

A fabric coordinate is associated with a given fabric device and must be unique within a given view. Fabric devices are associated with the operating system which hosts them - thus, fabric coordinates are logically grouped within the \emph{node} realm (as described in Section \ref{api:struct:attributes:retrieval}) and can be retrieved per the rules detailed in Section \ref{chap:res:nrealm}.
A fabric coordinate is associated with a given fabric device and must be unique within a given view. Fabric devices are associated with the operating system which hosts them - thus, fabric coordinates are logically grouped within the \emph{node} realm (as described in Section \ref{api:struct:attributes:retrieval}) and can be retrieved per the rules detailed in Section \ref{chap:api_rsvd_keys:nrealm}.


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@@ -570,7 +570,7 @@ \section{Fabric Support Attributes}
%
%
\vspace{\baselineskip}
The following attributes are related to the \emph{node realm} (as described in Section \ref{chap:res:nrealm}) and are retrieved according to those rules.
The following attributes are related to the \emph{node realm} (as described in Section \ref{chap:api_rsvd_keys:nrealm}) and are retrieved according to those rules.

%
\declareAttributeNEW{PMIX_FABRIC_DEVICES}{"pmix.fab.devs"}{pmix_data_array_t}{
@@ -642,15 +642,15 @@ \section{Fabric Support Attributes}
}
%
\vspace{\baselineskip}
The following attributes are related to the \emph{process realm} (as described in Section \ref{chap:res:prealm}) and are retrieved according to those rules.
The following attributes are related to the \emph{process realm} (as described in Section \ref{chap:api_rsvd_keys:prealm}) and are retrieved according to those rules.

%
\declareAttributeNEW{PMIX_FABRIC_ENDPT}{"pmix.fab.endpt"}{pmix_data_array_t}{
Fabric endpoints for a specified process. As multiple endpoints may be assigned to a given process (e.g., in the case where multiple devices are associated with a package to which the process is bound), the returned values will be provided in a \refstruct{pmix_data_array_t} of \refstruct{pmix_endpoint_t} elements.
}
%
\vspace{\baselineskip}
The following attributes are related to the \emph{job realm} (as described in Section \ref{chap:res:jrealm}) and are retrieved according to those rules. Note that distances to fabric devices are retrieved using the \refattr{PMIX_DEVICE_DISTANCES} key with the appropriate \refstruct{pmix_device_type_t} qualifier.
The following attributes are related to the \emph{job realm} (as described in Section \ref{chap:api_rsvd_keys:jrealm}) and are retrieved according to those rules. Note that distances to fabric devices are retrieved using the \refattr{PMIX_DEVICE_DISTANCES} key with the appropriate \refstruct{pmix_device_type_t} qualifier.

%
\declareAttributeNEW{PMIX_SWITCH_PEERS}{"pmix.speers"}{pmix_data_array_t}{
130 changes: 87 additions & 43 deletions Chap_API_NonReserved_Keys.tex
@@ -202,74 +202,118 @@ \section{Retrieval rules for non-reserved keys}
following precedence search:

\begin{enumerate}
\item If the \refattr{PMIX_GET_REFRESH_CACHE} attribute is given, then the
request is first forwarded to the local \ac{PMIx} server which will then
update the client's cache. Note that this may not, depending upon
implementation details, result in any action.

\item Check the local \ac{PMIx} client cache for the requested key - if not found and either the \refattr{PMIX_OPTIONAL} or \refattr{PMIX_GET_REFRESH_CACHE} attribute was given, the search will stop at this point and return the \refconst{PMIX_ERR_NOT_FOUND} status.

\item Request the information from the local \ac{PMIx} server. The server
will check its cache for the specified key within the appropriate scope as
defined by the process that originally posted the key. If the value exists
in a scope that contains the requesting process, then the value shall be
returned. If the value exists, but in a scope that excludes the requesting
process, then the server shall immediately return the

\item \textbf{Refresh the local \ac{PMIx} client cache, if requested.}\\
If the \refattr{PMIX_GET_REFRESH_CACHE} attribute is given to \refapi{PMIx_Get}
then the \ac{PMIx} client library will request and wait for a refresh of the local
\ac{PMIx} client cache from the local \ac{PMIx} server. The local \ac{PMIx} server
must ensure the latest key/value data from the specified process is in the local
\ac{PMIx} client cache before proceeding to the next step. If it cannot refresh
the local \ac{PMIx} client cache then it must return an error.
The \refattr{PMIX_GET_REFRESH_CACHE} attribute is helpful when accessing a
non-reserved key that is known to have its value changed over the life of the
program (e.g., an increment to an epoch value).

\item \textbf{Search the local \ac{PMIx} client cache for the requested key.}\\
If the key is found the search stops here and the value is returned to the caller.
If the key is not found and either the \refattr{PMIX_OPTIONAL} or
\refattr{PMIX_GET_REFRESH_CACHE} attribute was given to \refapi{PMIx_Get},
then the search will stop at this point and return the
\refconst{PMIX_ERR_NOT_FOUND} status. If the key is not found and neither of
those attributes are provided then the \refapi{PMIx_Get} proceeds to the next step.

\item \textbf{Request and wait for the information from the local \ac{PMIx} server.}\\
The server will check the local \ac{PMIx} server cache for the specified key
within the appropriate scope (as defined by the process that originally posted
the key). If the value exists in a scope that contains the requesting process,
then the value shall be returned. If the value exists, but in a scope that
excludes the requesting process, then the server shall immediately return the
\refconst{PMIX_ERR_EXISTS_OUTSIDE_SCOPE}.
% Future work: Can an implementation choose to return PMIX_ERR_NOT_FOUND instead
% of PMIX_ERR_EXISTS_OUTSIDE_SCOPE as an optimization. Otherwise this could require
% a more extensive search depending on how the data is organized.

If the value still isn't found and the \refattr{PMIX_IMMEDIATE} attribute
If the value still is not found and the \refattr{PMIX_IMMEDIATE} attribute
was given, then the library shall return the \refconst{PMIX_ERR_NOT_FOUND}
error constant to the requester. Otherwise, the \ac{PMIx} server library
will take one of the following actions:
will take one of the following actions to find the key:
\begin{compactitemize}
\item If the target process has a rank of \refconst{PMIX_RANK_UNDEF},
\item \textbf{If the request is for globally unique data not associated with a process:}\\
If the target process has a rank of \refconst{PMIX_RANK_UNDEF},
then this indicates that the key being requested is globally unique
and \emph{not} associated with a specific process. In this case, the
server shall hold the request until either the data appears at the
server or, if given, the \refattr{PMIX_TIMEOUT} is reached. In the
latter case, the server will return the \refconst{PMIX_ERR_TIMEOUT}
server or, if given, the \refattr{PMIX_TIMEOUT} is reached by using the
\refapi{PMIx_server_dmodex_request} function. In the
latter case, after the timeout has been reached without the arrival of
the specified data the server will return the \refconst{PMIX_ERR_TIMEOUT}
status. Note that the server may, depending on \ac{PMIx}
implementation, never respond if the caller failed to specify a
\refattr{PMIX_TIMEOUT} and the requested key fails to arrive at the
server.

\item If the target process is \emph{local} (i.e., attached to the

Note that there is no mechanism by which a \ac{PMIx} client can specify
\refconst{PMIX_RANK_UNDEF} with a key/value pair (i.e., via a \refapi{PMIx_Put}).
However, you can call \refapi{PMIx_Get} with \refconst{PMIX_RANK_UNDEF}.
This will result in the \refapi{PMIx_Get} operation searching all processes
in the specified namespace within the scope for the key. The application is
responsible to guarantee uniqueness of the keys.
% Note: This accommodation is not recommended best practice, but was put in
% place for those clients moving from PMI1/PMI2 which behave in this (likely)
% non-scalable way.

\item \textbf{If the request is for data from a \emph{local} process:}\\
If the target process is \emph{local} (i.e., connected to the
same \ac{PMIx} server), then the server will hold the request until
either the target process provides the data or, if given, the
\refattr{PMIX_TIMEOUT} is reached. In the latter case, the server will
\refattr{PMIX_TIMEOUT} is reached. In the latter case, after the timeout
has been reached without the arrival of the specified data the server will
return the \refconst{PMIX_ERR_TIMEOUT} status. Note that data which is
posted via \refapi{PMIx_Put} but not staged with \refapi{PMIx_Commit}
may, depending upon implementation, never appear at the server.

\item If the target process is \emph{remote} (i.e., not attached to
the same \ac{PMIx} server), the server will either:
posted via \refapi{PMIx_Put} may never appear at the \ac{PMIx} server
if there is no subsequent call to \refapi{PMIx_Commit}, depending
upon the \ac{PMIx} implementation.

\item \textbf{If the request is for data from a \emph{remote} process:}\\
If the target process is \emph{remote} (i.e., not connected to
the same \ac{PMIx} server), the \ac{PMIx} server library will attempt to
initiate a \emph{direct modex} request from the local \ac{RM} daemon to
the remote \ac{RM} daemon where that process resides.
The \ac{PMIx} server library will either:
\begin{compactitemize}
\item If the host has provided the
\item If the local \ac{RM} daemon does not support the
\refapi{pmix_server_dmodex_req_fn_t} interface, then
the \ac{PMIx} server-side library will immediately respond to the
\ac{PMIx} client with the \refconst{PMIX_ERR_NOT_FOUND} status.

\item If the local \ac{RM} daemon has provided the
\refapi{pmix_server_dmodex_req_fn_t} module function
interface, then the server
shall pass the request to its host for servicing. The host is
interface, then the \ac{PMIx} server-side library
shall pass the request to the local \ac{RM} daemon for servicing.
The local \ac{RM} daemon is
responsible for determining the location of the target process and
passing the request to the \ac{PMIx} server at that location.
passing the request to the \ac{RM} daemon at that remote location.

When the remote data request is received, the targeted remote
\ac{RM} daemon will check its cache for the specified key by calling
\refapi{PMIx_server_dmodex_request}. If the key is present
then the targeted remote \ac{RM} daemon will send the data to the
originating \ac{RM} daemon. The originating \ac{RM} daemon will then
pass the data into the completion callback (\refapi{pmix_modex_cbfunc_t})
for the \refapi{pmix_server_dmodex_req_fn_t}.

When the remote data request is received, the target \ac{PMIx}
server will check its cache for the specified key. If the key is
not present, the request shall be held until either the target
If the key is not present, the request shall be held until either the target
process provides the data or, if given, the \refattr{PMIX_TIMEOUT}
is reached. In the latter case, the server will return the
\refconst{PMIX_ERR_TIMEOUT} status. The host shall convey the
result back to the originating \ac{PMIx} server, which will reply
to the requesting client with the result of the request when the
host provides it.
is reached. In the latter case, the \ac{PMIx} server will return the
\refconst{PMIX_ERR_TIMEOUT} status. The targeted remote \ac{RM} daemon
shall convey the result back to the originating \ac{RM} daemon,
which will reply to the requesting \ac{PMIx} client with the result
of the request.

Note that the target server may, depending on \ac{PMIx}
Note that the target \ac{RM} daemon may, depending on \ac{PMIx}
implementation, never respond if the caller failed to specify a
\refattr{PMIX_TIMEOUT} and the target process fails to post the
requested key.

\item if the host does not support the
\refapi{pmix_server_dmodex_req_fn_t} interface, then
the server will immediately respond to the client with the
\refconst{PMIX_ERR_NOT_FOUND} status
\end{compactitemize}
\end{compactitemize}
\end{enumerate}
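
Reviewer sketch (not part of the patch): the precedence search in the revised text can be modeled as a small decision function, which may help when checking that the three steps and the sub-cases of step 3 are unambiguous. Everything below is a hypothetical model written for illustration. The types `get_request` and `world_state`, the function `resolve_get`, and the status names are assumptions and are not PMIx API. `GET_ERR_REFRESH_FAILED` stands in for the unspecified error in step 1, and `GET_NO_RESPONSE` stands in for the "may never respond" case. The model also simplifies by treating late-arriving data as always being in scope.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical status codes standing in for the constants named in the text. */
typedef enum {
    GET_SUCCESS,
    GET_ERR_NOT_FOUND,            /* models PMIX_ERR_NOT_FOUND */
    GET_ERR_EXISTS_OUTSIDE_SCOPE, /* models PMIX_ERR_EXISTS_OUTSIDE_SCOPE */
    GET_ERR_TIMEOUT,              /* models PMIX_ERR_TIMEOUT */
    GET_ERR_REFRESH_FAILED,       /* step 1: the text only says "an error" */
    GET_NO_RESPONSE               /* request held with no timeout, data never arrives */
} get_status;

/* Attributes passed with the request (assumed flags, one per attribute). */
typedef struct {
    bool refresh_cache; /* PMIX_GET_REFRESH_CACHE given */
    bool optional;      /* PMIX_OPTIONAL given */
    bool immediate;     /* PMIX_IMMEDIATE given */
    bool has_timeout;   /* PMIX_TIMEOUT given */
} get_request;

/* Assumed snapshot of where the key currently lives. */
typedef struct {
    bool refresh_ok;           /* server can refresh the client cache */
    bool in_client_cache;      /* key in client cache (after any refresh) */
    bool in_server_cache;      /* key in local server cache */
    bool requester_in_scope;   /* posted scope includes the requester */
    bool data_arrives_later;   /* data shows up while the request is held */
    bool target_is_remote;     /* target is not on the same server */
    bool host_supports_dmodex; /* host provided a direct modex callback */
} world_state;

get_status resolve_get(const get_request *rq, const world_state *ws)
{
    /* Step 1: refresh the client cache if requested; failure is an error. */
    if (rq->refresh_cache && !ws->refresh_ok)
        return GET_ERR_REFRESH_FAILED;

    /* Step 2: search the client cache; some attributes stop the search here. */
    if (ws->in_client_cache)
        return GET_SUCCESS;
    if (rq->optional || rq->refresh_cache)
        return GET_ERR_NOT_FOUND;

    /* Step 3: ask the local server, honoring the scope of the posted key. */
    if (ws->in_server_cache)
        return ws->requester_in_scope ? GET_SUCCESS
                                      : GET_ERR_EXISTS_OUTSIDE_SCOPE;
    if (rq->immediate)
        return GET_ERR_NOT_FOUND;

    /* Remote target without direct modex support: immediate not-found. */
    if (ws->target_is_remote && !ws->host_supports_dmodex)
        return GET_ERR_NOT_FOUND;

    /* Otherwise the request is held until data arrives or the timeout fires. */
    if (ws->data_arrives_later)
        return GET_SUCCESS;
    return rq->has_timeout ? GET_ERR_TIMEOUT : GET_NO_RESPONSE;
}

/* Two spot checks of the model against the rules above. */
void resolve_get_demo(void)
{
    /* Key missing everywhere and PMIX_IMMEDIATE given: fail at the server. */
    get_request rq = { .immediate = true };
    world_state ws = { .refresh_ok = true };
    assert(resolve_get(&rq, &ws) == GET_ERR_NOT_FOUND);

    /* Key at the server, but posted in a scope that excludes the requester. */
    rq = (get_request){ 0 };
    ws = (world_state){ .in_server_cache = true };
    assert(resolve_get(&rq, &ws) == GET_ERR_EXISTS_OUTSIDE_SCOPE);
}
```

One consequence the model makes visible: when `PMIX_GET_REFRESH_CACHE` is given, the search never reaches step 3, so a refresh request behaves like `PMIX_OPTIONAL` after the cache update.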
3 changes: 2 additions & 1 deletion Chap_API_Publish.tex
@@ -4,7 +4,8 @@
\chapter{Publish/Lookup Operations}
\label{chap:pub}

Chapter~\ref{chap:api_rsvd_keys} and Chapter~\ref{chap:nrkeys} discussed how reserved and non-reserved keys dealt with
Chapter~\ref{chap:api_rsvd_keys} and Chapter~\ref{chap:data_sharing:non_rsvd_keys}
discussed how reserved and non-reserved keys dealt with
information that either was associated with a specific process (i.e., the
retrieving process knew the identifier of the process that posted it) or
required a synchronization operation prior to retrieval (e.g., the case of