sessions_cheat_sheet
This wiki gives an overview of the MPI Sessions API as currently proposed, and how to use Sessions to create communicators and hence do all the great things that the current MPI API allows one to do, once one has a communicator.
The figure below illustrates the general scheme for using an MPI Session:
- First, one creates a Session using MPI_Session_init, which returns a Session handle to the application;
- This Session handle can then be used to query the runtime using the MPI_Session_get_names method. This returns a NULL-terminated array of strings representing the names of available process sets. Currently the WG is favoring a URI-style format, e.g. mpi://WORLD being the name of the process set corresponding to MPI_COMM_WORLD (in a pre-Sessions MPI world);
- An MPI group can be instantiated from a Session handle and one of the process set names returned from MPI_Session_get_names using the MPI_Group_create_from_session method;
- The application can then use the MPI_Comm_create_from_group method to obtain a communicator.
Process sets are the mechanism for MPI applications to query the runtime. Each process set has a unique set name. In the current scheme, set names have a URI format. Two process sets are mandated:
mpi://WORLD
mpi://SELF
and maybe
mpi://UNIVERSE
Many additional process sets may be defined by the runtime, e.g.
location://rack/19
network://leaf-switch/37
arch://x86_64
application://redis-server/5
The mechanisms for defining process sets, and the way system resources are assigned to these sets, are currently assumed to be dependent on the runtime implementation.
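Since set names follow a URI-style format, an application may want to group or filter them by scheme (mpi, location, network, and so on). The sketch below shows one way this could be done in plain C. The helper pset_name_split is hypothetical and is not part of the proposed Sessions API; it also assumes that every set name contains a "://" separator, which the proposal does not state explicitly.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of any MPI API): splits a URI-style
 * process set name such as "mpi://WORLD" into its scheme ("mpi") and
 * its path ("WORLD").
 *
 * Returns 0 on success, or -1 if an argument is NULL, the name has no
 * "://" separator, the scheme is empty, or an output buffer is too small.
 */
int pset_name_split(const char *name,
                    char *scheme, size_t scheme_len,
                    char *path, size_t path_len)
{
    const char *sep;
    size_t n;

    if (name == NULL || scheme == NULL || path == NULL)
        return -1;

    /* Locate the first "://" and reject names with an empty scheme. */
    sep = strstr(name, "://");
    if (sep == NULL || sep == name)
        return -1;

    /* Copy the scheme, leaving room for the terminating NUL. */
    n = (size_t)(sep - name);
    if (n + 1 > scheme_len)
        return -1;
    memcpy(scheme, name, n);
    scheme[n] = '\0';

    /* Everything after "://" is treated as the path. */
    sep += 3;
    n = strlen(sep);
    if (n + 1 > path_len)
        return -1;
    memcpy(path, sep, n + 1);   /* n + 1 also copies the terminating NUL */

    return 0;
}
```

For example, calling pset_name_split on "location://rack/19" with sufficiently large buffers would yield the scheme "location" and the path "rack/19", which an application could use to select only location-based sets.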
A process set caches key/value tuples, which an application can access by calling MPI_Session_get_info and then querying the returned info object with the existing MPI info object methods. The size key is mandatory for all process sets.
This is a whole different discussion and will be written up in a separate wiki. [Spoiler - needed: no, supported: yes.]
This is also a whole different discussion and will be written up in a separate wiki.
MPI_Session_init(
INOUT MPI_Flags *flags,
IN MPI_Info info,
IN MPI_Errhandler errhandler, // we need something else here, since error handlers are now attached to specific object types
OUT MPI_Session *session)
This function initializes a Session and returns the associated Session handle. The flags argument is currently thought of as a place where the application can request capabilities it would like to have for MPI objects associated with the Session, and as output what the implementation can provide. Right now this is what we have for possible flags:
MPI_FLAG_THREAD_NONCONCURRENT_SINGLE
MPI_FLAG_THREAD_NONCONCURRENT_FUNNELED
MPI_FLAG_THREAD_NONCONCURRENT_SERIALIZED
MPI_FLAG_THREAD_CONCURRENT
The info argument can be used for specifying the level of thread safety required for the Session, and possibly other MPI implementation-specific resource and functionality requirements. The errhandler argument specifies an error handler to invoke in the event that the Session initialization call encounters an error. Session initialization is intended to be a lightweight operation. A single process may initialize multiple Sessions. MPI_Session_init is always thread safe; multiple threads within an application may invoke it concurrently.
MPI_Session_finalize(
INOUT MPI_Session *session)
This function is the Session equivalent of MPI_Finalize. It can block waiting for destruction of objects derived from the Session handle. Every initialized Session must be finalized using MPI_Session_finalize.
MPI_Session_get_names(
IN MPI_Session session,
OUT char **set_names)
This function is used to query the runtime for the names of available process sets. The names are returned as a NULL-terminated array of strings. The caller is responsible for freeing the returned array of strings.
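Walking and releasing a NULL-terminated array of strings is easy to get subtly wrong, so a minimal sketch in plain C may help. The helpers set_names_count and set_names_free are hypothetical and not part of the proposal. The sketch also assumes that both the array and each string in it were allocated with malloc (or a compatible allocator); the text above only says that the caller frees the array, so an actual implementation might require something different.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: counts the entries of a NULL-terminated array
 * of strings, such as the one described for MPI_Session_get_names.
 * A NULL array is treated as empty.
 */
size_t set_names_count(char *const *set_names)
{
    size_t n = 0;

    if (set_names == NULL)
        return 0;
    while (set_names[n] != NULL)
        n++;
    return n;
}

/*
 * Hypothetical helper: frees every string and then the array itself.
 * ASSUMPTION: the array and each string were allocated with malloc or
 * a compatible allocator; the proposal does not specify this.
 */
void set_names_free(char **set_names)
{
    size_t i;

    if (set_names == NULL)
        return;
    for (i = 0; set_names[i] != NULL; i++)
        free(set_names[i]);
    free(set_names);
}
```

An application would typically call set_names_count right after obtaining the array, iterate over the names it is interested in, and call set_names_free once it no longer needs any of the strings.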
MPI_Session_get_info(
IN MPI_Session session,
IN const char *set_name,
OUT MPI_Info *info)
This function is used to query properties of a specific process set. The returned info object can in turn be queried with the existing MPI info object query functions.
MPI_Group_create_from_session_name(
IN MPI_Session session,
IN const char *set_name,
OUT MPI_Group *group);
This function can be used to create an MPI group given an input Session handle and a set name. The existing MPI_Comm_create_group function may subsequently be used to create an MPI communicator.
MPI_Create_comm_from_group(
IN MPI_Group group,
IN const char *uri, // for matching
IN MPI_Info info,
IN MPI_Errhandler errhandler,
OUT MPI_Comm *comm)
This function is proposed as an alternative way to create an MPI communicator from an MPI group. The uri argument acts as a tag that allows the MPI implementation to discriminate between potentially concurrent calls by the application to create multiple MPI communicators using the same supplied group. The function also allows for an alternate errhandler to be invoked if the MPI_Create_comm_from_group method encounters an internal error. Communicators derived from a static process set will have the same local rank, regardless of the Session with which the communicator is associated.
Additional functions for creating communicators from MPI groups that the WG is proposing are:
MPI_Create_cart_comm_from_group(
IN MPI_Group group,
IN const char *uri,
IN MPI_Info info,
IN MPI_Errhandler errhandler,
IN int ndims,
IN const int dims[],
IN const int periods[],
IN int reorder,
OUT MPI_Comm *comm)
MPI_Create_graph_comm_from_group(…)
MPI_Create_dist_graph_comm_from_group(…)
MPI_Create_dist_graph_adjacent_comm_from_group(…)
and
MPI_Create_intercomm_from_group(
IN MPI_Group local_group,
IN int local_leader,
IN MPI_Group remote_group,
IN int remote_leader,
IN const char *uri,
IN MPI_Info info,
IN MPI_Errhandler errhandler,
OUT MPI_Comm *comm)
But wait, there's more. While the WG was at it, the following were also thrown into the mix:
MPI_Create_win_from_group(
IN MPI_Group group,
IN void *base,
IN MPI_Aint size,
IN int disp_unit,
IN const char *uri,
IN MPI_Info info,
IN MPI_Errhandler errhandler, // do we want this?
OUT MPI_Win *win)
and
MPI_Create_file_from_group(
IN MPI_Group group,
IN const char *filename,
IN int amode,
IN const char *uri, // necessary/desirable?
IN MPI_Info info,
IN MPI_Errhandler errhandler, // do we want this?
OUT MPI_File *file)
Maybe all of these new create-from-group functions could be handled as a separate proposal.
Within a single MPI process:
- Objects derived from Session A cannot be used to communicate with objects derived from Session B
- Requests from different Sessions cannot be passed together in a single call to the array TEST/WAIT functions