Detect OpenMPI Use (CrayLabs#186)
Adds new run settings classes for using different
OpenMPI executables (mpiexec, orterun). Because
the executables share runtime arguments with mpirun,
much of the functionality has been transferred to a base class
that all OpenMPI run settings inherit from. Adds tests
and docs updates to reflect the changes.

[ committed by @MattToast  ]
[ reviewed by @Spartee @al-rigazzi ]
MattToast authored and al-rigazzi committed May 16, 2022
1 parent 8858465 commit 82209d0
Showing 15 changed files with 390 additions and 53 deletions.
58 changes: 57 additions & 1 deletion doc/api/smartsim_api.rst
@@ -54,6 +54,8 @@ Types of Settings:
SrunSettings
AprunSettings
MpirunSettings
MpiexecSettings
OrterunSettings
JsrunSettings
SbatchSettings
QsubBatchSettings
@@ -186,7 +188,7 @@ and within batch launches (i.e. ``BsubBatchSettings``)
MpirunSettings
--------------

.. _openmpi_api:
.. _openmpi_run_api:

``MpirunSettings`` are for launching with OpenMPI. ``MpirunSettings`` are
supported on Slurm, PBSpro, and Cobalt.
@@ -210,6 +212,60 @@ supported on Slurm, PBSpro, and Cobalt.
:members:


MpiexecSettings
---------------

.. _openmpi_exec_api:

``MpiexecSettings`` are for launching with OpenMPI's ``mpiexec``. ``MpiexecSettings`` are
supported on Slurm, PBSpro, and Cobalt.


.. autosummary::

MpiexecSettings.set_cpus_per_task
MpiexecSettings.set_hostlist
MpiexecSettings.set_tasks
MpiexecSettings.set_task_map
MpiexecSettings.make_mpmd
MpiexecSettings.add_exe_args
MpiexecSettings.format_run_args
MpiexecSettings.format_env_vars
MpiexecSettings.update_env

.. autoclass:: MpiexecSettings
:inherited-members:
:undoc-members:
:members:


OrterunSettings
---------------

.. _openmpi_orte_api:

``OrterunSettings`` are for launching with OpenMPI's ``orterun``. ``OrterunSettings`` are
supported on Slurm, PBSpro, and Cobalt.


.. autosummary::

OrterunSettings.set_cpus_per_task
OrterunSettings.set_hostlist
OrterunSettings.set_tasks
OrterunSettings.set_task_map
OrterunSettings.make_mpmd
OrterunSettings.add_exe_args
OrterunSettings.format_run_args
OrterunSettings.format_env_vars
OrterunSettings.update_env

.. autoclass:: OrterunSettings
:inherited-members:
:undoc-members:
:members:


------------------------------------------


2 changes: 1 addition & 1 deletion doc/experiment.rst
@@ -39,7 +39,7 @@ Each launcher supports specific types of ``RunSettings``.

- :ref:`SrunSettings <srun_api>` for Slurm
- :ref:`AprunSettings <aprun_api>` for PBSPro and Cobalt
- :ref:`MpirunSettings <openmpi_api>` for OpenMPI with `mpirun` on PBSPro, Cobalt, LSF, and Slurm
- :ref:`MpirunSettings <openmpi_run_api>` for OpenMPI with `mpirun` on PBSPro, Cobalt, LSF, and Slurm
- :ref:`JsrunSettings <jsrun_api>` for LSF

These settings can be manually specified by the user, or auto-detected by the
20 changes: 12 additions & 8 deletions doc/launchers.rst
@@ -95,9 +95,10 @@ To use the Slurm launcher, specify at ``Experiment`` initialization:
Running on Slurm
----------------

The Slurm launcher supports two types of ``RunSettings``:
The Slurm launcher supports three types of ``RunSettings``:
1. :ref:`SrunSettings <srun_api>`
2. :ref:`MpirunSettings <openmpi_api>`
2. :ref:`MpirunSettings <openmpi_run_api>`
3. :ref:`MpiexecSettings <openmpi_exec_api>`

As well as batch settings for ``sbatch`` through:
1. :ref:`SbatchSettings <sbatch_api>`
@@ -204,9 +205,10 @@ To use the PBSpro launcher, specify at ``Experiment`` initialization:
Running on PBSpro
-----------------

The PBSpro launcher supports two types of ``RunSettings``:
The PBSpro launcher supports three types of ``RunSettings``:
1. :ref:`AprunSettings <aprun_api>`
2. :ref:`MpirunSettings <openmpi_api>`
2. :ref:`MpirunSettings <openmpi_run_api>`
3. :ref:`MpiexecSettings <openmpi_exec_api>`

As well as batch settings for ``qsub`` through:
1. :ref:`QsubBatchSettings <qsub_api>`
@@ -235,9 +237,10 @@ To use the Cobalt launcher, specify at ``Experiment`` initialization:
Running on Cobalt
-----------------

The Cobalt launcher supports two types of ``RunSettings``:
The Cobalt launcher supports three types of ``RunSettings``:
1. :ref:`AprunSettings <aprun_api>`
2. :ref:`MpirunSettings <openmpi_api>`
2. :ref:`MpirunSettings <openmpi_run_api>`
3. :ref:`MpiexecSettings <openmpi_exec_api>`

As well as batch settings for ``qsub`` through:
1. :ref:`CobaltBatchSettings <cqsub_api>`
@@ -266,9 +269,10 @@ To use the LSF launcher, specify at ``Experiment`` initialization:
Running on LSF
--------------

The LSF launcher supports two types of ``RunSettings``:
The LSF launcher supports three types of ``RunSettings``:
1. :ref:`JsrunSettings <jsrun_api>`
2. :ref:`MpirunSettings <openmpi_api>`
2. :ref:`MpirunSettings <openmpi_run_api>`
3. :ref:`MpiexecSettings <openmpi_exec_api>`

As well as batch settings for ``bsub`` through:
1. :ref:`BsubBatchSettings <bsub_api>`
4 changes: 3 additions & 1 deletion smartsim/settings/__init__.py
@@ -2,7 +2,7 @@
from .base import RunSettings
from .cobaltSettings import CobaltBatchSettings
from .lsfSettings import BsubBatchSettings, JsrunSettings
from .mpirunSettings import MpirunSettings
from .mpirunSettings import MpirunSettings, MpiexecSettings, OrterunSettings
from .pbsSettings import QsubBatchSettings
from .slurmSettings import SbatchSettings, SrunSettings

@@ -12,6 +12,8 @@
"BsubBatchSettings",
"JsrunSettings",
"MpirunSettings",
"MpiexecSettings",
"OrterunSettings",
"QsubBatchSettings",
"RunSettings",
"SbatchSettings",
101 changes: 96 additions & 5 deletions smartsim/settings/mpirunSettings.py
@@ -24,23 +24,30 @@
# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

import subprocess as sp
import re

from ..error import SSUnsupportedError
from ..log import get_logger
from .base import RunSettings

logger = get_logger(__name__)


class MpirunSettings(RunSettings):
def __init__(self, exe, exe_args=None, run_args=None, env_vars=None, **kwargs):
"""Settings to run job with ``mpirun`` command (OpenMPI)
class _OpenMPISettings(RunSettings):
"""Base class for all common arguments of OpenMPI run commands"""

def __init__(
self, exe, exe_args=None, run_command="", run_args=None, env_vars=None, **kwargs
):
"""Settings to format run job with an OpenMPI binary
Note that environment variables can be passed with a None
value to signify that they should be exported from the current
environment
Any arguments passed in the ``run_args`` dict will be converted
into ``mpirun`` arguments and prefixed with ``--``. Values of
command line arguments and prefixed with ``--``. Values of
None can be provided for arguments that do not have values.
:param exe: executable
@@ -55,7 +62,7 @@ def __init__(self, exe, exe_args=None, run_args=None, env_vars=None, **kwargs):
super().__init__(
exe,
exe_args,
run_command="mpirun",
run_command=run_command,
run_args=run_args,
env_vars=env_vars,
**kwargs,
@@ -236,3 +243,87 @@ def format_env_vars(self):
else:
formatted += ["-x", name]
return formatted


class MpirunSettings(_OpenMPISettings):
def __init__(self, exe, exe_args=None, run_args=None, env_vars=None, **kwargs):
"""Settings to run job with ``mpirun`` command (OpenMPI)
Note that environment variables can be passed with a None
value to signify that they should be exported from the current
environment
Any arguments passed in the ``run_args`` dict will be converted
into ``mpirun`` arguments and prefixed with ``--``. Values of
None can be provided for arguments that do not have values.
:param exe: executable
:type exe: str
:param exe_args: executable arguments, defaults to None
:type exe_args: str | list[str], optional
:param run_args: arguments for run command, defaults to None
:type run_args: dict[str, str], optional
:param env_vars: environment vars to launch job with, defaults to None
:type env_vars: dict[str, str], optional
"""
super().__init__(exe, exe_args, "mpirun", run_args, env_vars, **kwargs)

version_stmt = sp.check_output([self.run_command, "-V"]).decode()
if not re.match(r"mpirun\s\(Open MPI\)\s4\.\d+\.\d+", version_stmt):
logger.warning("Non-OpenMPI implementation of `mpirun` detected")


class MpiexecSettings(_OpenMPISettings):
def __init__(self, exe, exe_args=None, run_args=None, env_vars=None, **kwargs):
"""Settings to run job with ``mpiexec`` command (OpenMPI)
Note that environment variables can be passed with a None
value to signify that they should be exported from the current
environment
Any arguments passed in the ``run_args`` dict will be converted
into ``mpiexec`` arguments and prefixed with ``--``. Values of
None can be provided for arguments that do not have values.
:param exe: executable
:type exe: str
:param exe_args: executable arguments, defaults to None
:type exe_args: str | list[str], optional
:param run_args: arguments for run command, defaults to None
:type run_args: dict[str, str], optional
:param env_vars: environment vars to launch job with, defaults to None
:type env_vars: dict[str, str], optional
"""
super().__init__(exe, exe_args, "mpiexec", run_args, env_vars, **kwargs)

version_stmt = sp.check_output([self.run_command, "-V"]).decode()
if not re.match(r"mpiexec\s\(OpenRTE\)\s4\.\d+\.\d+", version_stmt):
logger.warning("Non-OpenMPI implementation of `mpiexec` detected")


class OrterunSettings(_OpenMPISettings):
def __init__(self, exe, exe_args=None, run_args=None, env_vars=None, **kwargs):
"""Settings to run job with ``orterun`` command (OpenMPI)
Note that environment variables can be passed with a None
value to signify that they should be exported from the current
environment
Any arguments passed in the ``run_args`` dict will be converted
into ``orterun`` arguments and prefixed with ``--``. Values of
None can be provided for arguments that do not have values.
:param exe: executable
:type exe: str
:param exe_args: executable arguments, defaults to None
:type exe_args: str | list[str], optional
:param run_args: arguments for run command, defaults to None
:type run_args: dict[str, str], optional
:param env_vars: environment vars to launch job with, defaults to None
:type env_vars: dict[str, str], optional
"""
super().__init__(exe, exe_args, "orterun", run_args, env_vars, **kwargs)

version_stmt = sp.check_output([self.run_command, "-V"]).decode()
if not re.match(r"orterun\s\(OpenRTE\)\s4\.\d+\.\d+", version_stmt):
logger.warning("Non-OpenMPI implementation of `orterun` detected")
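
The docstrings above describe two formatting rules: entries in ``run_args`` become command-line arguments prefixed with ``--`` (with ``None`` marking a flag that takes no value), and an environment variable whose value is ``None`` is exported from the current environment via ``-x NAME``. The following is a minimal standalone sketch of those rules, not SmartSim's implementation. The function names are hypothetical, and the ``-x NAME=value`` form for variables that carry a value is an assumption based on Open MPI's ``-x`` option, since that branch is not visible in this diff.

```python
def format_run_args(run_args):
    """Flatten a dict of run arguments into a list of CLI tokens.

    Each key is prefixed with "--"; a value of None means the flag
    takes no value and only the flag itself is emitted.
    """
    formatted = []
    for name, value in run_args.items():
        formatted.append("--" + name)
        if value is not None:
            formatted.append(str(value))
    return formatted


def format_env_vars(env_vars):
    """Flatten a dict of environment variables into "-x" arguments.

    A value of None exports the variable from the current environment
    ("-x NAME"); otherwise the value is passed explicitly
    ("-x NAME=value", an assumed form not shown in the diff).
    """
    formatted = []
    for name, value in env_vars.items():
        if value is None:
            formatted += ["-x", name]
        else:
            formatted += ["-x", f"{name}={value}"]
    return formatted
```

Keeping the formatting independent of the executable name is what lets a single base class serve ``mpirun``, ``mpiexec``, and ``orterun``.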
10 changes: 6 additions & 4 deletions smartsim/settings/settings.py
@@ -123,16 +123,18 @@ def create_run_settings(
"aprun": AprunSettings,
"srun": SrunSettings,
"mpirun": MpirunSettings,
"mpiexec": MpiexecSettings,
"orterun": OrterunSettings,
"jsrun": JsrunSettings,
}

# run commands supported by each launcher
# in order of suspected user preference
by_launcher = {
"slurm": ["srun", "mpirun"],
"pbs": ["aprun", "mpirun"],
"cobalt": ["aprun", "mpirun"],
"lsf": ["jsrun", "mpirun"],
"slurm": ["srun", "mpirun", "mpiexec"],
"pbs": ["aprun", "mpirun", "mpiexec"],
"cobalt": ["aprun", "mpirun", "mpiexec"],
"lsf": ["jsrun", "mpirun", "mpiexec"],
}

if launcher == "auto":
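
The ``by_launcher`` table lists each launcher's run commands "in order of suspected user preference". One way such a table could be consulted is to return the first listed command that is actually available. The sketch below illustrates that idea only; ``pick_run_command`` and its ``is_available`` parameter are hypothetical names, and the rest of ``create_run_settings`` is not shown in this hunk, so this should not be read as SmartSim's actual selection logic.

```python
import shutil

# Preference table copied from the hunk above.
BY_LAUNCHER = {
    "slurm": ["srun", "mpirun", "mpiexec"],
    "pbs": ["aprun", "mpirun", "mpiexec"],
    "cobalt": ["aprun", "mpirun", "mpiexec"],
    "lsf": ["jsrun", "mpirun", "mpiexec"],
}


def pick_run_command(launcher, is_available=None):
    """Return the first preferred run command that is available.

    ``is_available`` is a hypothetical hook (command name -> bool) so the
    lookup can be tested without touching PATH; by default it checks PATH
    with ``shutil.which``. Returns None if nothing suitable is found.
    """
    if is_available is None:
        def is_available(cmd):
            return shutil.which(cmd) is not None

    for command in BY_LAUNCHER.get(launcher, []):
        if is_available(command):
            return command
    return None
```

Because ``mpiexec`` is appended after ``mpirun`` in every list, it acts as a fallback under this scheme and is chosen only when the earlier commands are missing.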
5 changes: 5 additions & 0 deletions tests/test_configs/mpi_impl_stubs/intel2019/mpiexec
@@ -0,0 +1,5 @@
#!/bin/sh

echo "Intel(R) MPI Library for Linux* OS, Version 2019 Update 9 Build 20200923 (id: abd58e492)
Copyright 2003-2020, Intel Corporation.
"
5 changes: 5 additions & 0 deletions tests/test_configs/mpi_impl_stubs/intel2019/mpirun
@@ -0,0 +1,5 @@
#!/bin/sh

echo "Intel(R) MPI Library for Linux* OS, Version 2019 Update 9 Build 20200923 (id: abd58e492)
Copyright 2003-2020, Intel Corporation.
"
3 changes: 3 additions & 0 deletions tests/test_configs/mpi_impl_stubs/intel2019/orterun
@@ -0,0 +1,3 @@
#!/bin/sh

echo "Not a real orterun"
5 changes: 5 additions & 0 deletions tests/test_configs/mpi_impl_stubs/openmpi4/mpiexec
@@ -0,0 +1,5 @@
#!/bin/sh

echo "mpiexec (OpenRTE) 4.1.2
Report bugs to http://www.open-mpi.org/community/help/"
5 changes: 5 additions & 0 deletions tests/test_configs/mpi_impl_stubs/openmpi4/mpirun
@@ -0,0 +1,5 @@
#!/bin/sh

echo "mpirun (Open MPI) 4.1.2
Report bugs to http://www.open-mpi.org/community/help/"
5 changes: 5 additions & 0 deletions tests/test_configs/mpi_impl_stubs/openmpi4/orterun
@@ -0,0 +1,5 @@
#!/bin/sh

echo "orterun (OpenRTE) 4.1.2
Report bugs to http://www.open-mpi.org/community/help/"
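
The stub scripts above echo the version banners of Open MPI 4 and Intel MPI 2019, which allows the version check to be exercised without a real MPI installation. A minimal sketch of that check as a pure function over banner text follows. The function name ``looks_like_openmpi4`` and the lookup table are hypothetical, and the dots in the version number are escaped so that they match only a literal ``.``.

```python
import re

# Expected start of the `-V` banner for each Open MPI 4 executable.
_BANNER_PATTERNS = {
    "mpirun": r"mpirun\s\(Open MPI\)\s4\.\d+\.\d+",
    "mpiexec": r"mpiexec\s\(OpenRTE\)\s4\.\d+\.\d+",
    "orterun": r"orterun\s\(OpenRTE\)\s4\.\d+\.\d+",
}


def looks_like_openmpi4(command, version_output):
    """Return True if ``version_output`` starts with the Open MPI 4 banner
    expected for ``command``; unknown commands are never a match."""
    pattern = _BANNER_PATTERNS.get(command)
    if pattern is None:
        return False
    return re.match(pattern, version_output) is not None


# Banner text as echoed by the stub scripts in this commit.
OPENMPI_MPIRUN = (
    "mpirun (Open MPI) 4.1.2\n"
    "Report bugs to http://www.open-mpi.org/community/help/\n"
)
INTEL_MPIRUN = (
    "Intel(R) MPI Library for Linux* OS, Version 2019 Update 9 "
    "Build 20200923 (id: abd58e492)\n"
    "Copyright 2003-2020, Intel Corporation.\n"
)
```

Separating the text check from the ``subprocess`` call keeps it testable with plain strings; the settings classes in the diff log a warning rather than raising when the banner does not match.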
7 changes: 4 additions & 3 deletions tests/test_model.py
@@ -2,7 +2,8 @@

from smartsim import Experiment
from smartsim.error import EntityExistsError, SSUnsupportedError
from smartsim.settings import RunSettings, MpirunSettings
from smartsim.settings import RunSettings
from smartsim.settings.mpirunSettings import _OpenMPISettings


def test_register_incoming_entity_preexists():
@@ -26,10 +27,10 @@ def test_disable_key_prefixing():

def test_catch_colo_mpmd_model():
exp = Experiment("experiment", launcher="local")
rs = MpirunSettings("python", exe_args="sleep.py")
rs = _OpenMPISettings("python", exe_args="sleep.py")

# make it an mpmd model
rs_2 = MpirunSettings("python", exe_args="sleep.py")
rs_2 = _OpenMPISettings("python", exe_args="sleep.py")
rs.make_mpmd(rs_2)

model = exp.create_model("bad_colo_model", rs)