Merge branch 'develop' of https://github.com/CrayLabs/SmartSim into deprecate
amandarichardsonn committed Mar 13, 2024
2 parents 9dfd3c8 + 3c271f3 commit bd07742
Showing 75 changed files with 6,004 additions and 842 deletions.
45 changes: 36 additions & 9 deletions doc/api/smartsim_api.rst
@@ -1,17 +1,15 @@

*************
SmartSim API
*************


.. _experiment_api:

Experiment
==========


.. currentmodule:: smartsim.experiment

.. _exp_init:
.. autosummary::

Experiment.__init__
@@ -34,6 +32,8 @@ Experiment
:members:


.. _settings-info:

Settings
========

@@ -377,23 +377,47 @@ container.
:undoc-members:
:members:

.. _orc_api:

Orchestrator
============

.. currentmodule:: smartsim.database

.. _orc_api:
.. autosummary::

Orchestrator.__init__
Orchestrator.db_identifier
Orchestrator.num_shards
Orchestrator.db_nodes
Orchestrator.hosts
Orchestrator.reset_hosts
Orchestrator.remove_stale_files
Orchestrator.get_address
Orchestrator.is_active
Orchestrator.set_cpus
Orchestrator.set_walltime
Orchestrator.set_hosts
Orchestrator.set_batch_arg
Orchestrator.set_run_arg
Orchestrator.enable_checkpoints
Orchestrator.set_max_memory
Orchestrator.set_eviction_strategy
Orchestrator.set_max_clients
Orchestrator.set_max_message_size
Orchestrator.set_db_conf

Orchestrator
------------

.. _orchestrator_api:

.. autoclass:: Orchestrator
:members:
:inherited-members:
:undoc-members:

.. _model_api:

Model
=====
@@ -417,17 +441,17 @@ Model
Model.disable_key_prefixing
Model.query_key_prefixing

Model
-----

.. autoclass:: Model
:members:
:show-inheritance:
:inherited-members:

.. _ensemble_api:

Ensemble
========


.. currentmodule:: smartsim.entity.ensemble

.. autosummary::
@@ -443,6 +467,11 @@ Ensemble
Ensemble.query_key_prefixing
Ensemble.register_incoming_entity

Ensemble
--------

.. _ensemble_api:

.. autoclass:: Ensemble
:members:
:show-inheritance:
@@ -461,7 +490,6 @@ SmartSim includes built-in utilities for supporting TensorFlow, Keras, and Pytor
TensorFlow
----------


SmartSim includes built-in utilities for supporting TensorFlow and Keras in training and inference.

.. currentmodule:: smartsim.ml.tf.utils
@@ -510,7 +538,6 @@ SmartSim includes built-in utilities for supporting PyTorch in training and infe
Slurm
=====


.. currentmodule:: smartsim.wlm.slurm

.. autosummary::
127 changes: 127 additions & 0 deletions doc/batch_settings.rst
@@ -0,0 +1,127 @@
.. _batch_settings_doc:

**************
Batch Settings
**************
========
Overview
========
SmartSim can launch entities (``Model`` or ``Ensemble``) as batch jobs through
the ``BatchSettings`` base class. The base class is not intended for direct use;
its child classes provide batch launching tailored to specific workload managers
(WLMs). Each SmartSim `launcher` interfaces with the ``BatchSettings`` subclass
for a system's WLM:

- The Slurm `launcher` supports:

  - :ref:`SbatchSettings<sbatch_api>`

- The PBS Pro `launcher` supports:

  - :ref:`QsubBatchSettings<qsub_api>`

- The LSF `launcher` supports:

  - :ref:`BsubBatchSettings<bsub_api>`

.. note::
The local `launcher` does not support batch jobs.

Once a ``BatchSettings`` instance is created, the methods of its child class
can be used to further configure the batch settings for a job.

In the following :ref:`Examples<batch_settings_ex>` subsection, we demonstrate the initialization
and configuration of a batch settings object.

.. _batch_settings_ex:

========
Examples
========
An instance of a ``BatchSettings`` child class is created with the ``Experiment.create_batch_settings``
factory method. When initializing the ``Experiment`` at the beginning of the Python driver script,
the user may specify a `launcher` argument. SmartSim then registers or detects the `launcher` and returns
an instance of the corresponding child class when ``Experiment.create_batch_settings`` is called. This
design keeps SmartSim driver scripts that use ``BatchSettings`` portable between systems:
only the `launcher` specified during ``Experiment`` initialization needs to change.
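
The dispatch idea described above can be sketched in plain Python. This is a
simplified, hypothetical model and not SmartSim's implementation: the names
``BatchSettingsSketch``, ``BATCH_COMMANDS``, and ``create_batch_settings_sketch``
are invented for illustration, and the only assumption carried over from the
settings classes is the batch command each one wraps (``sbatch``, ``qsub``, ``bsub``).

```python
# Hypothetical, simplified model of launcher-based dispatch; none of this is
# SmartSim code, and all names below are illustrative.
from dataclasses import dataclass, field


@dataclass
class BatchSettingsSketch:
    """Stand-in for a WLM-specific batch settings object."""

    batch_cmd: str
    nodes: int
    time: str
    batch_args: dict = field(default_factory=dict)


# Map each launcher name to the batch command its settings class wraps.
BATCH_COMMANDS = {"slurm": "sbatch", "pbs": "qsub", "lsf": "bsub"}


def create_batch_settings_sketch(launcher: str, nodes: int, time: str) -> BatchSettingsSketch:
    """Return settings for the given launcher, rejecting unsupported ones."""
    if launcher not in BATCH_COMMANDS:
        # The local launcher, for example, has no batch support.
        raise ValueError(f"launcher {launcher!r} does not support batch jobs")
    return BatchSettingsSketch(BATCH_COMMANDS[launcher], nodes, time)


# The driver code is identical on every system; only the launcher string changes.
for name in ("slurm", "pbs", "lsf"):
    settings = create_batch_settings_sketch(name, nodes=1, time="10:00:00")
    print(name, "->", settings.batch_cmd)
```

Keeping the launcher as the single point of variation is what lets the rest of a
driver script stay unchanged when it moves between machines.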

Below are examples of how to initialize a ``BatchSettings`` object per `launcher`.

.. tabs::

    .. group-tab:: Slurm

        To instantiate the ``SbatchSettings`` object, which interfaces with the Slurm job scheduler, specify
        `launcher="slurm"` when initializing the ``Experiment``. Upon calling ``create_batch_settings``,
        SmartSim will detect the job scheduler and return the appropriate batch settings object.

        .. code-block:: python

            from smartsim import Experiment

            # Initialize the experiment and provide launcher Slurm
            exp = Experiment("name-of-experiment", launcher="slurm")
            # Initialize a SbatchSettings object
            sbatch_settings = exp.create_batch_settings(nodes=1, time="10:00:00")
            # Set the account for the Slurm batch job
            sbatch_settings.set_account("12345-Cray")
            # Set the partition for the Slurm batch job
            sbatch_settings.set_queue("default")

        The initialized ``SbatchSettings`` instance can now be passed to a SmartSim entity
        (``Model`` or ``Ensemble``) via the `batch_settings` argument of ``Experiment.create_model``
        or ``Experiment.create_ensemble``.

        .. note::
            If `launcher="auto"`, SmartSim will detect that the ``Experiment`` is running on a Slurm-based
            machine and set the launcher to `"slurm"`.

    .. group-tab:: PBS Pro

        To instantiate the ``QsubBatchSettings`` object, which interfaces with the PBS Pro job scheduler, specify
        `launcher="pbs"` when initializing the ``Experiment``. Upon calling ``create_batch_settings``,
        SmartSim will detect the job scheduler and return the appropriate batch settings object.

        .. code-block:: python

            from smartsim import Experiment

            # Initialize the experiment and provide launcher PBS Pro
            exp = Experiment("name-of-experiment", launcher="pbs")
            # Initialize a QsubBatchSettings object
            qsub_batch_settings = exp.create_batch_settings(nodes=1, time="10:00:00")
            # Set the account for the PBS Pro batch job
            qsub_batch_settings.set_account("12345-Cray")
            # Set the queue for the PBS Pro batch job
            qsub_batch_settings.set_queue("default")

        The initialized ``QsubBatchSettings`` instance can now be passed to a SmartSim entity
        (``Model`` or ``Ensemble``) via the `batch_settings` argument of ``Experiment.create_model``
        or ``Experiment.create_ensemble``.

        .. note::
            If `launcher="auto"`, SmartSim will detect that the ``Experiment`` is running on a PBS Pro-based
            machine and set the launcher to `"pbs"`.

    .. group-tab:: LSF

        To instantiate the ``BsubBatchSettings`` object, which interfaces with the LSF job scheduler, specify
        `launcher="lsf"` when initializing the ``Experiment``. Upon calling ``create_batch_settings``,
        SmartSim will detect the job scheduler and return the appropriate batch settings object.

        .. code-block:: python

            from smartsim import Experiment

            # Initialize the experiment and provide launcher LSF
            exp = Experiment("name-of-experiment", launcher="lsf")
            # Initialize a BsubBatchSettings object
            bsub_batch_settings = exp.create_batch_settings(nodes=1, time="10:00:00", batch_args={"ntasks": 1})
            # Set the account for the LSF batch job
            bsub_batch_settings.set_account("12345-Cray")
            # Set the queue for the LSF batch job
            bsub_batch_settings.set_queue("default")

        The initialized ``BsubBatchSettings`` instance can now be passed to a SmartSim entity
        (``Model`` or ``Ensemble``) via the `batch_settings` argument of ``Experiment.create_model``
        or ``Experiment.create_ensemble``.

        .. note::
            If `launcher="auto"`, SmartSim will detect that the ``Experiment`` is running on an LSF-based
            machine and set the launcher to `"lsf"`.

.. warning::
    Initialization values (e.g., `nodes`, `time`, etc.) overwrite the same arguments in `batch_args` if present.
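
The precedence rule in the warning can be illustrated with a small standalone
sketch. ``merge_batch_args`` is a hypothetical helper written for this example,
not a SmartSim function, and the way SmartSim stores these values internally
may differ; the sketch only shows explicit initialization values winning over
entries with the same key in `batch_args`.

```python
# Hypothetical helper illustrating the precedence rule; not SmartSim's code.
def merge_batch_args(batch_args=None, **init_values):
    """Combine batch_args with explicit values; explicit values win."""
    merged = dict(batch_args or {})  # copy so the caller's dict is untouched
    for key, value in init_values.items():
        if value is not None:  # unset initialization values override nothing
            merged[key] = value
    return merged


# "nodes" appears in both places, so the initialization value (1) replaces 4.
user_args = {"nodes": 4, "exclusive": None}
merged = merge_batch_args(user_args, nodes=1, time="10:00:00")
print(merged)
```

In practice this means a value such as `nodes` should be set in one place only,
so the effective setting is never in doubt.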
4 changes: 4 additions & 0 deletions doc/changelog.rst
@@ -18,12 +18,15 @@ To be released at some future point in time

Description

- SmartSim Documentation refactor
- Update the version of Redis from `7.0.4` to `7.2.4`
- Update Experiment API typing
- Fix publishing of development docs

Detailed Notes

- Implemented new structure of SmartSim documentation. Added examples,
  images, and further detail on SmartSim components.
- Update Redis version to `7.2.4`. This change fixes an issue in the Redis
build scripts causing failures on Apple Silicon hosts. (SmartSim-PR507_)
- The container which builds the documentation for every merge to develop
@@ -33,6 +36,7 @@ Detailed Notes
(SmartSim-PR504_)
- Update the generic `t.Any` typehints in Experiment API. (SmartSim-PR501_)

.. _SmartSim-PR463: https://github.com/CrayLabs/SmartSim/pull/463
.. _SmartSim-PR507: https://github.com/CrayLabs/SmartSim/pull/507
.. _SmartSim-PR504: https://github.com/CrayLabs/SmartSim/pull/504
.. _SmartSim-PR501: https://github.com/CrayLabs/SmartSim/pull/501
30 changes: 27 additions & 3 deletions doc/conf.py
@@ -52,9 +52,11 @@
'breathe',
'nbsphinx',
'sphinx_copybutton',
'sphinx_tabs.tabs'
'sphinx_tabs.tabs',
'sphinx_design',
]

autodoc_mock_imports = ["smartredis.smartredisPy"]
suppress_warnings = ['autosectionlabel']

# Add any paths that contain templates here, relative to this directory.
@@ -82,7 +84,6 @@
# a list of builtin themes.
html_theme = "sphinx_book_theme"


# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
@@ -104,8 +105,31 @@
# white background with dark themes. If sphinx-tabs updates its
# static/tabs.css, this may need to be updated.
html_css_files = ['custom_tab_style.css']

autoclass_content = 'both'
add_module_names = False

nbsphinx_execute = 'never'

from inspect import getsourcefile

# Get path to directory containing this file, conf.py.
DOCS_DIRECTORY = os.path.dirname(os.path.abspath(getsourcefile(lambda: 0)))

def ensure_pandoc_installed(_):
    import pypandoc

    # Download pandoc if necessary. If pandoc is already installed and on
    # the PATH, the installed version will be used. Otherwise, we will
    # download a copy of pandoc into docs/bin/ and add that to our PATH.
    pandoc_dir = os.path.join(DOCS_DIRECTORY, "bin")
    # Add dir containing pandoc binary to the PATH environment variable
    if pandoc_dir not in os.environ["PATH"].split(os.pathsep):
        os.environ["PATH"] += os.pathsep + pandoc_dir
    pypandoc.ensure_pandoc_installed(
        targetfolder=pandoc_dir,
        delete_installer=True,
    )


def setup(app):
    app.connect("builder-inited", ensure_pandoc_installed)
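
The PATH handling in ``ensure_pandoc_installed`` appends a directory only when
it is not already listed, so repeated Sphinx builds do not grow the variable.
A minimal standalone sketch of that pattern follows. ``add_to_path`` and the
example directories are hypothetical, and unlike the hook above the sketch
takes the environment mapping as a parameter and tolerates a missing ``PATH``
entry.

```python
import os


def add_to_path(directory, environ):
    """Append directory to environ["PATH"] unless it is already listed."""
    entries = environ.get("PATH", "").split(os.pathsep)
    if directory not in entries:
        # Drop empty entries so a missing PATH does not yield a leading separator.
        environ["PATH"] = os.pathsep.join([e for e in entries if e] + [directory])
    return environ["PATH"]


# A plain dict stands in for os.environ so the example has no side effects.
env = {"PATH": os.pathsep.join(["/usr/bin", "/bin"])}
add_to_path("/docs/bin", env)
add_to_path("/docs/bin", env)  # second call finds the entry and changes nothing
print(env["PATH"])
```

Passing the mapping in explicitly keeps the helper easy to test; the hook in
``conf.py`` mutates ``os.environ`` directly because the pandoc binary must be
visible to the rest of the build process.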