SmartRedis Integration Guide #214

Merged: 5 commits, Jun 22, 2022
1 change: 1 addition & 0 deletions doc/index.rst
@@ -40,6 +40,7 @@
:caption: SmartRedis

smartredis
sr_integration
sr_python_walkthrough
sr_cpp_walkthrough
sr_fortran_walkthrough
229 changes: 229 additions & 0 deletions doc/sr_integration.rst
@@ -0,0 +1,229 @@
*****************************
Integrating into a Simulation
*****************************

========
Overview
========

This document provides general guidelines for integrating the SmartRedis client
into an existing simulation codebase. Developers of these codebases will need to
identify the exact places to add the code; generally, SmartRedis calls need to
be added in only two places:

1. Initialization
2. Main loop

==============
Initialization
==============

+++++++++++++++++++
Creating the client
+++++++++++++++++++

The SmartRedis client must be initialized before it can be used to communicate
with the orchestrator. In the C++ and Python clients, this is done when
constructing a new client object. In the C and Fortran clients, an `initialize`
method must be called.

C++::

    #include "client.h"

    SmartRedis::Client client(use_cluster);

Python::

    from smartredis import Client

    client = Client(use_cluster)

Fortran::

    use smartredis_client, only : client_type
    include "enum_fortran.inc"

    type(client_type) :: client
    integer :: return_code

    return_code = client%initialize(use_cluster)
    if (return_code /= SRNoError) stop 'Error in initialization'

C::

    #include "client.h"
    #include "sr_enums.h"

    void* client = NULL;
    SRError return_code = SmartRedisCClient(use_cluster, &client);
    if (return_code != SRNoError) {
        return -1;
    }

All of these initializers have only one configurable parameter, indicated in
the examples above by the variable `use_cluster`. If this parameter is true,
the client expects to communicate with an orchestrator composed of three or
more shards.
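As a minimal sketch of how this flag might be chosen at runtime rather than
hard-coded (the environment-variable name and helper function here are
assumptions for illustration, not part of the SmartRedis API):

```python
import os

def cluster_flag(env_var: str = "SIM_USE_CLUSTER") -> bool:
    # Hypothetical helper: decide whether the orchestrator is clustered
    # from an environment variable, so the same executable can run against
    # both clustered and standalone deployments.
    return os.environ.get(env_var, "false").lower() in ("1", "true", "yes")

use_cluster = cluster_flag()
```

The flag could equally come from a command-line argument or an input file;
the point is simply to avoid recompiling when the deployment changes.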

++++++++++++++++++++++++++++++++++++++++++
(Parallel Programs): Creating unique names
++++++++++++++++++++++++++++++++++++++++++

For parallel applications, each rank or thread that communicates with the
orchestrator will likely need a unique prefix for names, to prevent one rank or
thread from inadvertently overwriting another's data. This prefix should be used
when creating the name of any tensor, dataset, or model that must be unique to a
given rank. (Note: for models run within SmartSim, additional prefixing may be
done by the client when running an ensemble and/or using multiple data sources.)
Any identifier can be used, though the MPI rank number (or an equivalent
identifier) is typically a convenient, unique choice.

C++::

    #include <iomanip>
    #include <sstream>

    std::ostringstream prefix_stream;
    prefix_stream << std::setw(6) << std::setfill('0') << rank_id << "_";
    const std::string name_prefix = prefix_stream.str();

Python::

    name_prefix = f"{rank_id:06d}_"

Fortran::

    character(len=7) :: name_prefix
    write(name_prefix,'(I6.6,A)') rank_id, '_'

C::

    char name_prefix[8];
    snprintf(name_prefix, sizeof(name_prefix), "%06d_", rank_id);
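In every language the pattern is the same: a fixed-width, zero-padded rank
identifier, so that names from different ranks never collide and sort
lexicographically. A small Python sketch (the helper name is ours, not part of
SmartRedis):

```python
def make_name_prefix(rank_id: int, width: int = 6) -> str:
    # Zero-pad so every rank produces a prefix of identical length
    return f"{rank_id:0{width}d}_"

# Each rank gets a distinct namespace for its tensors
tensor_names = [make_name_prefix(r) + "temperature" for r in (0, 3, 42)]
```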

++++++++++++++++++++++++++
Storing scripts and models
++++++++++++++++++++++++++

The last task that typically needs to be done is to store models or scripts
that will be used later in the simulation. When using a clustered orchestrator,
this only needs to be done by one client (unless each rank requires a different
model or script). MPI rank 0 is often a convenient choice to set models and
scripts.

C++::

    if (root_client) {
        client.set_model_from_file(model_name, model_file, backend, device);
    }

Python::

    if root_client:
        client.set_model_from_file(model_name, model_file, backend, device)

Fortran::

    if (root_client) then
        return_code = client%set_model_from_file(model_name, model_file, backend, device)
        if (return_code /= SRNoError) stop 'Error setting model'
    end if

C::

    if (root_client) {
        return_code = set_model_from_file(client, model_name, model_file, backend, device);
        if (return_code != SRNoError) {
            return -1;
        }
    }

=========
Main loop
=========

Within the main loop of the code (e.g. every timestep or iteration of a solver),
the developer typically uses the SmartRedis client methods to implement a workflow which
may include receiving data, sending data, running a script or model, and/or
retrieving a result. These workflows are covered extensively in the walkthroughs
for the Fortran, C++, and Python clients, and in the integrations with MOM6, OpenFOAM,
LAMMPS, and others.

Generally though, developers are advised to:

1. Find locations where file I/O would normally happen and either augment
or replace code to use the SmartRedis client and store the data in the
orchestrator
2. Use the `name_prefix` created during initialization to avoid accidental
   overwrites or reads by other clients
3. Use the SmartRedis `DataSet` type when clients represent decomposed
   subdomains, to make retrieval and use of the data more performant
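To make the call pattern concrete, here is a minimal Python sketch of a
main-loop workflow. `FakeClient` is a dictionary-backed stand-in (not the real
SmartRedis client, whose methods also take tensor dimensions and return error
codes); it only illustrates the put/run/unpack sequence:

```python
class FakeClient:
    """Dictionary-backed stand-in for a SmartRedis client, used only to
    illustrate the put/run/unpack call pattern (not the real API)."""

    def __init__(self):
        self._tensors = {}
        self._models = {}

    def put_tensor(self, name, tensor):
        self._tensors[name] = tensor

    def set_model(self, name, fn):
        self._models[name] = fn

    def run_model(self, model_name, inputs, outputs):
        # Apply the "model" to the first input and store the result
        result = self._models[model_name](self._tensors[inputs[0]])
        self._tensors[outputs[0]] = result

    def unpack_tensor(self, name):
        return self._tensors[name]


prefix = "000000_"  # per-rank prefix from the initialization step
client = FakeClient()
# A trivial "preconditioner" standing in for a stored ML model
client.set_model("preconditioner", lambda field: [0.5 * x for x in field])

temperature = [300.0, 310.0, 320.0]
client.put_tensor(prefix + "temperature", temperature)
client.run_model("preconditioner",
                 inputs=[prefix + "temperature"],
                 outputs=[prefix + "temperature_out"])
temperature = client.unpack_tensor(prefix + "temperature_out")
# temperature is now [150.0, 155.0, 160.0]
```

The real clients follow the same shape; only the signatures differ by language.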

============
Full example
============

Contributor:
    Would the typical reader be more comfortable with C++ than Fortran?

Collaborator (Author):
    We've been fielding more queries from the Fortran dev community than the
    C++ one. We can always include more examples later on down the line, but
    for a first pass I wanted to keep it in one language.

The following pseudocode demonstrates various aspects of instrumenting an
existing simulation code with SmartRedis. This code is representative of solving
the time-evolving heat equation, but we will augment it using an ML model to
provide a preconditioning step each iteration and post the state of the
simulation to the orchestrator. ::

    program main

      ! Initialize the model, set up MPI communications, read input files
      call initialize_model(temperature, number_of_timesteps)

      main_loop: do i = 1, number_of_timesteps

        ! Write the current state of the simulation to a file
        call write_current_state(temperature)

        ! Call a time integrator to step the temperature field forward
        call timestep_simulation(temperature)

      enddo

    end program main

Following the guidelines from above, the first step is to initialize the client
and create a unique identifier for the given processor. This should be done
within roughly the same portion of the code where the rest of the model
performs the initialization of other components. ::

    ! Import SmartRedis modules
    use smartredis_client, only : client_type

    ! Declare a new client variable and a string used to build unique names
    type(client_type) :: smartredis_client
    character(len=7) :: name_prefix
    integer :: mpi_rank, mpi_code, smartredis_code

    ! Note: use_cluster is an additional runtime argument for SmartRedis
    call initialize_model(temperature, number_of_timesteps, use_cluster)
    smartredis_code = smartredis_client%initialize(use_cluster)
    call MPI_Comm_rank(MPI_COMM_WORLD, mpi_rank, mpi_code)
    ! Build the prefix for all tensors set by this rank
    write(name_prefix,'(I6.6,A)') mpi_rank, '_'

    ! All ranks use the same machine learning model, so there is no need to
    ! add the prefix to the model name
    if (mpi_rank == 0) smartredis_code = smartredis_client%set_model_from_file( &
        "example_model_name", "path/to/model.pt", "TORCH", "gpu")

Next, add the calls in the main loop to send the temperature to the orchestrator ::

    character(len=30), dimension(1) :: model_input, model_output
    integer :: return_code

    main_loop: do i = 1, number_of_timesteps

      ! Write the current state of the simulation to a file
      call write_current_state(temperature)

      ! Send the temperature field to the orchestrator
      model_input(1) = name_prefix//"temperature"
      model_output(1) = name_prefix//"temperature_out"
      return_code = smartredis_client%put_tensor(model_input(1), temperature)

      ! Run the machine learning model
      return_code = smartredis_client%run_model("example_model_name", model_input, model_output)
      ! The following line overwrites the prognostic temperature array
      return_code = smartredis_client%unpack_tensor(model_output(1), temperature)

      ! Call a time integrator to step the temperature field forward
      call timestep_simulation(temperature)

    enddo

This model will now use the client every timestep to put a
temperature array in the orchestrator, instruct the orchestrator to call
a machine learning model for prediction/inference, and unpack the resulting
inference into the existing temperature array. For more complex examples,
please see some of the integrations in the SmartSim Zoo or feel free to
contact the team at [email protected]
16 changes: 4 additions & 12 deletions tutorials/getting_started/getting_started.ipynb
@@ -13,10 +13,9 @@
" - Running and Communicating with the Orchestrator\n",
" - Ensembles using SmartRedis\n",
"\n",
"\n",
"## Experiments and Models \n",
"\n",
"`Experiment`s are how users define workflows in SmartSim. The `Experiment` is used to create `Model` instances which represent applications, scripts, or largely any program. An experiment can start and stop a `Model` and monitor execution.\n"
"`Experiment`s are how users define workflows in SmartSim. The `Experiment` is used to create `Model` instances which represent applications, scripts, or generally a program. An experiment can start and stop a `Model` and monitor execution.\n"
]
},
{
@@ -32,7 +31,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The next step is to initialize an `Experiment` instance. The `Experiment` must be provided a name. This can be any string, but it is best practice to give it a meaningful name as a broad title for what types of models the experiment will be supervising. For our purposes, our `Experiment` will be named `\"getting-started\"`.\n",
"The next step is to initialize an `Experiment` instance. The `Experiment` must be provided a name. This name can be any string, but it is best practice to give it a meaningful name as a broad title for what types of models the experiment will be supervising. For our purposes, our `Experiment` will be named `\"getting-started\"`.\n",
"\n",
"The `Experiment` also needs to have a `launcher` specified. Launchers provide SmartSim the ability to construct and execute complex workloads on HPC systems with schedulers (workload managers) like Slurm, or PBS. SmartSim currently supports\n",
" * `slurm`\n",
@@ -65,7 +64,7 @@
"\n",
"Our first `Model` will simply print `hello` using the shell command `echo`.\n",
"\n",
"`Experiment.create_run_settings` is used to create a `RunSettings` instance for our `Model`. `RunSettings` help parameterize *how* a `Model` should be executed provided the system and available computational resources.\n",
"`Experiment.create_run_settings` is used to create a `RunSettings` instance for our `Model`. `RunSettings` describe *how* a `Model` should be executed provided the system and available computational resources.\n",
"\n",
"`create_run_settings` is a factory method that will instantiate a `RunSettings` object of the appropriate type based on the `run_command` argument (i.e. `mpirun`, `aprun`, `srun`, etc). The default argument of `auto` will attempt to choose a `run_command` based on the available system software and the launcher specified in the experiment. If `run_command=None` is provided, the command will be launched without one."
]
@@ -311,7 +310,7 @@
"source": [
"## Ensembles\n",
"\n",
"In the previous example, the two `Model` instances were created separately. There are more convenient ways of doing this, through `Ensemble`s. `Ensemble`s are groups of `Model` instances that can be treated as a single reference. We start by specifying `RunSettings` similar to how we did with our `Model`s."
"In the previous example, the two `Model` instances were created separately. The `Ensemble` SmartSim object is a more convenient way of setting up multiple models, potentially with different configurations. `Ensemble`s are groups of `Model` instances that can be treated as a single reference. We start by specifying `RunSettings` similar to how we did with our `Model`s."
]
},
{
@@ -1138,13 +1137,6 @@
"source": [
"exp.stop(db)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
Contributor:
    Can you explain why you deleted this stuff?

Collaborator (Author):
    It was an empty cell
}
],
"metadata": {
9 changes: 5 additions & 4 deletions tutorials/ml_inference/Inference-in-SmartSim.ipynb
@@ -9,7 +9,7 @@
"\n",
"This tutorial shows how to use trained PyTorch, TensorFlow, and ONNX (format) models, written in Python, directly in HPC workloads written in Fortran, C, C++ and Python.\n",
"\n",
"The examples simulation here is written in Python for brevity, however, the inference API in SmartRedis is the same (besides extra parameters for compiled langauges) across all clients. Examples comparing the usage of the same model across SmartRedis client langauges can be found (put in link).\n"
"The example simulation here is written in Python for brevity; however, the inference API in SmartRedis is the same (besides extra parameters for compiled languages) across all clients. \n"
]
},
{
@@ -503,7 +503,7 @@
"### Setting TensorFlow and Keras Models\n",
"\n",
"After a model is created (trained or not), the graph of the model is\n",
"frozen saved to file so the client method `client.set_model_from_file`\n",
"frozen and saved to file so the client method `client.set_model_from_file`\n",
"can load it into the database.\n",
"\n",
"SmartSim includes a utility to freeze the graph of a TensorFlow or Keras model in\n",
@@ -607,7 +607,7 @@
"\n",
"\n",
"K-means clustering is an unsupervised ML algorithm. It is used to categorize data points\n",
"into f groups (\"clusters\"). Scikit Learn has a built in implementation of K-means clustering\n",
"into functional groups (\"clusters\"). Scikit Learn has a built in implementation of K-means clustering\n",
"and it is easily converted to ONNX for use with SmartSim through \n",
"[skl2onnx.to_onnx](http://onnx.ai/sklearn-onnx/auto_examples/plot_convert_syntax.html)\n",
"\n",
@@ -769,7 +769,8 @@
"on the same compute hosts as a Model instance defined by the user. In this\n",
"deployment, the database is not connected together in a cluster and each shard\n",
"of the database is addressed individually by the processes running on that compute\n",
"host.\n",
"host. This is particularly important for GPU-intensive workloads which require\n",
"frequent communication with the database.\n",
"\n",
"<img src=\"https://www.craylabs.org/docs/_images/co-located-orc-diagram.png\" alt=\"lattice\" width=\"600\"/>\n"
]
24 changes: 13 additions & 11 deletions tutorials/online_analysis/lattice/online_analysis.ipynb

Large diffs are not rendered by default.