SmartRedis Integration Guide #214

Merged (5 commits) on Jun 22, 2022
132 changes: 75 additions & 57 deletions doc/sr_integration.rst
========
Overview
========

This document provides some general guidelines to integrate the SmartRedis client
into existing simulation codebases. Developers of these simulation codebases will
need to identify the exact places to add the code; generally, SmartRedis calls
will only need to be added in two places:

1. Initialization
2. Main loop
+++++++++++++++++++
Creating the client
+++++++++++++++++++

The SmartRedis client must be initialized before it can be used to communicate
with the orchestrator. In the C++ and Python versions of the client, this is done
when creating a new client. In the C and Fortran clients, an `initialize`
method must be called.

C++::

Python::

Fortran::

    use smartredis_client, only : client_type
    include "enum_fortran.inc"

    type(client_type) :: client
    return_code = client%initialize(use_cluster)
    if (return_code /= SRNoError) stop 'Error in initialization'

C::

    #include "client.h"
    #include "sr_enums.h"

    void* client = NULL;
    SRError return_code = SmartRedisCClient(use_cluster, &client);
    if (return_code != SRNoError) {
        return -1;
    }

All these methods have only one configurable parameter, indicated in the
above cases by the variable `use_cluster`. If this parameter is true,
then the client expects to be able to communicate with an orchestrator with
three or more shards.

++++++++++++++++++++++++++++++++++++++++++
(Parallel Programs): Creating unique names
++++++++++++++++++++++++++++++++++++++++++

For parallel applications, each rank or thread that is communicating with
the orchestrator will likely need to create a unique prefix for names to prevent
another rank or thread from inadvertently overwriting data. This prefix should be
used when creating the name of any tensor, dataset, or model that needs to be
unique to a given rank. (Note: for models run within SmartSim, additional
prefixing may be done by the client when running an ensemble and/or using
multiple data sources.) Any identifier can be used, though the MPI rank number
(or equivalent identifier) is typically a useful, unique choice.

C++::

    const std::string name_prefix = std::format("{:06}_", rank_id);

Python::

    name_prefix = f"{rank_id:06d}_"

Fortran::

    character(len=7) :: name_prefix
    write(name_prefix,'(I6.6,A)') rank_id, '_'

C::

    char name_prefix[8];
    sprintf(name_prefix, "%06d_", rank_id);
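Wrapped in a helper, the same zero-padded prefix logic looks like the following Python sketch (the `make_name_prefix` helper is illustrative, not part of the SmartRedis API):

```python
def make_name_prefix(rank_id: int) -> str:
    """Build a zero-padded, rank-unique prefix for tensor/dataset/model names."""
    # Six digits matches the fixed-width formats in the snippets above.
    return f"{rank_id:06d}_"

# e.g. prefixing a tensor name for MPI rank 42
tensor_name = make_name_prefix(42) + "temperature"  # → "000042_temperature"
```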

++++++++++++++++++++++++++
Storing scripts and models
++++++++++++++++++++++++++

The last task that typically needs to be done is to store models or scripts
that will be used later in the simulation. When using a clustered orchestrator,
this only needs to be done by one client (unless each rank requires a different
model or script). MPI rank 0 is often a convenient choice for setting models and
scripts.

C++::

    if (root_client) {
        client.set_model_from_file(model_name, model_file, backend, device);
    }

Python::

    if root_client:
        client.set_model_from_file(model_name, model_file, backend, device)

Fortran::

    if (root_client) then
        return_code = client%set_model_from_file(model_name, model_file, backend, device)
        if (return_code /= SRNoError) stop 'Error setting model'
    endif

C::

    if (root_client) {
        return_code = set_model_from_file(client, model_name, model_file, backend, device);
        if (return_code != SRNoError) {
            return -1;
        }
    }
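The root-only pattern above can be exercised without a live orchestrator. The following Python sketch uses a stand-in client (`FakeClient` is purely illustrative; only the `set_model_from_file` call shape mirrors the real API):

```python
class FakeClient:
    """Stand-in for a SmartRedis client that records model uploads."""
    def __init__(self):
        self.models = {}

    def set_model_from_file(self, name, path, backend, device):
        self.models[name] = (path, backend, device)

def register_model(client, rank):
    # Only the root rank uploads the shared model; every rank can still run it.
    if rank == 0:
        client.set_model_from_file("example_model_name", "path/to/model.pt",
                                   "TORCH", "CPU")

client = FakeClient()
for rank in range(4):  # simulate four MPI ranks sharing one orchestrator
    register_model(client, rank)
# Exactly one upload occurs, no matter how many ranks participate.
```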

=========
Main loop
=========

Within the main loop of the code (e.g. every timestep or iteration of a solver),
the developer typically uses the SmartRedis client methods to implement a workflow
which may include receiving data, sending data, running a script or model, and/or
retrieving a result. These workflows are covered extensively in the walkthroughs
for the Fortran, C++, and Python clients and in the integrations with MOM6, OpenFOAM,
LAMMPS, and others.

Generally though, developers are advised to:

1. Find locations where file I/O would normally happen and either augment
   or replace code to use the SmartRedis client and store the data in the
   orchestrator
2. Use the `name_prefix` created during initialization to avoid accidental
   writes/reads from different clients
3. Use the SmartSim `dataset` type when using clients representing decomposed
   subdomains to make the retrieval/use of the data more performant
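The guidelines above can be sketched as one timestep of a main loop. This Python sketch substitutes a stand-in client (`FakeClient` and its toy halving model are purely illustrative; only the `put_tensor`/`run_model`/`unpack_tensor` names mirror the real client methods):

```python
class FakeClient:
    """Stand-in for a SmartRedis client, so the loop structure can be shown
    without a running orchestrator."""
    def __init__(self):
        self.tensors = {}

    def put_tensor(self, name, data):
        self.tensors[name] = data

    def run_model(self, model_name, inputs, outputs):
        # Toy "model": halve every input value (purely illustrative).
        for src, dst in zip(inputs, outputs):
            self.tensors[dst] = [x / 2 for x in self.tensors[src]]

    def unpack_tensor(self, name):
        # The real client copies into an existing array; returning is simpler here.
        return self.tensors[name]

def timestep(client, prefix, temperature):
    # 1. Send the state, 2. run the model, 3. retrieve the result.
    in_name, out_name = prefix + "temperature", prefix + "temperature_out"
    client.put_tensor(in_name, temperature)
    client.run_model("example_model_name", [in_name], [out_name])
    return client.unpack_tensor(out_name)

client = FakeClient()
result = timestep(client, "000000_", [300.0, 400.0])  # → [150.0, 200.0]
```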
============
Full example
============

The following pseudocode is used to demonstrate various aspects of instrumenting an
existing simulation code with SmartRedis. This code is representative of solving
the time-evolving heat equation, but we will augment it using an ML model to
provide a preconditioning step each iteration and post the state of the simulation
to the orchestrator. ::

    program main

        ! Initialize the model, setup MPI, communications, read input files
        call initialize_model(temperature, number_of_timesteps)

        main_loop: do i=1,number_of_timesteps


Following the guidelines from above, the first step is to initialize the client
and create a unique identifier for the given processor. This should be done
within roughly the same portion of the code where the rest of the model
performs the initialization of other components. ::

    ! Import SmartRedis modules
    use smartredis_client, only : client_type

    ! Declare a new variable called client and a string to hold a unique
    ! prefix for names
    type(client_type) :: smartredis_client
    character(len=7) :: name_prefix
    integer :: mpi_rank, mpi_code, smartredis_code

    ! Note adding use_cluster as an additional runtime argument for SmartRedis
    call initialize_model(temperature, number_of_timesteps, use_cluster)
    smartredis_code = smartredis_client%initialize(use_cluster)
    call MPI_Comm_rank(MPI_COMM_WORLD, mpi_rank, mpi_code)
    ! Build the prefix for all tensors set in this model
    write(name_prefix,'(I6.6,A)') mpi_rank, '_'

    ! Assume all ranks will use the same machine learning model, so no need to
    ! add the prefix to the model name
    if (mpi_rank==0) smartredis_code = smartredis_client%set_model_from_file("example_model_name", "path/to/model.pt", "TORCH", "gpu")

Next, add the calls in the main loop to send the temperature to the orchestrator ::

    character(len=30), dimension(1) :: model_input, model_output

    main_loop: do i=1,number_of_timesteps

        ! Write the current state of the simulation to a file
        call write_current_state(temperature)
        model_input(1) = name_prefix//"temperature"
        model_output(1) = name_prefix//"temperature_out"
        smartredis_code = smartredis_client%put_tensor(model_input(1), temperature)

        ! Run the machine learning model
        smartredis_code = smartredis_client%run_model("example_model_name", model_input, model_output)
        ! The following line overwrites the prognostic temperature array
        smartredis_code = smartredis_client%unpack_tensor(model_output(1), temperature)


    enddo

This model will now use the client every timestep to put the
temperature array in the orchestrator, instruct the orchestrator to call
a machine learning model for prediction/inference, and unpack the resulting
inference into the existing temperature array. For more complex examples,
please see some of the integrations in the SmartSim Zoo or feel free to
contact the team at [email protected]
2 changes: 1 addition & 1 deletion tutorials/getting_started/getting_started.ipynb
"cell_type": "markdown",
"metadata": {},
"source": [
"The next step is to initialize an `Experiment` instance. The `Experiment` must be provided a name. This name can be any string, but it is best practice to give it a meaningful name as a broad title for what types of models the experiment will be supervising. For our purposes, our `Experiment` will be named `\"getting-started\"`.\n",
"\n",
"The `Experiment` also needs to have a `launcher` specified. Launchers provide SmartSim the ability to construct and execute complex workloads on HPC systems with schedulers (workload managers) like Slurm, or PBS. SmartSim currently supports\n",
" * `slurm`\n",
2 changes: 1 addition & 1 deletion tutorials/ml_inference/Inference-in-SmartSim.ipynb
"\n",
"This tutorial shows how to use trained PyTorch, TensorFlow, and ONNX (format) models, written in Python, directly in HPC workloads written in Fortran, C, C++ and Python.\n",
"\n",
"The example simulation here is written in Python for brevity; however, the inference API in SmartRedis is the same (besides extra parameters for compiled languages) across all clients. \n"
]
},
{