Maestro is a simple orchestration tool implemented with pure Python and its standard libraries. It executes a series of steps defined in a workflow specification.
DAGs can be defined as a JSON file with four main fields:
{
"name": "...",
"inputs": {"..."},
"steps": ["..."],
"outputs": {"..."}
}
All of them are relatively self-explanatory: you must define the workflow's name, its inputs (as key/value pairs, so they can be referenced later), its steps, and its outputs (which are also named and must be reference variables or literal values).
Steps inside the "steps"
array must be defined with the following specification:
{
"name": "...",
"type": "...",
"path": "...",
"depends_on": ["..."],
"inputs": {"..."},
"outputs": ["..."]
}
The "type"
field informs which kind of step it is, while the "path"
field is used by
the orchestrator to resolve the executable's location. The "depends_on"
field is optional
and, if passed, will be used to block the step's execution while the required steps are
not yet complete.
Currently the only supported step type is python_step
. In this type, the "path"
field
must be a fully qualified Python function path. For instance, the function pow()
present
in Python's math
module would be called with math.pow
. The inputs are passed as
sequential arguments for the function. If you want to use user-defined functions, you must
organize your project as a Python module and execute Maestro outside of it (as you can see
in the Examples section).
You can pass an entity's input/output to another step using the reference format
{{ <entity>.<"inputs" or "outputs">.<variable> }}
. For instance, the following
spec will use the value of x
specified as an input of a workflow named my_workflow
:
{
"inputs": {
"x": "{{ my_workflow.inputs.x }}"
},
}
This means that names should be unique across an entire specification, so that reference
variables can be correctly resolved. Also, if a step requires an output of another step,
you must explictly define the "depends_on"
field.
Currently there is no package specification for the project (i.e. you can't install it
using pip), but since Maestro is implemented using only standard libraries, you can
"install" it by adding the maestro/
directory to a directory present in your Python PATH.
Maestro was developed and tested in Python 3.8.
To run the orchestrator, you must execute its main module while passing the path (absolute or relative) to a JSON file containing the workflow specification:
python -m maestro [WORKFLOW_PATH]
Inside the examples/
directory you can find examples of workflow definitions alongside
user-defined code that is used inside these workflows. It is recommended to run these
examples at the repository's root, not only because this way the maestro
library will
be resolved, but the examples
library as well.
For instance, if you run python -m maestro examples/workflows/compute_sum_of_squares.json
you'll get the following output:
===========================================
Workflow "compute_sum_of_squares"
===========================================
Inputs:
- x: 4
- y: 3
Steps:
- square_x: SUCCESSFUL
- square_y: SUCCESSFUL
- sum_squares: SUCCESSFUL
Outputs:
- sum_of_squares: 25