- Overview
- Installing and running the CLI
- ARISE in action on sample data
- Running ARISE from the UI - Work in Progress
- More on data requirements
- Known tool issues
AI Right Sizing Engine (ARISE) is a tool for predicting required resources and execution time of an AI workload, based on historical executions or performance benchmarks of similar workloads (a workload dataset). ARISE is intended to support configuration decision-making for platform engineers or data scientists operating with AI stacks.
ARISE parses and preprocesses the given workload dataset into a standard format, provides descriptive statistics,
trains predictive models, and performs predictions based on the models. See Instructions for running the CLI for
details on the commands that invoke these operations. To use these commands, in addition to the workload dataset, you
need to provide a job_spec.yaml file in your input path, indicating the metadata inputs and outputs of your data.
See this example of a job spec.
Clone the repo or download codebase zip
Install the CLI
To install the CLI in a virtual environment (the preferred installation mode, since it keeps the installation isolated and avoids version conflicts), run the following commands:
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
Windows users should run:
python3 -m venv venv
venv\Scripts\activate
pip install -r requirements.txt
To run the tests, run the following from the project root directory:
python -m unittest -v
To see the log messages for failing tests, use the --buffer flag (and use
in your test case). See tests.test_analyze.py
as an example.
python -m unittest -v --buffer
To run a single test case:
python -m unittest -v --buffer tests/test_build_models.py
There are four supported commands:
analyze-jobs provides descriptive statistics on the metadata inputs (workload measurements) and generates a number of spreadsheets and plots in a subdirectory called job-analysis. The data should be provided in a folder called data in the given input-path.
python -m arise_predictions.main analyze-jobs --input-path examples/MLCommons
It is also possible to specify the job spec file and metadata input explicitly:
python -m arise_predictions.main analyze-jobs --input-path examples/MLCommons --reread-history --job-spec-file-name job_spec.yaml --input-file inference_data_tokens.csv --custom-job-name inference-thpt
In the above example, we also specify a custom job name. This example dataset has no column capturing the job id.
If it did, we could provide it in the --job-id-column argument. With --custom-job-name, we instruct the code to
insert such a column with the given job name as values.
This tends to improve the output of the descriptive job analysis (e.g., labels in plots).
auto-build-models performs a hyperparameter search over the models and parameter space specified in a configuration file (cf. config/default-auto-model-search-config.yaml) and finds the best model and its hyperparameter settings for each target variable in the data. It attempts to build one best model per target variable in the metadata outputs based on the metadata inputs.
python -m arise_predictions.main auto-build-models --input-path examples/MLCommons --reread-history
The output models, their relative ranking, and the cross validation results are all stored in a folder named
ARISE-auto-models, which is created in the given input path. If you run with the flag --single-output-file, the
models and results will be archived into a single output file, ARISE-auto-models.zip, in the given input path.
If you do not specify --config-file, the default configuration file is used. There is also a config file that
defines a much smaller parameter search space and hence completes in a shorter
time. You can make use of it like this:
python -m arise_predictions.main auto-build-models --input-path examples/MLCommons --reread-history --config-file config/small-auto-model-search-config.yaml
If you are running on your local machine, it is advisable to limit the number of processors used, although this will result in a much longer run. To build models using only 2 processors, use this command:
python -m arise_predictions.main --num-jobs 2 auto-build-models --input-path examples/MLCommons --reread-history --config-file config/small-auto-model-search-config.yaml
By default, auto-build-models performs 10-fold cross validation. To perform leave-one-group-out (logo)
cross validation instead, add the flag --leave-one-out-cv, which takes as an argument a list of one or more
feature names to group by, separated by commas.
For example, the following command will build models using logo cross validation on values of LLM name. That is, in each iteration it will use a specific LLM as the test set, and all other data as the training set.
python -m arise_predictions.main auto-build-models --input-path examples/MLCommons --leave-one-out-cv "Model MLC"
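To make the grouping concrete, here is a minimal, library-free sketch of how leave-one-group-out splits are formed on a single grouping feature. It is an illustration only, not ARISE's implementation: the helper name leave_one_group_out and the sample rows are made up, with "Model MLC" standing in for the grouping feature.

```python
from collections import defaultdict


def leave_one_group_out(rows, group_key):
    """Yield (held_out_group, train_rows, test_rows), one split per distinct group value."""
    by_group = defaultdict(list)
    for row in rows:
        by_group[row[group_key]].append(row)
    for group in sorted(by_group):
        test_rows = by_group[group]
        train_rows = [row for row in rows if row[group_key] != group]
        yield group, train_rows, test_rows


# Hypothetical workload rows; the values are illustrative only.
rows = [
    {"Model MLC": "llm-a", "# of Accelerators": 1},
    {"Model MLC": "llm-a", "# of Accelerators": 2},
    {"Model MLC": "llm-b", "# of Accelerators": 1},
    {"Model MLC": "llm-c", "# of Accelerators": 4},
]

splits = list(leave_one_group_out(rows, "Model MLC"))
for group, train_rows, test_rows in splits:
    print(f"held out {group}: {len(train_rows)} training rows, {len(test_rows)} test rows")
```

Each distinct value of the grouping feature is held out exactly once, so the number of iterations equals the number of distinct groups rather than a fixed fold count.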
In addition to the above, we can also let auto-build-models
search for models that are tuned for
extrapolation. That is, you can let ARISE build a model that performs
relatively well when asked to predict on inputs that are outside the range of
values seen for this feature during training. This is an experimental feature
whose performance we expect to improve with time. Currently, only a single
extrapolated feature is supported and the training data needs a number of
different data points or levels for this feature to have an opportunity to learn
to extrapolate.
You specify the name of the input feature on which to extrapolate (--feature-column)
as well as a low and/or high threshold (--low-threshold, --high-threshold),
which define the extrapolation region to train on. The
thresholds should be chosen from within the range of values that exist in the
training data, so that ARISE can define regions used for training and for
testing the extrapolation performance of the resulting models. For example:
python -m arise_predictions.main auto-build-models --input-path examples/MLCommons --reread-history --feature-column "# of Accelerators" --high-threshold 8
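The following sketch shows one plausible reading of how thresholds partition the data. It assumes that rows whose feature value lies above the high threshold (or below the low threshold) form the extrapolation region, and that boundary values stay in range; ARISE's actual rule may differ. The function name split_for_extrapolation and the sample values are hypothetical.

```python
def split_for_extrapolation(rows, feature, low=None, high=None):
    """Split rows into (in_range, extrapolation) parts.

    Assumption: values strictly below `low` or strictly above `high`
    belong to the extrapolation region; everything else is in range.
    """
    in_range, extrapolation = [], []
    for row in rows:
        value = row[feature]
        outside = (low is not None and value < low) or (high is not None and value > high)
        (extrapolation if outside else in_range).append(row)
    return in_range, extrapolation


# Hypothetical data: the threshold 8 lies within the observed range 1..32.
rows = [{"# of Accelerators": n} for n in (1, 2, 4, 8, 16, 32)]
in_range, extrapolation = split_for_extrapolation(rows, "# of Accelerators", high=8)

print([r["# of Accelerators"] for r in in_range])       # [1, 2, 4, 8]
print([r["# of Accelerators"] for r in extrapolation])  # [16, 32]
```

A threshold outside the observed range would leave one of the two parts empty, which is why the thresholds need to fall within the values present in the training data.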
predict generates estimated values for metadata outputs given metadata input values. It should be run after the auto-build-models command and uses its output. The --model-path flag points to where the models created by auto-build-models are located. The job_spec.yaml file should be under the --input-path. predict requires you to specify a model name and an input space configuration. It generates the space of input features according to the configuration and uses models previously built with auto-build-models to run predictions on this input space for the target variables indicated in the same configuration file.
An example configuration file: example-demo-mlcommons-config.yaml.
In addition to feature values, the config file requires specifying target variables for prediction. For each target
variable, the boolean parameter greater_is_better, indicating whether higher values of the target are preferable,
should be specified. estimator_file is an optional parameter providing the name of the model file to use for
predictions of this target variable. If it is not provided, ARISE automatically uses the top-ranked model file
according to the auto-build-models results, which should be located in the provided model path, next to the
persisted model files.
python -m arise_predictions.main predict --input-path examples/MLCommons --config-file config/example-demo-mlcommons-config.yaml --model-path examples/MLCommons/ARISE-auto-models
The input space defined by the configuration file and ARISE predictions for each input combination in this space are
stored in a folder named ARISE-predictions
which is created in the given input path.
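One way to picture the input space is as the Cartesian product of the candidate values listed for each feature. The sketch below assumes exactly that; the source does not spell out how ARISE expands the configuration. The dictionary feature_values and the helper build_input_space are hypothetical and do not reflect the actual config file schema.

```python
from itertools import product

# Hypothetical candidate values per input feature.
feature_values = {
    "Model MLC": ["llm-a", "llm-b"],
    "# of Accelerators": [1, 2, 4],
}


def build_input_space(values_by_feature):
    """Return one dict per combination of feature values (Cartesian product)."""
    names = list(values_by_feature)
    combos = product(*(values_by_feature[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


input_space = build_input_space(feature_values)
print(len(input_space))  # 6
print(input_space[0])
```

Under this assumption the number of predictions grows multiplicatively with the number of values per feature, so long value lists produce large input spaces.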
demo-predict is a version of predict that facilitates demos by ranking predictions and comparing predictions with ground truth where available. The input path should point to historical or benchmark input data so that demo-predict can compare predictions with available ground truth (as far as is possible). The script needs the path to the directory containing the serialized models built by auto-build-models. Other parameters are taken from the configuration file.
python -m arise_predictions.main demo-predict --input-path examples/MLCommons --config-file config/example-demo-mlcommons-config.yaml --model-path examples/MLCommons/ARISE-auto-models
In addition to the outputs described for the predict command, demo-predict will also create a file containing
the predicted versus ground truth values and the resulting mean absolute percentage error (MAPE),
for any input combination in the defined input space that also appears in the given ground truth data.
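For reference, MAPE is the mean of the absolute errors taken relative to the ground truth values. The sketch below uses the standard definition and reports it as a percentage; whether ARISE reports a fraction or a percentage is not stated here, and the sample numbers are made up.

```python
def mape(predicted, actual):
    """Mean absolute percentage error, in percent.

    Undefined when any ground truth value is zero (division by zero).
    """
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and of equal length")
    relative_errors = (abs(p - a) / abs(a) for p, a in zip(predicted, actual))
    return 100.0 * sum(relative_errors) / len(actual)


# Hypothetical predictions and matching ground truth values.
predicted = [110.0, 90.0, 200.0]
actual = [100.0, 100.0, 250.0]

print(round(mape(predicted, actual), 2))  # 13.33
```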
To use all the above commands, you need to provide a job_spec.yaml file in your input path, indicating the
metadata inputs and outputs of your data. See this example of a job spec.
The default log level is DEBUG. You can change it by specifying a different log
level, as in the following example:
python -m arise_predictions.main --loglevel info analyze-jobs
To run ARISE from the UI, see the documentation here. Note that the UI is still a work in progress and is missing many features that are available from the CLI.
To see ARISE in action on a sample dataset, go here.
The data consists of historical workload executions and/or performance benchmarks. Examples of potential properties of workloads that can be considered:
- Input data size and data complexity-related properties
- Hyper-parameters
- Workload task
- GPU configuration
- Total execution time
- Throughput and latency
- Consumed resources: number of workers, CPU, GPU, and memory per worker
- Job status (success, fail/abort, etc.)
Example datasets can be found here and here.
The data is divided into job-metadata-inputs: the properties of the workload that are known before it starts running
(e.g., input data size, hyper-parameters, workload task, and GPU configuration), and job-metadata-outputs:
properties of the workload execution and output that are known only once the workload completes (e.g., execution
time, throughput and latency, consumed resources, and job status). The inputs and outputs specification is provided
in the job_spec.yaml file. See this example of a job spec.
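The split between inputs and outputs can be sketched as follows. The dictionary below only mirrors the two groups named above; the real job_spec.yaml layout may differ, so refer to the linked example for the actual schema. The column names, the CSV content, and the helper split_inputs_outputs are all hypothetical.

```python
import csv
import io

# Hypothetical stand-in for a job spec: which columns are inputs, which are outputs.
job_spec = {
    "job-metadata-inputs": ["model", "accelerators"],
    "job-metadata-outputs": ["throughput"],
}

# Hypothetical historical workload data in CSV form.
history_csv = """model,accelerators,throughput
llm-a,1,120.5
llm-a,2,230.0
"""


def split_inputs_outputs(csv_text, spec):
    """Return (inputs, outputs): one dict per row, restricted to each column group."""
    inputs, outputs = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        inputs.append({col: row[col] for col in spec["job-metadata-inputs"]})
        outputs.append({col: row[col] for col in spec["job-metadata-outputs"]})
    return inputs, outputs


inputs, outputs = split_inputs_outputs(history_csv, job_spec)
print(inputs[0])
print(outputs[0])
```

The input columns play the role of model features and the output columns the role of prediction targets.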
If the format of your data requires special parsing to transform it into a dataframe (i.e., beyond a simple csv file),
you can implement your own parser in this class. For example, the sentiment analysis example (here) uses
SAJsonJobParser as its parser, since its original data consists of a json file per workload execution. The name of
your parser should be provided in an optional field in job_spec.yaml; see here.
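As a rough illustration of the kind of work such a parser does, the sketch below collects one JSON file per workload execution into a list of flat records that could then be turned into a dataframe. It is not the interface of SAJsonJobParser or of ARISE's parser class; the function load_json_executions, the file names, and the fields are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def load_json_executions(directory):
    """Read every *.json file in `directory` into one record per workload execution."""
    records = []
    for path in sorted(Path(directory).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            record = json.load(handle)
        record["source_file"] = path.name  # keep track of where each record came from
        records.append(record)
    return records


# Hypothetical data: two executions, each stored in its own JSON file.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "run-1.json").write_text(
        json.dumps({"batch_size": 8, "duration_s": 41.2}), encoding="utf-8")
    Path(tmp, "run-2.json").write_text(
        json.dumps({"batch_size": 16, "duration_s": 63.9}), encoding="utf-8")
    records = load_json_executions(tmp)

print(records)
```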
- Currently, the tool uses exhaustive grid search for hyperparameter optimization (HPO). This may result in long run times for large datasets. We plan to move to a sample-based HPO that will scale the model search phase.
- Extrapolation is still a work in progress, so we currently expect large errors when predicting outputs for input values that are far beyond the range covered by the training dataset.
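To see why exhaustive grid search becomes slow, the sketch below enumerates every combination in a small search space and counts the resulting model fits under 10-fold cross validation. The parameter names and values are hypothetical and are not taken from ARISE's configuration files.

```python
from itertools import product

# Hypothetical search space for a single estimator.
search_space = {
    "n_estimators": [50, 100, 200],
    "max_depth": [3, 5, 8, None],
    "learning_rate": [0.01, 0.1],
}


def grid_candidates(space):
    """Yield every combination of parameter values (exhaustive grid)."""
    names = list(space)
    for combo in product(*(space[name] for name in names)):
        yield dict(zip(names, combo))


candidates = list(grid_candidates(search_space))
folds = 10  # the default number of cross validation folds

print(len(candidates))          # 24
print(len(candidates) * folds)  # 240 fits for one estimator and one target variable
```

The count multiplies with every added parameter value, estimator, and target variable, which is the cost a sample-based search would reduce by evaluating only a subset of the combinations.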