The MLflow Batch Scoring Pipeline is an MLflow Pipeline for applying a registered MLflow model to a specified dataset.
For more information about the MLflow Batch Scoring Pipeline, check out the documentation at https://mlflow.org/docs/latest/pipelines.html#batch-scoring-pipeline. For more information about MLflow Pipelines, see https://mlflow.org/docs/latest/pipelines.html.
- Install MLflow Pipelines:
pip install mlflow[pipelines]
- Clone the MLflow Batch Scoring Pipeline template repository locally:
git clone https://github.com/mlflow/mlp-batch-scoring-template.git
- Enter the root directory of the MLflow Batch Scoring Pipeline template:
cd mlp-batch-scoring-template
- Install MLflow Batch Scoring Pipeline dependencies:
pip install -r requirements.txt
Sync this repository with Databricks Repos and run the notebooks/databricks notebook on a Databricks Cluster running version 11.0 or greater of the Databricks Runtime or the Databricks Runtime for Machine Learning with workspace files support enabled.
Note: When making changes to pipelines on Databricks, we recommend one of two workflows. Either edit files on your local machine and use dbx to sync them to Databricks Repos, or edit files directly in Databricks Repos, opening a separate browser tab for each YAML file or Python code module that you wish to modify.
For the latter approach, we recommend opening at least 3 browser tabs to facilitate easier development:
- One tab for modifying configurations in pipeline.yaml and/or profiles/{profile}.yaml
- One tab for modifying step function(s) defined in steps/{step}.py
- One tab for modifying and running the driver notebook (notebooks/databricks)
- Launch the Jupyter Notebook environment via the jupyter notebook command.
- Open and run the notebooks/jupyter.ipynb notebook in the Jupyter environment.
First, enter the template root directory via cd mlp-batch-scoring-template. Then, try running the following MLflow CLI commands to get started. Note that the --step argument is optional; pipeline commands that are run without a --step act on the entire pipeline.
export MLFLOW_PIPELINES_PROFILE=local
mlflow pipelines --help
mlflow pipelines inspect --step step_name
mlflow pipelines run --step step_name
mlflow pipelines clean --step step_name
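
The optional --step behavior described above can be sketched as a small shell helper. This is only an illustration: mlp and MLP_DRY_RUN are hypothetical names introduced here, not part of MLflow, and step_name remains a placeholder for a real step in your pipeline. With MLP_DRY_RUN=1 the helper prints the command it would run instead of executing it, so the sketch can be tried without MLflow installed.

```shell
# Hypothetical helper (not part of MLflow): builds an `mlflow pipelines`
# command, adding --step only when a step name is supplied.
mlp() {
  subcommand="$1"   # inspect, run, or clean
  step="${2:-}"     # optional step name; empty means the entire pipeline
  set -- mlflow pipelines "$subcommand"
  if [ -n "$step" ]; then
    set -- "$@" --step "$step"
  fi
  if [ "${MLP_DRY_RUN:-0}" = "1" ]; then
    echo "$*"       # dry run: print the command only
  else
    "$@"            # execute the real MLflow CLI
  fi
}

export MLFLOW_PIPELINES_PROFILE=local
MLP_DRY_RUN=1                 # assumption: preview commands instead of running them
mlp run                       # prints: mlflow pipelines run
mlp run step_name             # prints: mlflow pipelines run --step step_name
```

Unset MLP_DRY_RUN (or set it to 0) to have the helper invoke the MLflow CLI for real.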