Commit
1. Adding README file for the docker runner. 2. Removing old content. Adding one-sentence description and a reference to the documentation pages folder.
1 parent
3b88f96
commit 9a9928c
Showing 3 changed files with 11 additions and 121 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
# MLCommons-Box Docker Runner | ||
MLCommons-Box Docker Runner runs boxes (packaged Machine Learning (ML) workloads) in the Docker environment. Read | ||
Docker Runner documentation [here](../../docs/runners/docker-runner.md). | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,65 +1,4 @@ | ||
# Singularity Runner | ||
The Singularity runner uses the Singularity runtime to run MLBoxes. It supports two commands - `configure` and `run`: | ||
- `configure`: build the Singularity container on the machine where it will run. | ||
- `run`: run the MLBox in the Singularity container built during the configure phase. | ||
|
||
It follows the [MLBox Runners Specification v0.1](https://docs.google.com/document/d/1bL8bsAam71Ex8GI_6mQ59QlVf8RkVW-GvsbPweAd8C8) | ||
to implement the command line interface: | ||
```shell script | ||
python -m mlbox_singularity_run configure --mlbox=MLBOX_PATH --platform=PLATFORM_FILE_PATH --task=TASK_FILE_PATH | ||
``` | ||
where: | ||
- `MLBOX_PATH` is the path to the MLBox root directory. | ||
- `PLATFORM_FILE_PATH` is the path to the Singularity platform definition file. This file is usually located in the | ||
`platform` sub-directory of the MLBox. | ||
- `TASK_FILE_PATH` is the path to the task run file. This file is usually located in the `run` sub-directory of the | ||
MLBox. | ||
|
||
### Singularity Platform Definition File | ||
The schema definition file is part of the Singularity runner and is located in the source directory of the project: | ||
[mlbox-singularity.yaml](mlbox_singularity_run/mlbox-singularity.yaml). Only one parameter is defined in the schema - | ||
`image`. It is the path to the Singularity image, resolved relative to `$MLBOX_ROOT/workspace`: | ||
- By default, containers are stored in `$MLBOX_ROOT/workspace` if image is a file name. | ||
- If it is a relative path, it is relative to `$MLBOX_ROOT/workspace`. | ||
- Absolute paths (starting with `/`) are used as is. | ||
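The three resolution rules above can be sketched in a few lines of Python. This is only an illustration of the rules as described here, not the runner's actual code: the function name `resolve_image_path` and its arguments are hypothetical.

```python
from pathlib import PurePosixPath


def resolve_image_path(mlbox_root: str, image: str) -> str:
    """Resolve the `image` platform parameter to a full path (hypothetical helper)."""
    image_path = PurePosixPath(image)
    if image_path.is_absolute():
        # Absolute paths (starting with `/`) are used as is.
        return str(image_path)
    # Plain file names and relative paths are both resolved against MLBOX_ROOT/workspace.
    return str(PurePosixPath(mlbox_root) / "workspace" / image_path)


if __name__ == "__main__":
    for image in ("mnist.simg", "images/mnist.simg", "/opt/singularity/mnist.simg"):
        print(image, "->", resolve_image_path("/home/user/mnist", image))
```

Treating a bare file name as a relative path with a single component keeps the first two rules in one code path.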
|
||
One drawback of using the workspace directory is that, when the SSH runner is used, this directory is synchronized between the remote | ||
and local hosts, which transfers the Singularity image back to the user's local machine. This may not be | ||
desirable. | ||
|
||
If the path to the image does not exist, the Singularity runner will attempt to create it. | ||
|
||
### Configure | ||
Configure command line interface requires two mandatory arguments - `mlbox` and `platform`: | ||
```shell script | ||
python -m mlbox_singularity_run configure --mlbox=MLBOX_PATH --platform=PLATFORM_FILE_PATH | ||
``` | ||
|
||
### End to end example using MNIST MLBOX and GitHub source tree | ||
- Set up and activate a Python virtual environment and install the runner requirements | ||
```shell script | ||
virtualenv -p python3.8 ./env | ||
source ./env/bin/activate | ||
pip install typer mlspeclib | ||
export PYTHONPATH=$(pwd)/mlcommons_box:$(pwd)/runners/mlbox_singularity_run | ||
``` | ||
|
||
- Specify the path to the Singularity image. In the MNIST MLBox (`platform/singularity.yaml`), the path is | ||
`/opt/singularity/mlperf_mlbox_mnist-0.01.simg`. Either change it, or make sure that the `/opt/singularity` directory | ||
exists and is writable, or that users have permission to create it. | ||
|
||
- Set up the environment. This step is optional. | ||
```shell script | ||
export https_proxy=${http_proxy} | ||
``` | ||
|
||
- Configure MNIST MLBox. | ||
```shell script | ||
python -m mlbox_singularity_run configure --mlbox=examples/mnist --platform=examples/mnist/platform/singularity.yaml | ||
``` | ||
|
||
- Run two tasks - `download` (download data) and `train` | ||
```shell script | ||
python -m mlbox_singularity_run run --mlbox=examples/mnist --platform=examples/mnist/platform/singularity.yaml --task=examples/mnist/run/download.yaml | ||
python -m mlbox_singularity_run run --mlbox=examples/mnist --platform=examples/mnist/platform/singularity.yaml --task=examples/mnist/run/train.yaml | ||
``` | ||
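The configure and run steps above share the same command shape, so they can be assembled programmatically. The following is a minimal sketch that only builds the argument lists shown in this README. The helper name `build_runner_command` is hypothetical, and the sketch does not execute anything.

```python
from typing import List


def build_runner_command(command: str, mlbox: str, platform: str, task: str = "") -> List[str]:
    """Build the argument list for one runner invocation (hypothetical helper)."""
    args = ["python", "-m", "mlbox_singularity_run", command,
            f"--mlbox={mlbox}", f"--platform={platform}"]
    if task:
        # Only the run examples above pass a task file.
        args.append(f"--task={task}")
    return args


if __name__ == "__main__":
    mlbox = "examples/mnist"
    platform = f"{mlbox}/platform/singularity.yaml"
    print(" ".join(build_runner_command("configure", mlbox, platform)))
    for task_name in ("download", "train"):
        task_file = f"{mlbox}/run/{task_name}.yaml"
        print(" ".join(build_runner_command("run", mlbox, platform, task_file)))
```

The resulting lists could be passed to `subprocess.run(args, check=True)` from the repository root, assuming `PYTHONPATH` is set as in the first step.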
# MLCommons-Box Singularity Runner | ||
MLCommons-Box Singularity Runner runs boxes (packaged Machine Learning (ML) workloads) in the Singularity environment. | ||
Read Singularity Runner documentation [here](../../docs/runners/singularity-runner.md). | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,56 +1,3 @@ | ||
# MLBox SSH runner | ||
|
||
This is the port of the SSH runner from this [repository](https://github.com/sergey-serebryakov/mlbox/tree/feature/mlbox_runners_v2) | ||
|
||
__Porting Status__ | ||
- Support for new MLBox directory structure. | ||
- Still runs only docker MLBoxes. Configure phase works, the run phase does not because of different directory tree. | ||
|
||
## Pre-requisites | ||
1. Remote server available via SSH/rsync. | ||
2. Password-less login with public/private keys. Otherwise, the SSH runner will ask for a password | ||
several times (quite annoying). | ||
3. Remote server must provide python interpreter with ssh runner requirements (`mlspeclib`, `typer`). It can be either | ||
a system python or user python (virtualenv, conda etc.). Full path to the python executable must be known. | ||
4. Python version should be at least 3.5 (or maybe 3.6 - type annotations are used). | ||
|
||
## Current limitations (may not be the complete list) | ||
1. Only docker-based MLBoxes are supported. So, remote host must provide docker/nvidia-docker. | ||
2. Task files have the following path `run/{task_name}.yaml` (relative to MLBox root path). | ||
3. MLBox contains the `build` directory with Dockerfile. Images are built, not pulled. | ||
4. The whole `workspace` directory is synced with the local host after each task execution. | ||
5. The SSH runner was tested with MNIST MLBox only. | ||
|
||
> Pretty much all of these can easily be solved. In fact, the original repo does not have the first three limitations. | ||
## How to use SSH runner using MNIST example MLBox | ||
1. Copy the `ssh.yaml` platform file into the `platform` directory of an MLBox. | ||
`platform.yaml`. | ||
2. Edit `platform.yaml`. Change the following fields: | ||
- `host`: IP address of the remote host | ||
- `user`: User name to use | ||
- `env->interpreter->python`: Python interpreter to use on a remote host to run SSH runner. If it's a custom | ||
installation, specify the absolute path. | ||
- `env->variables`: Dictionary of environment variables to use. They will be used for docker build/run. In some cases, | ||
_http_proxy_ and _https_proxy_ variables need to be set. | ||
3. Configure remote host (copy mlbox runners, mlbox, build docker image) - run in the mlbox root directory: | ||
```shell script | ||
export PYTHONPATH=$(pwd)/mlcommons_box:$(pwd)/runners/mlbox_singularity_run:$(pwd)/runners/mlbox_ssh_run | ||
|
||
python -m mlbox_ssh_run configure --mlbox=examples/mnist --platform=examples/mnist/platform/ssh.yaml | ||
``` | ||
4. Run two tasks - download data and model training. After each task execution, local mlbox workspace directory | ||
will contain tasks' output artifacts (data sets, log files, models etc.): | ||
```shell script | ||
python -m mlbox_ssh_run run --mlbox=examples/mnist --platform=examples/mnist/platform/ssh.yaml --task=examples/mnist/run/download.yaml | ||
|
||
python -m mlbox_ssh_run run --mlbox=examples/mnist --platform=examples/mnist/platform/ssh.yaml --task=examples/mnist/run/train.yaml | ||
``` | ||
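Before running the configure step, it can help to check that the platform file defines the fields listed in step 2. The sketch below assumes that the `env->interpreter->python` notation denotes nested mappings and that the file has already been loaded into a dictionary. The helper name `missing_platform_fields` and the choice of required fields are assumptions for illustration, not part of the SSH runner.

```python
from typing import Any, Dict, List

# Fields named in step 2; treating them as required is an assumption of this sketch.
REQUIRED_FIELDS = [("host",), ("user",), ("env", "interpreter", "python")]


def missing_platform_fields(platform: Dict[str, Any]) -> List[str]:
    """Return the required fields that are absent from a loaded platform config."""
    missing = []
    for path in REQUIRED_FIELDS:
        node: Any = platform
        for key in path:
            if not isinstance(node, dict) or key not in node:
                # Report the field in the same `a->b->c` notation used above.
                missing.append("->".join(path))
                break
            node = node[key]
    return missing


if __name__ == "__main__":
    config = {"host": "192.0.2.10", "user": "mlbox"}  # example values only
    print(missing_platform_fields(config))
```

The check only tests for presence. It does not validate that the host is reachable or that the interpreter path exists on the remote machine.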
|
||
# MLCommons-Box SSH Runner | ||
MLCommons-Box SSH Runner runs boxes (packaged Machine Learning (ML) workloads) in the remote environment. Read | ||
SSH Runner documentation [here](../../docs/runners/ssh-runner.md). |