
Merge from CTuning #1155

Merged 35 commits on Mar 5, 2024

Commits (35)
db011fe
Cleanups
arjunsuresh Mar 4, 2024
c705fc2
force redownload for large models
arjunsuresh Mar 4, 2024
8d22aa3
Use rclone sync by default
arjunsuresh Mar 4, 2024
905edc1
Added install numactl from src
arjunsuresh Mar 4, 2024
29b3c95
Merge branch 'mlcommons:master' into master
arjunsuresh Mar 4, 2024
61c23e2
Cleanup for gh install on rhel
arjunsuresh Mar 4, 2024
9512c33
Fix bug in docker build-arg
arjunsuresh Mar 5, 2024
42accf5
Add tabulate deps for docker (needed in case of fake-deps)
arjunsuresh Mar 5, 2024
ad9bce8
updated requirements
gfursin Mar 5, 2024
d671397
added "warnings" to the main CM script meta (for example to warn abou…
gfursin Mar 5, 2024
77300f9
added sudo warnings
gfursin Mar 5, 2024
9df1710
Make redownload the default if there is no checksum success
arjunsuresh Mar 5, 2024
5ab15a4
Make redownload the default if there is no checksum success
arjunsuresh Mar 5, 2024
d27e286
Make redownload the default if there is no checksum success
arjunsuresh Mar 5, 2024
e282102
added onnx image classification test
gfursin Mar 5, 2024
bdb5f37
Merge branch 'master' of https://github.com/ctuning/mlcommons-ck
gfursin Mar 5, 2024
10c3582
Make redownload the default if there is no checksum success
arjunsuresh Mar 5, 2024
90a3159
Put mlperf sut configs and descriptions in CM cache
arjunsuresh Mar 5, 2024
44d3387
Put mlperf sut configs and descriptions in CM cache
arjunsuresh Mar 5, 2024
6e26e12
added extra Python 3.12 tests
gfursin Mar 5, 2024
ef5393f
Merge branch 'master' of https://github.com/ctuning/mlcommons-ck
gfursin Mar 5, 2024
62ff146
Added MLPerf inference MIL C++ test
gfursin Mar 5, 2024
873311a
testing 3.11 instead of 3.12
gfursin Mar 5, 2024
f939247
fixed prebuilt llvm tags
gfursin Mar 5, 2024
cb3d2b5
test standalone Python loadgen with ONNX and BERT
gfursin Mar 5, 2024
c7063ef
Fixes for mlperf inference reference bert pytorch
arjunsuresh Mar 5, 2024
faf7a0e
forgot --quiet in loadgen test
gfursin Mar 5, 2024
1b6da8e
Merge branch 'master' of https://github.com/ctuning/mlcommons-ck
gfursin Mar 5, 2024
3111edb
adding new tests
gfursin Mar 5, 2024
7a02a14
Fix rnnt test
arjunsuresh Mar 5, 2024
d56f811
added missing dummy test
gfursin Mar 5, 2024
c68c726
Merge branch 'master' of https://github.com/ctuning/mlcommons-ck
gfursin Mar 5, 2024
5e73016
clean up
gfursin Mar 5, 2024
78a599e
print MLPerf reference source path
gfursin Mar 5, 2024
040029c
Merge branch 'master' of https://github.com/ctuning/mlcommons-ck
gfursin Mar 5, 2024
2 changes: 1 addition & 1 deletion .github/workflows/test-cm-script-features.yml
@@ -18,7 +18,7 @@ jobs:
strategy:
fail-fast: false
matrix:
- python-version: ["3.10", "3.9", "3.8"]
+ python-version: ["3.12", "3.11", "3.10", "3.9", "3.8"]

steps:
- uses: actions/checkout@v3
2 changes: 1 addition & 1 deletion .github/workflows/test-cm-scripts.yml
@@ -18,7 +18,7 @@ jobs:
strategy:
fail-fast: false
matrix:
- python-version: ["3.9"]
+ python-version: ["3.12", "3.9"]

steps:
- uses: actions/checkout@v3
36 changes: 36 additions & 0 deletions .github/workflows/test-image-classification-onnx.yml
@@ -0,0 +1,36 @@
# This workflow will install Python dependencies, run tests and lint with a variety of Python versions
# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions

name: image classification with ONNX

on:
pull_request:
branches: [ "master", "dev" ]
paths:
- '.github/workflows/test-image-classification-onnx.yml'
- 'cm-mlops/**'
- '!cm-mlops/**.md'

jobs:
build:

runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
python-version: [ "3.12", "3.9"]

steps:
- uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v3
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
python3 -m pip install cmind
cm pull repo --url=${{ github.event.pull_request.head.repo.html_url }} --checkout=${{ github.event.pull_request.head.ref }}
cm run script --quiet --tags=get,sys-utils-cm
- name: Test image classification with ONNX
run: |
cmr "python app image-classification onnx" --quiet
@@ -1,13 +1,13 @@
# This workflow will install Python dependencies, run tests and lint with a variety of Python versions
# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions

-name: MLPerf inference bert
+name: MLPerf inference bert (deepsparse, tf, onnxruntime, pytorch)

on:
pull_request:
branches: [ "master", "dev" ]
paths:
-  - '.github/workflows/test-mlperf-inference-bert.yml'
+  - '.github/workflows/test-mlperf-inference-bert-deepsparse-tf-onnxruntime-pytorch.yml'
- 'cm-mlops/**'
- '!cm-mlops/**.md'

@@ -18,7 +18,8 @@ jobs:
strategy:
fail-fast: false
matrix:
- python-version: [ "3.9" ]
+ # 3.12 didn't work on 20240305 - need to check
+ python-version: [ "3.11", "3.9" ]
backend: [ "deepsparse", "tf", "onnxruntime", "pytorch" ]
precision: [ "int8", "fp32" ]
exclude:
@@ -38,6 +39,6 @@ jobs:
python3 -m pip install cmind
cm pull repo --url=${{ github.event.pull_request.head.repo.html_url }} --checkout=${{ github.event.pull_request.head.ref }}
cm run script --quiet --tags=get,sys-utils-cm
-  - name: Test MLPerf Inference Bert
+  - name: Test MLPerf Inference Bert (DeepSparse, TF, ONNX, PyTorch)
run: |
cm run script --tags=run,mlperf,inference,generate-run-cmds,_submission,_short --submitter="cTuning" --model=bert-99 --backend=${{ matrix.backend }} --device=cpu --scenario=Offline --test_query_count=5 --precision=${{ matrix.precision }} --target_qps=1 -v --quiet
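The bert workflow above fans out over a python-version × backend × precision matrix with an `exclude:` list. The exclude entries are collapsed in this view, so the set below is an assumption for illustration only; the sketch just shows how GitHub Actions-style matrix expansion with excludes works.

```python
from itertools import product

# Rough sketch of how the bert workflow's matrix expands into jobs.
# The workflow's real exclude entries are not visible in this diff, so
# 'excluded' here is a hypothetical placeholder.
python_versions = ["3.11", "3.9"]
backends = ["deepsparse", "tf", "onnxruntime", "pytorch"]
precisions = ["int8", "fp32"]
excluded = {("tf", "int8"), ("onnxruntime", "int8"), ("pytorch", "int8")}  # assumed

jobs = [(py, b, p)
        for py, b, p in product(python_versions, backends, precisions)
        if (b, p) not in excluded]
print(len(jobs))  # 10 with the assumed excludes (16 combinations minus 6)
```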
2 changes: 1 addition & 1 deletion .github/workflows/test-mlperf-inference-gptj.yml
@@ -18,7 +18,7 @@ jobs:
strategy:
fail-fast: false
matrix:
- python-version: [ "3.9" ]
+ python-version: [ "3.12", "3.9" ]
backend: [ "pytorch" ]
precision: [ "bfloat16" ]

38 changes: 38 additions & 0 deletions .github/workflows/test-mlperf-inference-mil-cpp-resnet50.yml
@@ -0,0 +1,38 @@
# This workflow will install Python dependencies, run tests and lint with a variety of Python versions
# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions

name: MLPerf inference MIL C++ ResNet50

on:
pull_request:
branches: [ "master", "dev" ]
paths:
- '.github/workflows/test-mlperf-inference-mil-cpp-resnet50.yml'
- 'cm-mlops/**'
- '!cm-mlops/**.md'

jobs:
build:

runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
python-version: [ "3.12", "3.9" ]
llvm-version: [ "15.0.6", "16.0.4", "17.0.6" ]

steps:
- uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v3
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
python3 -m pip install cmind
cm pull repo --url=${{ github.event.pull_request.head.repo.html_url }} --checkout=${{ github.event.pull_request.head.ref }}
cm run script --quiet --tags=get,sys-utils-cm
cm run script --quiet --tags=install,prebuilt,llvm --version=${{ matrix.llvm-version }}
- name: Test MLPerf Inference MIL C++ ResNet50
run: |
cmr "app mlperf inference cpp" -v --quiet
2 changes: 1 addition & 1 deletion .github/workflows/test-mlperf-inference-resnet50.yml
@@ -20,7 +20,7 @@ jobs:
strategy:
fail-fast: false
matrix:
- python-version: [ "3.8" ]
+ python-version: [ "3.12", "3.9" ]
backend: [ "onnxruntime", "tf" ]
implementation: [ "python", "cpp" ]
exclude:
2 changes: 1 addition & 1 deletion .github/workflows/test-mlperf-inference-retinanet.yml
@@ -18,7 +18,7 @@ jobs:
strategy:
fail-fast: false
matrix:
- python-version: [ "3.9" ]
+ python-version: [ "3.12", "3.9" ]
backend: [ "onnxruntime", "pytorch" ]
implementation: [ "python", "cpp" ]
exclude:
4 changes: 2 additions & 2 deletions .github/workflows/test-mlperf-inference-rnnt.yml
@@ -18,7 +18,7 @@ jobs:
strategy:
fail-fast: false
matrix:
- python-version: [ "3.9" ]
+ python-version: [ "3.12", "3.9" ]
backend: [ "pytorch" ]
precision: [ "fp32" ]

@@ -35,4 +35,4 @@ jobs:
cm run script --quiet --tags=get,sys-utils-cm
- name: Test MLPerf Inference RNNT
run: |
- cm run script --tags=run,mlperf,inference,generate-run-cmds,_performance-only --submitter="cTuning" --model=rnnt --backend=${{ matrix.backend }} --device=cpu --scenario=Offline --test_query_count=5 --precision=${{ matrix.precision }} --target_qps=5 --adr.ml-engine-pytorch.version=1.13.0 --adr.ml-engine-torchvision.version=0.14.1 --adr.librosa.version_max=0.9.1 -v --quiet
+ cm run script --tags=run,mlperf,inference,generate-run-cmds,_performance-only --submitter="cTuning" --model=rnnt --backend=${{ matrix.backend }} --device=cpu --scenario=Offline --test_query_count=5 --precision=${{ matrix.precision }} --target_qps=5 -v --quiet
2 changes: 1 addition & 1 deletion .github/workflows/test-mlperf-inference-tvm.yml
@@ -19,7 +19,7 @@ jobs:
strategy:
fail-fast: false
matrix:
- python-version: [ "3.10" ]
+ python-version: [ "3.12", "3.10" ]
backend: [ "tvm-onnx" ]

steps:
@@ -0,0 +1,36 @@
# This workflow will install Python dependencies, run tests and lint with a variety of Python versions
# For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions

name: MLPerf loadgen with HuggingFace bert onnx fp32 squad model

on:
pull_request:
branches: [ "master", "dev" ]
paths:
- '.github/workflows/test-mlperf-loadgen-onnx-huggingface-bert-fp32-squad.yml'
- 'cm-mlops/**'
- '!cm-mlops/**.md'

jobs:
build:

runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
python-version: [ "3.12", "3.9" ]

steps:
- uses: actions/checkout@v3
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v3
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
python3 -m pip install cmind
cm pull repo --url=${{ github.event.pull_request.head.repo.html_url }} --checkout=${{ github.event.pull_request.head.ref }}
cm run script --quiet --tags=get,sys-utils-cm
- name: Test MLPerf loadgen with HuggingFace bert onnx fp32 squad model
run: |
cmr "python app loadgen-generic _onnxruntime _custom _huggingface _model-stub.ctuning/mlperf-inference-bert-onnx-fp32-squad-v1.1" --adr.hf-downloader.model_filename=model.onnx --quiet
27 changes: 19 additions & 8 deletions cm-mlops/automation/script/module.py
@@ -788,7 +788,9 @@ def _run(self, i):
if r['return'] > 0:
return r

- warnings = r.get('warnings', [])
+ warnings = meta.get('warnings', [])
+ if len(r.get('warnings', [])) >0:
+     warnings += r['warnings']

variation_tags_string = r['variation_tags_string']
explicit_variation_tags = r['explicit_variation_tags']
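The hunk above changes where warnings originate: they now start from the script meta (the "warnings" key added to the main CM script meta in commit d671397, e.g. sudo warnings) and are extended with any warnings produced while processing variations. A minimal sketch with hypothetical values:

```python
# Hypothetical meta and variation-processing result, mirroring the merge
# logic in the diff above.
meta = {'warnings': ['this script may require sudo']}
r = {'warnings': ['variation _foo is deprecated']}  # assumed example warning

warnings = meta.get('warnings', [])
if len(r.get('warnings', [])) > 0:
    warnings += r['warnings']

print(len(warnings))  # 2
```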
@@ -867,7 +869,8 @@ def _run(self, i):
if r['return']>0: return r


- update_env_with_values(env)
+ r = update_env_with_values(env)
+ if r['return']>0: return r



@@ -1006,7 +1009,8 @@ def _run(self, i):
if verbose:
print (recursion_spaces + ' - Processing env after dependencies ...')

- update_env_with_values(env)
+ r = update_env_with_values(env)
+ if r['return']>0: return r


# Check chain of prehook dependencies on other CM scripts. (No execution of customize.py for cached scripts)
@@ -1251,7 +1255,8 @@ def _run(self, i):
if verbose:
print (recursion_spaces + ' - Processing env after docker run dependencies ...')

- update_env_with_values(env)
+ r = update_env_with_values(env)
+ if r['return']>0: return r

# Check chain of dependencies on other CM scripts
if len(deps)>0:
@@ -1266,7 +1271,8 @@ def _run(self, i):
if verbose:
print (recursion_spaces + ' - Processing env after dependencies ...')

- update_env_with_values(env)
+ r = update_env_with_values(env)
+ if r['return']>0: return r

# Clean some output files
clean_tmp_files(clean_files, recursion_spaces)
@@ -2822,7 +2828,8 @@ def _run_deps(self, deps, clean_env_keys_deps, env, state, const, const_state, a

utils.merge_dicts({'dict1':ii, 'dict2':d, 'append_lists':True, 'append_unique':True})

- update_env_with_values(ii['env']) #to update env local to a dependency
+ r = update_env_with_values(ii['env']) #to update env local to a dependency
+ if r['return']>0: return r

r = self.cmind.access(ii)
if r['return']>0: return r
@@ -2834,7 +2841,8 @@ def _run_deps(self, deps, clean_env_keys_deps, env, state, const, const_state, a

# Restore local env
env.update(tmp_env)
- update_env_with_values(env)
+ r = update_env_with_values(env)
+ if r['return']>0: return r

return {'return': 0}

@@ -3988,6 +3996,9 @@ def update_env_with_values(env, fail_on_not_found=False):
"""
import re
for key in env:
+ if key.startswith("+") and type(env[key]) != list:
+     return {'return': 1, 'error': 'List value expected for {} in env'.format(key)}

value = env[key]

# Check cases such as --env.CM_SKIP_COMPILE
@@ -4017,7 +4028,7 @@

env[key] = value

- return
+ return {'return': 0}


##############################################################################
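The recurring pattern in the module.py diff converts `update_env_with_values` from a bare-`return` helper into one that reports errors through CM's `{'return': code}` convention, which every call site now propagates. A minimal self-contained sketch of that convention (simplified: the real function also normalizes and substitutes values, which is omitted here):

```python
# Simplified sketch of the new error-dict contract: keys starting with '+'
# must hold lists in CM env/meta, otherwise an error dict is returned.
def update_env_with_values(env):
    for key in env:
        if key.startswith("+") and not isinstance(env[key], list):
            return {'return': 1,
                    'error': 'List value expected for {} in env'.format(key)}
    return {'return': 0}

# Callers check the return code and propagate failures upward.
r = update_env_with_values({'+PATH': '/usr/bin'})  # hypothetical env
print(r['return'])  # 1
```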
5 changes: 3 additions & 2 deletions cm-mlops/requirements.txt
@@ -1,4 +1,5 @@
-cmind >= 1.4.0
+cmind>=2.0.1
pyyaml
requests
setuptools
+giturlparse
4 changes: 4 additions & 0 deletions cm-mlops/script/app-mlperf-inference-reference/_cm.yaml
@@ -111,6 +111,7 @@ deps:
- onnxruntime
- tf
- tflite
+ - pytorch

# Detect TensorRT if required
- tags: get,nvidia,tensorrt
@@ -835,6 +836,9 @@ variations:
names:
- ml-engine-pytorch
- pytorch
+ skip_if_env:
+   CM_MLPERF_DEVICE:
+   - gpu
add_deps_recursive:
inference-src:
tags: _deeplearningexamples
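The `skip_if_env` block added above tells CM to skip the pytorch dependency when `CM_MLPERF_DEVICE` is `gpu` (so a GPU-specific torch install can be used instead). A simplified sketch of that matching rule, not the real CM implementation:

```python
# Toy version of skip_if_env matching: a dependency is skipped when the
# current env value for a key appears in the dependency's listed values.
def should_skip(dep, env):
    for key, values in dep.get('skip_if_env', {}).items():
        if env.get(key) in values:
            return True
    return False

dep = {'names': ['ml-engine-pytorch', 'pytorch'],
       'skip_if_env': {'CM_MLPERF_DEVICE': ['gpu']}}

print(should_skip(dep, {'CM_MLPERF_DEVICE': 'gpu'}))  # True
print(should_skip(dep, {'CM_MLPERF_DEVICE': 'cpu'}))  # False
```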
2 changes: 1 addition & 1 deletion cm-mlops/script/build-docker-image/customize.py
@@ -17,7 +17,7 @@ def preprocess(i):
CM_DOCKER_BUILD_ARGS.append( "CM_GH_TOKEN="+env['CM_GH_TOKEN'] )

if CM_DOCKER_BUILD_ARGS:
- build_args = "--build-arg "+ " --build-arg".join(CM_DOCKER_BUILD_ARGS)
+ build_args = "--build-arg "+ " --build-arg ".join(CM_DOCKER_BUILD_ARGS)
else:
build_args = ""

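The one-character fix above ("Fix bug in docker build-arg", commit 9512c33) is easy to miss: without a trailing space in the join separator, every argument after the first gets glued to `--build-arg`. A quick sketch with hypothetical argument values:

```python
# Demonstrates the separator bug: " --build-arg".join(...) vs
# " --build-arg ".join(...). Argument values are made up.
args = ["CM_GH_TOKEN=abc", "FOO=bar"]

buggy = "--build-arg " + " --build-arg".join(args)
fixed = "--build-arg " + " --build-arg ".join(args)

print(buggy)  # --build-arg CM_GH_TOKEN=abc --build-argFOO=bar
print(fixed)  # --build-arg CM_GH_TOKEN=abc --build-arg FOO=bar
```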
2 changes: 1 addition & 1 deletion cm-mlops/script/build-dockerfile/dockerinfo.json
@@ -1,6 +1,6 @@
{
"python-packages": [
- "cmind", "requests", "giturlparse"
+ "cmind", "requests", "giturlparse", "tabulate"
],
"ARGS": [
"CM_GH_TOKEN"
2 changes: 1 addition & 1 deletion cm-mlops/script/download-file/customize.py
@@ -112,7 +112,7 @@ def preprocess(i):
elif tool == "rclone":
if env.get('CM_RCLONE_CONFIG_CMD', '') != '':
env['CM_DOWNLOAD_CONFIG_CMD'] = env['CM_RCLONE_CONFIG_CMD']
- env['CM_DOWNLOAD_CMD'] = f"rclone copy {url} {os.path.join(os.getcwd(), env['CM_DOWNLOAD_FILENAME'])} -P"
+ env['CM_DOWNLOAD_CMD'] = f"rclone sync {url} {os.path.join(os.getcwd(), env['CM_DOWNLOAD_FILENAME'])} -P"

filename = env['CM_DOWNLOAD_FILENAME']
env['CM_DOWNLOAD_DOWNLOADED_FILENAME'] = filename
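Switching from `rclone copy` to `rclone sync` ("Use rclone sync by default", commit 8d22aa3) makes the destination an exact mirror of the source, which fits the redownload-on-checksum-failure commits in this PR. A sketch of how the command string is built, with hypothetical remote and filename values:

```python
import os

# Mirrors the command construction in download-file/customize.py above.
# The remote path and filename are assumptions for illustration.
url = "mlc-inference:some-bucket/model.onnx"
filename = "model.onnx"

cmd = f"rclone sync {url} {os.path.join(os.getcwd(), filename)} -P"
print(cmd)
```

`sync` makes the destination match the source exactly, deleting anything at the destination that is not present upstream, whereas `copy` only adds files; both skip files that already match by size and modtime.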