This repository has been archived by the owner on Oct 11, 2024. It is now read-only.
andy-neuma triggered nightly on refs/heads/main #61
nightly.yml
on: schedule
AWS-AVX2-32G-A10G-24G-Benchmark / BENCHMARK (7h 19m)
NIGHTLY-MULTI / ... / BUILD (23m 42s)
NIGHTLY-SOLO / ... / BUILD (45m 0s)
Accuracy-Smoke-AWS-AVX2-32G-A10G-24G / LM-EVAL-SMOKE (1h 44m)
AWS-AVX2-32G-A10G-24G-Benchmark / NM_GH_ACTION_BENCHMARK (15s)
Matrix: NIGHTLY-MULTI / TEST
Matrix: NIGHTLY-SOLO / TEST
Annotations
2 errors and 3 warnings
NIGHTLY-SOLO / TEST (aws-avx2-192G-4-a10g-96G) / TEST
Failed to CreateArtifact: Received non-retryable error: Failed request: (409) Conflict: an artifact with this name already exists on the workflow run
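A 409 of this kind means two uploads within the same workflow run used the same artifact name; the log does not show which uploads collided. One common way to avoid such collisions is to build the name from values that differ between matrix jobs. The sketch below is illustrative only: `artifact_name` and its parameters are hypothetical and are not taken from the nm-vllm workflows, and the example values are borrowed from the artifact list on this page.

```python
def artifact_name(prefix: str, runner_label: str, python_version: str) -> str:
    """Build an artifact name that differs for every matrix combination.

    Hypothetical helper: the naming scheme is an assumption, not the
    repository's actual convention.
    """
    parts = [prefix, runner_label, python_version]
    if any(not part for part in parts):
        raise ValueError("every name component must be non-empty")
    return "-".join(parts)


# Example values borrowed from the artifact list; the log does not say
# which artifact actually triggered the conflict.
names = [
    artifact_name("cc-vllm-html", "aws-avx2-192G-4-a10g-96G", version)
    for version in ("3.10.12", "3.11.4")
]

# Each matrix job now gets a distinct name.
assert len(set(names)) == len(names)
```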
AWS-AVX2-32G-A10G-24G-Benchmark / NM_GH_ACTION_BENCHMARK
# :warning: **Performance Alert** :warning:
Possible performance regression was detected for benchmark **'smaller_is_better'**.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold `1.10`.
| Benchmark suite | Current: 58811dff88f3ffccd4f15e0c7a80e42881b04b91 | Previous: dd3a2885c57b7eb7f4137476e3c9b2e291a582fa | Ratio |
|-|-|-|-|
| `{"name": "mean_tpot_ms", "description": "VLLM Serving - 2:4 Sparse\nmodel - neuralmagic/OpenHermes-2.5-Mistral-7B-pruned2.4\nmax-model-len - 4096\nsparsity - semi_structured_sparse_w16a16\nbenchmark_serving {\n \"nr-qps-pair_\": \"1500,5\",\n \"dataset\": \"sharegpt\"\n}", "gpu_description": "NVIDIA A10G x 1", "vllm_version": "0.2.0", "python_version": "3.10.12 (main, Mar 7 2024, 18:39:53) [GCC 9.4.0]", "torch_version": "2.1.2+cu121"}` | `136.3907258210871` ms | `122.77222682416993` ms | `1.11` |
| `{"name": "median_tpot_ms", "description": "VLLM Serving - 2:4 Sparse\nmodel - neuralmagic/OpenHermes-2.5-Mistral-7B-pruned2.4\nmax-model-len - 4096\nsparsity - semi_structured_sparse_w16a16\nbenchmark_serving {\n \"nr-qps-pair_\": \"1500,5\",\n \"dataset\": \"sharegpt\"\n}", "gpu_description": "NVIDIA A10G x 1", "vllm_version": "0.2.0", "python_version": "3.10.12 (main, Mar 7 2024, 18:39:53) [GCC 9.4.0]", "torch_version": "2.1.2+cu121"}` | `124.93099535644006` ms | `110.70336996676977` ms | `1.13` |
This comment was automatically generated by [workflow](https://github.com/neuralmagic/nm-vllm/actions?query=workflow%3ANightly) using [github-action-benchmark](https://github.com/marketplace/actions/continuous-benchmark).
Comment was generated at https://github.com/neuralmagic/nm-vllm/commit/58811dff88f3ffccd4f15e0c7a80e42881b04b91#commitcomment-140704631
NIGHTLY-SOLO / TEST (aws-avx2-192G-4-a10g-96G) / TEST
This job failure may be caused by using an out of date self-hosted runner. You are currently using runner version 2.314.1. Please update to the latest version 2.315.0
AWS-AVX2-32G-A10G-24G-Benchmark / NM_GH_ACTION_BENCHMARK
Performance alert! Previous value was 122.77222682416993 and current value is 136.3907258210871. It is 1.110924916401664x worse than previous exceeding a ratio threshold 1.1
AWS-AVX2-32G-A10G-24G-Benchmark / NM_GH_ACTION_BENCHMARK
Performance alert! Previous value was 110.70336996676977 and current value is 124.93099535644006. It is 1.1285202554713651x worse than previous exceeding a ratio threshold 1.1
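The ratios in these alerts can be reproduced by hand. The following is a minimal sketch, assuming the check for a 'smaller_is_better' metric is simply current divided by previous, compared against the threshold. The function name `regression_ratio` is hypothetical, and this is not the github-action-benchmark implementation.

```python
# Hypothetical helper; not part of github-action-benchmark.
THRESHOLD = 1.10  # alert threshold quoted in the annotations


def regression_ratio(previous: float, current: float) -> float:
    """Return current / previous for a smaller-is-better metric."""
    if previous <= 0:
        raise ValueError("previous value must be positive")
    return current / previous


# (previous, current) values in milliseconds, copied from the annotations.
results = {
    "mean_tpot_ms": (122.77222682416993, 136.3907258210871),
    "median_tpot_ms": (110.70336996676977, 124.93099535644006),
}

for name, (previous, current) in results.items():
    ratio = regression_ratio(previous, current)
    status = "ALERT" if ratio > THRESHOLD else "ok"
    print(f"{name}: ratio={ratio:.2f} ({status})")
```

Both metrics come out at roughly 1.11 and 1.13, matching the Ratio column in the alert table, so both exceed the 1.10 threshold.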
Artifacts
Produced during runtime
Name | Status | Size |
---|---|---|
3.10.12-nm-vllm-0.2.0.tar.gz | Expired | 447 KB |
3.11.4-nm-vllm-0.2.0.tar.gz | Expired | 447 KB |
8585197395-aws-avx2-32G-a10g-24G | Expired | 124 KB |
cc-vllm-html-aws-avx2-192G-4-a10g-96G | Expired | 1.65 MB |
gh_action_benchmark_jsons-8585197395-aws-avx2-32G-a10g-24G | Expired | 28.3 KB |
nm_vllm-0.2.0-cp310-cp310-manylinux_2_17_x86_64.whl | Expired | 87.1 MB |
nm_vllm-0.2.0-cp311-cp311-manylinux_2_17_x86_64.whl | Expired | 87.1 MB |