This repository has been archived by the owner on Oct 11, 2024. It is now read-only.

mgoin triggered nightly on refs/heads/merge-upstream-0.4.0-to-main #53

Manually triggered: April 1, 2024 21:14
Status: Failure
Total duration: 7h 43m 15s
Artifacts: 7

nightly.yml

on: workflow_dispatch
AWS-AVX2-32G-A10G-24G-Benchmark / BENCHMARK: 7h 40m
NIGHTLY-MULTI / BUILD / BUILD: 23m 28s
AWS-AVX2-32G-A10G-24G-Accuracy / LM-EVAL: 1h 0m
AWS-AVX2-32G-A10G-24G-Benchmark / NM_GH_ACTION_BENCHMARK: 19s
Matrix: NIGHTLY-MULTI / TEST
Matrix: NIGHTLY-SOLO / TEST

Annotations

4 errors and 10 warnings
AWS-AVX2-32G-A10G-24G-Accuracy / LM-EVAL
The job running on runner avx2_a10g_i-0ecb99690d7452b0c has exceeded the maximum execution time of 60 minutes.
NIGHTLY-MULTI / TEST (aws-avx2-192G-4-a10g-96G) / TEST
The job running on runner avx2_a10g_4_i-0da42d345eff9780a has exceeded the maximum execution time of 240 minutes.
NIGHTLY-SOLO / TEST (aws-avx2-192G-4-a10g-96G) / TEST
Process completed with exit code 1.
AWS-AVX2-32G-A10G-24G-Benchmark / NM_GH_ACTION_BENCHMARK
# :warning: **Performance Alert** :warning:

Possible performance regression was detected for benchmark **'smaller_is_better'**. Benchmark result of this commit is worse than the previous benchmark result exceeding threshold `1.10`.

Current: b3d607a9022ebd492a4c220401cad0b1ae126f8c
Previous: bdfdb774576b34b4cae98a200b146c19cd24d24c

Every row uses max-model-len 4096 and dataset sharegpt. The seven complete rows also report gpu_description "NVIDIA A10G x 1", vllm_version "0.1.0", python_version "3.10.12 (main, Mar 7 2024, 18:39:53) [GCC 9.4.0]", and torch_version "2.1.2+cu121". The last row is cut off in the annotation before these fields and before its values.

| Metric | Description | Model | Sparsity | nr-qps-pair_ | Current | Previous | Ratio |
|-|-|-|-|-|-|-|-|
| `median_tpot_ms` | VLLM Serving - Dense | TheBloke/OpenHermes-2.5-Mistral-7B-GPTQ | None | 750,2.5 | `99.5452885821862` ms | `85.08840363727279` ms | `1.17` |
| `median_request_latency` | VLLM Serving - Dense | neuralmagic/OpenHermes-2.5-Mistral-7B-marlin | None | 750,2.5 | `2550.2822674989147` ms | `2304.388393999943` ms | `1.11` |
| `mean_tpot_ms` | VLLM Serving - Dense | neuralmagic/OpenHermes-2.5-Mistral-7B-marlin | None | 750,2.5 | `18.67525323370096` ms | `16.874880133747496` ms | `1.11` |
| `median_tpot_ms` | VLLM Serving - Dense | neuralmagic/OpenHermes-2.5-Mistral-7B-marlin | None | 750,2.5 | `16.787582091402747` ms | `15.064373402042957` ms | `1.11` |
| `median_request_latency` | VLLM Serving - 2:4 Sparse | neuralmagic/OpenHermes-2.5-Mistral-7B-pruned2.4 | semi_structured_sparse_w16a16 | 1500,5 | `15734.700635499394` ms | `12165.00634949989` ms | `1.29` |
| `mean_tpot_ms` | VLLM Serving - 2:4 Sparse | neuralmagic/OpenHermes-2.5-Mistral-7B-pruned2.4 | semi_structured_sparse_w16a16 | 1500,5 | `145.87599154684392` ms | `108.87166010577987` ms | `1.34` |
| `median_tpot_ms` | VLLM Serving - 2:4 Sparse | neuralmagic/OpenHermes-2.5-Mistral-7B-pruned2.4 | semi_structured_sparse_w16a16 | 1500,5 | `133.68083343954564` ms | `97.20393972317379` ms | `1.38` |
| `median_request_latency` | VLLM Serving - Sparse | neuralmagic/OpenHermes-2.5-Mistral-7B-pruned50 | sparse_w16a16 | 1500,5 | (truncated) | (truncated) | (truncated) |
AWS-AVX2-32G-A10G-24G-Benchmark / NM_GH_ACTION_BENCHMARK
Performance alert! Previous value was 85.08840363727279 and current value is 99.5452885821862. It is 1.1699042916181894x worse than previous exceeding a ratio threshold 1.1
AWS-AVX2-32G-A10G-24G-Benchmark / NM_GH_ACTION_BENCHMARK
Performance alert! Previous value was 2304.388393999943 and current value is 2550.2822674989147. It is 1.1067067835175781x worse than previous exceeding a ratio threshold 1.1
AWS-AVX2-32G-A10G-24G-Benchmark / NM_GH_ACTION_BENCHMARK
Performance alert! Previous value was 16.874880133747496 and current value is 18.67525323370096. It is 1.1066895341290728x worse than previous exceeding a ratio threshold 1.1
AWS-AVX2-32G-A10G-24G-Benchmark / NM_GH_ACTION_BENCHMARK
Performance alert! Previous value was 15.064373402042957 and current value is 16.787582091402747. It is 1.1143896691465505x worse than previous exceeding a ratio threshold 1.1
AWS-AVX2-32G-A10G-24G-Benchmark / NM_GH_ACTION_BENCHMARK
Performance alert! Previous value was 12165.00634949989 and current value is 15734.700635499394. It is 1.293439574418821x worse than previous exceeding a ratio threshold 1.1
AWS-AVX2-32G-A10G-24G-Benchmark / NM_GH_ACTION_BENCHMARK
Performance alert! Previous value was 108.87166010577987 and current value is 145.87599154684392. It is 1.339889475416381x worse than previous exceeding a ratio threshold 1.1
AWS-AVX2-32G-A10G-24G-Benchmark / NM_GH_ACTION_BENCHMARK
Performance alert! Previous value was 97.20393972317379 and current value is 133.68083343954564. It is 1.3752614741774258x worse than previous exceeding a ratio threshold 1.1
AWS-AVX2-32G-A10G-24G-Benchmark / NM_GH_ACTION_BENCHMARK
Performance alert! Previous value was 63174.63465400033 and current value is 69967.52652400118. It is 1.1075256217500058x worse than previous exceeding a ratio threshold 1.1
AWS-AVX2-32G-A10G-24G-Benchmark / NM_GH_ACTION_BENCHMARK
Performance alert! Previous value was 26975.800461572657 and current value is 31053.952912183973. It is 1.151178181215445x worse than previous exceeding a ratio threshold 1.1
AWS-AVX2-32G-A10G-24G-Benchmark / NM_GH_ACTION_BENCHMARK
Performance alert! Previous value was 24898.172750000413 and current value is 29978.847964499437. It is 1.2040581558138213x worse than previous exceeding a ratio threshold 1.1
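
The warnings above each report a previous value, a current value, and a ratio compared against the threshold 1.1. A minimal Python sketch of that arithmetic follows, assuming a smaller-is-better metric where the ratio is current divided by previous and an alert fires when the ratio is strictly greater than the threshold. The function names `regression_ratio` and `exceeds_threshold` are hypothetical and not part of the benchmark action; the strict comparison is also an assumption, since the log only shows ratios already above 1.1.

```python
def regression_ratio(previous: float, current: float) -> float:
    """Return current / previous for a smaller-is-better metric.

    Values above 1.0 mean the current run is slower than the previous one.
    """
    return current / previous


def exceeds_threshold(previous: float, current: float, threshold: float = 1.1) -> bool:
    """Assumed alert rule: flag the result when the ratio is above the threshold."""
    return regression_ratio(previous, current) > threshold


# Values copied from the first warning in the annotations (median_tpot_ms, in ms).
previous_ms = 85.08840363727279
current_ms = 99.5452885821862

ratio = regression_ratio(previous_ms, current_ms)
print(f"ratio = {ratio:.2f}, alert = {exceeds_threshold(previous_ms, current_ms)}")
```

Applied to the smallest reported regression (16.874880133747496 ms to 18.67525323370096 ms), the same calculation gives a ratio of about 1.107, which is why a change of under 11% still appears in the warning list.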

Artifacts

Produced during runtime
| Name | Size | Status |
|-|-|-|
| 3.10.12-nm-vllm-0.1.0.tar.gz | 404 KB | Expired |
| 3.11.4-nm-vllm-0.1.0.tar.gz | 404 KB | Expired |
| 8513837051-aws-avx2-32G-a10g-24G | 124 KB | Expired |
| cc-vllm-html | 1.52 MB | Expired |
| gh_action_benchmark_jsons-8513837051-aws-avx2-32G-a10g-24G | 29.2 KB | Expired |
| nm_vllm-0.1.0-cp310-cp310-linux_x86_64.whl | 87 MB | Expired |
| nm_vllm-0.1.0-cp311-cp311-linux_x86_64.whl | 87 MB | Expired |