This repository has been archived by the owner on Oct 11, 2024. It is now read-only.

Commit

Disable marlin models
dbarbuzzi committed May 9, 2024
1 parent e972635 commit 73adc9f
Showing 1 changed file with 27 additions and 24 deletions: tests/accuracy/lm-eval-tasks.yaml
@@ -15,14 +15,15 @@
           value: 0.09855951478392722
         - name: "exact_match,flexible-extract"
           value: 0.10083396512509477
-- model_name: "neuralmagic/llama-2-7b-chat-marlin"
-  tasks:
-    - name: "gsm8k"
-      metrics:
-        - name: "exact_match,strict-match"
-          value: 0.14101592115238817
-        - name: "exact_match,flexible-extract"
-          value: 0.1652767247915087
+# - model_name: "neuralmagic/llama-2-7b-chat-marlin"
+#   tasks:
+#     - name: "gsm8k"
+#       metrics:
+#         - name: "exact_match,strict-match"
+#           value: 0.14101592115238817
+#         - name: "exact_match,flexible-extract"
+#           value: 0.1652767247915087
+
 # Mistral 7B: FP16, FP16 sparse, marlin
 - model_name: "teknium/OpenHermes-2.5-Mistral-7B"
   tasks:
@@ -42,23 +43,25 @@
           value: 0.5269143290371494
   extra_args:
     --sparsity: "sparse_w16a16"
-- model_name: "neuralmagic/OpenHermes-2.5-Mistral-7B-marlin"
-  tasks:
-    - name: "gsm8k"
-      metrics:
-        - name: "exact_match,strict-match"
-          value: 0.4935557240333586
-        - name: "exact_match,flexible-extract"
-          value: 0.5868081880212282
+# - model_name: "neuralmagic/OpenHermes-2.5-Mistral-7B-marlin"
+#   tasks:
+#     - name: "gsm8k"
+#       metrics:
+#         - name: "exact_match,strict-match"
+#           value: 0.4935557240333586
+#         - name: "exact_match,flexible-extract"
+#           value: 0.5868081880212282
+
 # Phi 2: marlin
-- model_name: "neuralmagic/phi-2-super-marlin"
-  tasks:
-    - name: "gsm8k"
-      metrics:
-        - name: "exact_match,strict-match"
-          value: 0.49962092494313876
-        - name: "exact_match,flexible-extract"
-          value: 0.5041698256254739
+# - model_name: "neuralmagic/phi-2-super-marlin"
+#   tasks:
+#     - name: "gsm8k"
+#       metrics:
+#         - name: "exact_match,strict-match"
+#           value: 0.49962092494313876
+#         - name: "exact_match,flexible-extract"
+#           value: 0.5041698256254739
+
 # Mixtral: FP16
 - model_name: "mistralai/Mixtral-8x7B-Instruct-v0.1"
   tasks:
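The entries being disabled above follow a simple schema: model → tasks → metrics → expected value. As a hedged illustration only — `check_metrics`, the 5% tolerance, and the inlined data are assumptions, not the repository's actual test harness — a runner might compare measured lm-eval results against these expected values like this:

```python
# Hypothetical sketch of comparing measured lm-eval metrics against the
# expected values recorded in tests/accuracy/lm-eval-tasks.yaml.
# The parsed YAML (schema as in the diff above) is inlined as Python data;
# the function and tolerance are assumptions, not the repo's real code.

# Equivalent of one entry from the YAML file, already parsed:
entries = [
    {
        "model_name": "neuralmagic/phi-2-super-marlin",
        "tasks": [
            {
                "name": "gsm8k",
                "metrics": [
                    {"name": "exact_match,strict-match",
                     "value": 0.49962092494313876},
                    {"name": "exact_match,flexible-extract",
                     "value": 0.5041698256254739},
                ],
            }
        ],
    }
]

def check_metrics(entries, measured, rel_tol=0.05):
    """Return (key, expected, got) tuples for metrics outside tolerance.

    `measured` maps (model_name, task_name, metric_name) -> observed value.
    """
    failures = []
    for entry in entries:
        model = entry["model_name"]
        for task in entry.get("tasks", []):
            for metric in task["metrics"]:
                key = (model, task["name"], metric["name"])
                expected = metric["value"]
                got = measured.get(key)
                if got is None or abs(got - expected) > rel_tol * expected:
                    failures.append((key, expected, got))
    return failures

measured = {
    ("neuralmagic/phi-2-super-marlin", "gsm8k",
     "exact_match,strict-match"): 0.50,
    ("neuralmagic/phi-2-super-marlin", "gsm8k",
     "exact_match,flexible-extract"): 0.51,
}
print(check_metrics(measured=measured, entries=entries))  # -> [] (within 5%)
```

Under this sketch, commenting a model out of the YAML simply removes it from `entries`, so its metrics are never checked — which is the practical effect of this commit for the marlin models.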

1 comment on commit 73adc9f

@github-actions

bigger_is_better

| Benchmark suite | Current: 73adc9f | Previous: df1f1a0 | Ratio |
|---|---|---|---|
| request_throughput (VLLM Engine throughput, synthetic; model NousResearch/Llama-2-7b-chat-hf, max_model_len 4096, input-len 256, output-len 128, num-prompts 1000; NVIDIA A10G x 1, vllm 0.2.0, python 3.10.12, torch 2.2.1+cu121) | 4.027209690853166 prompts/s | 3.80234884054723 prompts/s | 0.94 |
| token_throughput (same configuration) | 1546.4485212876157 tokens/s | 1460.1019547701362 tokens/s | 0.94 |
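A quick sanity check on the numbers above (a sketch, not the benchmark tool's own code) is consistent with the Ratio column being previous throughput divided by current throughput, for both rows:

```python
# Sanity check on the benchmark Ratio column: for both rows,
# previous / current rounds to the reported 0.94.
current_prompts = 4.027209690853166    # prompts/s at commit 73adc9f
previous_prompts = 3.80234884054723    # prompts/s at commit df1f1a0
current_tokens = 1546.4485212876157    # tokens/s at commit 73adc9f
previous_tokens = 1460.1019547701362   # tokens/s at commit df1f1a0

print(round(previous_prompts / current_prompts, 2))  # -> 0.94
print(round(previous_tokens / current_tokens, 2))    # -> 0.94
```

Since this suite is tagged `bigger_is_better`, a ratio below 1 here means the current commit is the faster of the two measurements.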

This comment was automatically generated by a workflow using github-action-benchmark.
