[CI/Build] Update CPU tests to include all "standard" tests #5481

Merged Nov 8, 2024 (29 commits). The diff below shows changes from 3 commits.

Commits (all by DarkLight1337):
597cb35  Enable LLaVA test in CPU (Jun 13, 2024)
845b465  Fix failing test on CPU due to unsupported dtype (Jun 13, 2024)
5f92d96  Merge branch 'upstream' into test-llava-cpu (Jun 15, 2024)
789b493  Merge branch 'upstream' into test-llava-cpu (Jun 19, 2024)
e50b808  Merge branch 'upstream' into test-llava-cpu (Jun 20, 2024)
8ba6e77  Merge branch 'upstream' into test-llava-cpu (Jun 21, 2024)
783cb76  Install torchvision (Jun 21, 2024)
e177bf8  Use CPU pypi index for torchvision (Jun 21, 2024)
7273b45  Merge branch 'upstream' into test-llava-cpu (Oct 31, 2024)
d926082  format (Oct 31, 2024)
fe0ef62  Use bfloat16 (Oct 31, 2024)
f12d39f  Update (Oct 31, 2024)
656a499  Update test dependencies (Oct 31, 2024)
649525f  Merge branch 'upstream' into test-llava-cpu (Nov 2, 2024)
8e33605  Remove unnecessary `is_cpu()` checks (Nov 7, 2024)
08e242e  Merge branch 'upstream' into test-llava-cpu (Nov 7, 2024)
1e63f85  Update (Nov 7, 2024)
c09b140  Remove unnecessary args (Nov 7, 2024)
6e6b838  Update (Nov 7, 2024)
7bc3ad1  Merge branch 'upstream' into test-llava-cpu (Nov 7, 2024)
e41db03  Fix missing library (Nov 7, 2024)
8e3cf44  Fix loading image embeds on CPU (Nov 7, 2024)
cd1cd15  Fix errors not being propagated to CI (Nov 7, 2024)
b401cb9  Fix missing libraries (Nov 7, 2024)
431a5c8  Embedding models are not supported for CPU backend (Nov 8, 2024)
0df552f  Merge branch 'upstream' into test-llava-cpu (Nov 8, 2024)
8c817e4  Chunked prefill not supported for CPU (Nov 8, 2024)
4c39939  Fix installation (Nov 8, 2024)
9ef98fa  Add `cpu_model` mark (Nov 8, 2024)
3 changes: 2 additions & 1 deletion .buildkite/run-cpu-test.sh
@@ -19,5 +19,6 @@ docker exec cpu-test bash -c "python3 examples/offline_inference.py"
 # Run basic model test
 docker exec cpu-test bash -c "cd tests;
   pip install pytest Pillow protobuf
+  bash ../.buildkite/download-images.sh
   cd ../
-  pytest -v -s tests/models -m \"not llava\" --ignore=tests/models/test_embedding.py --ignore=tests/models/test_registry.py"
+  pytest -v -s tests/models --ignore=tests/models/test_embedding.py --ignore=tests/models/test_registry.py"
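Note on the change above: the old command used pytest's -m "not llava" marker expression to deselect LLaVA tests on CPU (while --ignore skips whole files); dropping the -m filter lets those tests run, and a later commit in this PR (9ef98fa) adds a `cpu_model` mark instead. A minimal sketch of the marker mechanism, with the marker name taken from the old command and the file layout purely illustrative, not the actual vLLM test layout:

# conftest.py (illustrative): register the marker so pytest does not warn about it
def pytest_configure(config):
    config.addinivalue_line("markers", "llava: tests that require the LLaVA model")

# test_llava_example.py (illustrative)
import pytest

@pytest.mark.llava
def test_generate_with_llava():
    # Deselected by `pytest -m "not llava"`; runs again once the -m filter
    # is removed, as in the updated command above.
    assert True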
9 changes: 8 additions & 1 deletion tests/models/test_llava.py
@@ -1,6 +1,7 @@
 from typing import List, Tuple

 import pytest
+import torch
 from transformers import AutoTokenizer

 from vllm.config import VisionLanguageConfig
@@ -65,9 +66,15 @@ def vllm_to_hf_output(vllm_output: Tuple[List[int], str],
     return hf_input_ids, hf_output_str


+# TODO: remove this after CPU float16 support ready
+target_dtype = "float"
+if torch.cuda.is_available():
+    target_dtype = "half"
+
+
 # TODO: Add test for `tensor_parallel_size` [ref: PR #3883]
 @pytest.mark.parametrize("model_and_config", model_and_vl_config)
-@pytest.mark.parametrize("dtype", ["half"])
+@pytest.mark.parametrize("dtype", [target_dtype])
 @pytest.mark.parametrize("max_tokens", [128])
 def test_models(hf_runner, vllm_runner, hf_images, vllm_images,
                 model_and_config, dtype: str, max_tokens: int) -> None:
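Note: the `target_dtype` fallback added above keeps the LLaVA test runnable on hosts without CUDA, where float16 is not yet supported on the CPU backend (per the TODO comment). A standalone sketch of the same pattern; the test name and body are illustrative, not part of the PR:

import pytest
import torch

# Fall back to float32 on CPU-only machines; use float16 when CUDA is present.
target_dtype = "half" if torch.cuda.is_available() else "float"

@pytest.mark.parametrize("dtype", [target_dtype])
def test_dtype_fallback(dtype: str) -> None:
    torch_dtype = {"half": torch.float16, "float": torch.float32}[dtype]
    x = torch.zeros(2, 3, dtype=torch_dtype)
    assert x.dtype == torch_dtype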
9 changes: 8 additions & 1 deletion tests/models/test_llava_next.py
@@ -1,6 +1,7 @@
 from typing import List, Tuple

 import pytest
+import torch
 from transformers import AutoTokenizer

 from vllm.config import VisionLanguageConfig
@@ -72,11 +73,17 @@ def vllm_to_hf_output(vllm_output: Tuple[List[int], str],
     return hf_input_ids, hf_output_str


+# TODO: remove this after CPU float16 support ready
+target_dtype = "float"
+if torch.cuda.is_available():
+    target_dtype = "half"
+
+
 @pytest.mark.xfail(
     reason="Inconsistent image processor being used due to lack "
     "of support for dynamic image token replacement")
 @pytest.mark.parametrize("model_and_config", model_and_vl_config)
-@pytest.mark.parametrize("dtype", ["half"])
+@pytest.mark.parametrize("dtype", [target_dtype])
 @pytest.mark.parametrize("max_tokens", [128])
 def test_models(hf_runner, vllm_runner, hf_images, vllm_images,
                 model_and_config, dtype: str, max_tokens: int) -> None:
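Note: the LLaVA-NeXT test keeps its `xfail` mark, so it still runs but an expected failure does not break CI. A minimal illustration of the mechanism; the test contents are illustrative only:

import pytest

@pytest.mark.xfail(reason="known issue, tracked separately")
def test_expected_failure() -> None:
    # Reported as "xfail" when it fails and "xpass" if it unexpectedly passes;
    # by default neither outcome fails the suite.
    assert 1 + 1 == 3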