pin GPU and use "--forked" for some tests (vllm-project#58)
SUMMARY:
* update runner script to use `--forked` for tests in
`tests/distributed`
* enable all test points in `test_basic_distributed_correctness.py`
* pin the GPU with `CUDA_VISIBLE_DEVICES=0` when running
`tests/kernels` and `tests/samplers`

TEST PLAN:
runs on remote push

modulo formatting ... this gets us a bit further ...

https://neuralmagic.testmo.net/automation/runs/view/8887

---------

Co-authored-by: andy-neuma <[email protected]>
andy-neuma and andy-neuma authored Feb 27, 2024
1 parent 1a59725 commit 87861c8
Showing 2 changed files with 15 additions and 3 deletions.
15 changes: 14 additions & 1 deletion .github/scripts/run-tests
@@ -55,8 +55,21 @@ for TEST in "${TESTS_TO_RUN[@]}"
 do
     LOCAL_SUCCESS=0
     RESULT_XML=$(echo ${TEST} | sed -e "s/${TEST_DIR}/${RESULTS_DIR}/" | sed -e "s/.py/.xml/")
-    pytest --junitxml=${RESULT_XML} ${TEST} || LOCAL_SUCCESS=$?
+
+    # this is a bit messy and brittle, but certain tests
+    # need to be run with specific options
+    if [[ "${TEST}" == *"kernels"* ]]; then
+        CUDA_VISIBLE_DEVICES=0 pytest --junitxml=${RESULT_XML} ${TEST} || LOCAL_SUCCESS=$?
+    elif [[ "${TEST}" == *"samplers"* ]]; then
+        CUDA_VISIBLE_DEVICES=0 pytest --junitxml=${RESULT_XML} ${TEST} || LOCAL_SUCCESS=$?
+    elif [[ "${TEST}" == *"distributed"* ]]; then
+        pytest --forked --junitxml=${RESULT_XML} ${TEST} || LOCAL_SUCCESS=$?
+    else
+        pytest --junitxml=${RESULT_XML} ${TEST} || LOCAL_SUCCESS=$?
+    fi
+
     SUCCESS=$((SUCCESS + LOCAL_SUCCESS))
+
 done
 
 if [ "${SUCCESS}" -eq "0" ]; then
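The in-code comment calls the branch chain "a bit messy and brittle". As an illustration only, and not part of this commit, the same path-based selection could be kept in one place with a `case` statement. The helper names `test_mode_for` and `run_one_test` are hypothetical. The sketch assumes `pytest` is on the `PATH` and that the `pytest-forked` plugin, which provides `--forked`, is installed. Pattern order mirrors the `if`/`elif` chain, so a path containing both "kernels" and "distributed" is still treated as a kernel test.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helpers, not taken from the commit.

# Classify a test path into the mode the runner should use.
# The first matching pattern wins, mirroring the if/elif order in the diff.
test_mode_for() {
    case "$1" in
        *kernels*|*samplers*) echo "pin-gpu" ;;
        *distributed*)        echo "forked" ;;
        *)                    echo "default" ;;
    esac
}

# Run a single test file with the options its mode calls for.
# Returns pytest's exit status so the caller can accumulate failures.
run_one_test() {
    local test="$1" result_xml="$2"
    case "$(test_mode_for "${test}")" in
        pin-gpu)
            # Expose only GPU 0 to the test process.
            CUDA_VISIBLE_DEVICES=0 pytest --junitxml="${result_xml}" "${test}"
            ;;
        forked)
            # --forked (pytest-forked) runs each test in its own subprocess.
            pytest --forked --junitxml="${result_xml}" "${test}"
            ;;
        *)
            pytest --junitxml="${result_xml}" "${test}"
            ;;
    esac
}
```

Inside the existing loop, the call site would then shrink to a single line such as `run_one_test "${TEST}" "${RESULT_XML}" || LOCAL_SUCCESS=$?`. Adding a new special case would mean touching only `test_mode_for` and one `case` arm.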
3 changes: 1 addition & 2 deletions tests/distributed/test_basic_distributed_correctness.py
@@ -5,10 +5,9 @@
 import pytest
 import torch
 
-# TODO: just picking one, need to update test runner to selectively use "--forked"
 MODELS = [
     "facebook/opt-125m",
-    # "meta-llama/Llama-2-7b-hf",
+    "meta-llama/Llama-2-7b-hf",
 ]
 
 