[Misc] Define common requirements #3841

Merged 17 commits on Apr 5, 2024.
Changes from 7 commits
2 changes: 1 addition & 1 deletion .github/workflows/publish.yml
@@ -49,7 +49,7 @@ jobs:
   matrix:
     os: ['ubuntu-20.04']
     python-version: ['3.8', '3.9', '3.10', '3.11']
-    pytorch-version: ['2.1.2'] # Must be the most recent version that meets requirements.txt.
+    pytorch-version: ['2.1.2'] # Must be the most recent version that meets requirements-cuda.txt.
     cuda-version: ['11.8', '12.1']

 steps:
2 changes: 1 addition & 1 deletion .github/workflows/scripts/build.sh
@@ -9,7 +9,7 @@ LD_LIBRARY_PATH=${cuda_home}/lib64:$LD_LIBRARY_PATH

 # Install requirements
 $python_executable -m pip install wheel packaging
-$python_executable -m pip install -r requirements.txt
+$python_executable -m pip install -r requirements-cuda.txt

 # Limit the number of parallel jobs to avoid OOM
 export MAX_JOBS=1
1 change: 0 additions & 1 deletion CONTRIBUTING.md
@@ -21,7 +21,6 @@ Express your support on Twitter if vLLM aids you, or simply offer your appreciat
 ### Build from source

 ```bash
-pip install -r requirements.txt
 pip install -e . # This may take several minutes.
 ```

13 changes: 8 additions & 5 deletions Dockerfile
@@ -16,9 +16,10 @@ RUN ldconfig /usr/local/cuda-12.1/compat/
 WORKDIR /workspace

 # install build and runtime dependencies
-COPY requirements.txt requirements.txt
+COPY requirements-common.txt requirements-common.txt
+COPY requirements-cuda.txt requirements-cuda.txt
 RUN --mount=type=cache,target=/root/.cache/pip \
-    pip install -r requirements.txt
+    pip install -r requirements-cuda.txt

 # install development dependencies
 COPY requirements-dev.txt requirements-dev.txt
@@ -43,7 +44,8 @@ COPY csrc csrc
 COPY setup.py setup.py
 COPY cmake cmake
 COPY CMakeLists.txt CMakeLists.txt
-COPY requirements.txt requirements.txt
+COPY requirements-common.txt requirements-common.txt
+COPY requirements-cuda.txt requirements-cuda.txt
 COPY pyproject.toml pyproject.toml
 COPY vllm/__init__.py vllm/__init__.py

@@ -111,9 +113,10 @@ RUN apt-get update -y \
     && apt-get install -y python3-pip

 WORKDIR /workspace
-COPY requirements.txt requirements.txt
+COPY requirements-common.txt requirements-common.txt
+COPY requirements-cuda.txt requirements-cuda.txt
 RUN --mount=type=cache,target=/root/.cache/pip \
-    pip install -r requirements.txt
+    pip install -r requirements-cuda.txt

 # Install flash attention (from pre-built wheel)
 RUN --mount=type=bind,from=flash-attn-builder,src=/usr/src/flash-attention-v2,target=/usr/src/flash-attention-v2 \
3 changes: 2 additions & 1 deletion MANIFEST.in
@@ -1,5 +1,6 @@
 include LICENSE
-include requirements.txt
+include requirements-common.txt
+include requirements-cuda.txt
 include CMakeLists.txt

 recursive-include cmake *
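Because `requirements-cuda.txt` starts with `-r requirements-common.txt` (see its hunk further down), anything that ships one file has to ship the other. That is why the Dockerfile and `MANIFEST.in` each gain two entries in place of one. A minimal sketch of a check for this, assuming a hypothetical helper `missing_includes` that is not part of this PR:

```python
from pathlib import Path
from typing import List


def missing_includes(requirements_file: Path) -> List[str]:
    """List `-r` targets of `requirements_file` that do not exist next to it.

    Hypothetical helper for illustration; the PR itself contains no such check.
    """
    missing = []
    for line in requirements_file.read_text().splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "-r":
            # Includes are looked up relative to the including file.
            target = requirements_file.parent / parts[1]
            if not target.is_file():
                missing.append(parts[1])
    return missing
```

Running it against `requirements-cuda.txt` in a build context that copied only that file would report `requirements-common.txt` as missing.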
10 changes: 2 additions & 8 deletions requirements.txt → requirements-common.txt
@@ -1,19 +1,13 @@
-cmake>=3.21
+cmake >= 3.21
 ninja # For faster builds.
 psutil
-ray >= 2.9
 sentencepiece # Required for LLaMA tokenizer.
 numpy
-torch == 2.1.2
 requests
 py-cpuinfo
 transformers >= 4.39.1 # Required for StarCoder2 & Llava.
-xformers == 0.0.23.post1 # Required for CUDA 12.1.
 fastapi
 uvicorn[standard]
 pydantic >= 2.0 # Required for OpenAI server.
 prometheus_client >= 0.18.0
-pynvml == 11.5.0
-triton >= 2.1.0
-outlines == 0.0.34
-tiktoken == 0.6.0 # Required for DBRX tokenizer
+tiktoken == 0.6.0 # Required for DBRX tokenizer
19 changes: 5 additions & 14 deletions requirements-cpu.txt
@@ -1,15 +1,6 @@
-cmake>=3.21
-ninja # For faster builds.
-psutil
-ray >= 2.9
-sentencepiece # Required for LLaMA tokenizer.
-numpy
-transformers >= 4.38.0 # Required for Gemma.
-fastapi
-uvicorn[standard]
-pydantic >= 2.0 # Required for OpenAI server.
-prometheus_client >= 0.18.0
+# Common dependencies
+-r requirements-common.txt
+
+# Dependencies for x86_64 CPUs
 torch == 2.1.2+cpu
-triton >= 2.1.0
-filelock == 3.13.3
-py-cpuinfo
+triton >= 2.1.0 # FIXME(woosuk): This is a hack to avoid import error.
10 changes: 10 additions & 0 deletions requirements-cuda.txt
@@ -0,0 +1,10 @@
+# Common dependencies
+-r requirements-common.txt
+
+# Dependencies for NVIDIA GPUs
+ray >= 2.9
+torch == 2.1.2
+xformers == 0.0.23.post1 # Required for CUDA 12.1.
+pynvml == 11.5.0
+triton >= 2.1.0
+outlines == 0.0.34 # Requires torch >= 2.1.0
13 changes: 4 additions & 9 deletions requirements-neuron.txt
@@ -1,12 +1,7 @@
-sentencepiece # Required for LLaMA tokenizer.
-numpy
+# Common dependencies
+-r requirements-common.txt
+
+# Dependencies for Neuron devices
 transformers-neuronx >= 0.9.0
 torch-neuronx >= 2.1.0
 neuronx-cc
-fastapi
-uvicorn[standard]
-pydantic >= 2.0 # Required for OpenAI server.
-prometheus_client >= 0.18.0
-requests
-psutil
-py-cpuinfo
21 changes: 4 additions & 17 deletions requirements-rocm.txt
@@ -1,18 +1,5 @@
-cmake>=3.21
-ninja # For faster builds.
-typing-extensions>=4.8.0
-starlette
-requests
-py-cpuinfo
-psutil
+# Common dependencies
+-r requirements-common.txt
+
+# Dependencies for AMD GPUs
 ray == 2.9.3
-sentencepiece # Required for LLaMA tokenizer.
-numpy
-tokenizers>=0.15.0
-transformers >= 4.39.1 # Required for StarCoder2 & Llava.
-fastapi
-uvicorn[standard]
-pydantic >= 2.0 # Required for OpenAI server.
-prometheus_client >= 0.18.0
-outlines == 0.0.34
-tiktoken == 0.6.0 # Required for DBRX tokenizer
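After the split, a few package names still appear in more than one platform file, in some cases with different specifiers (`ray >= 2.9` for CUDA against `ray == 2.9.3` for ROCm, `torch == 2.1.2` against `torch == 2.1.2+cpu`). The sketch below lists those names. It assumes a hypothetical `shared_packages` helper and a simple name regex rather than a full requirement parser; the pins are copied from the platform files in this diff.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Platform-specific pins as they appear in the new requirements-*.txt files.
PLATFORM_PINS: Dict[str, List[str]] = {
    "cuda": ["ray >= 2.9", "torch == 2.1.2", "xformers == 0.0.23.post1",
             "pynvml == 11.5.0", "triton >= 2.1.0", "outlines == 0.0.34"],
    "cpu": ["torch == 2.1.2+cpu", "triton >= 2.1.0"],
    "neuron": ["transformers-neuronx >= 0.9.0", "torch-neuronx >= 2.1.0",
               "neuronx-cc"],
    "rocm": ["ray == 2.9.3"],
}


def shared_packages(pins: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each package named by more than one platform to those platforms.

    Hypothetical helper for illustration only.
    """
    platforms_by_name = defaultdict(list)
    for platform, lines in pins.items():
        for line in lines:
            # Take the leading run of name characters; stops at spaces,
            # brackets, and comparison operators.
            match = re.match(r"[A-Za-z0-9_.\-]+", line)
            if match:
                platforms_by_name[match.group(0)].append(platform)
    return {name: plats for name, plats in platforms_by_name.items()
            if len(plats) > 1}


print(shared_packages(PLATFORM_PINS))
```

With the pins above this reports `ray`, `torch`, and `triton`. `torch-neuronx` is treated as a separate name from `torch`, so the Neuron file does not count as pinning `torch`.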
25 changes: 15 additions & 10 deletions setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -325,22 +325,27 @@ def read_readme() -> str:

def get_requirements() -> List[str]:
"""Get Python package dependencies from requirements.txt."""
if _is_cuda():
with open(get_path("requirements.txt")) as f:

def _read_requirements(filename: str) -> List[str]:
with open(get_path(filename)) as f:
requirements = f.read().strip().split("\n")
for line in requirements:
if line.startswith("-r "):
requirements.remove(line)
requirements += _read_requirements(line.split()[1])
return requirements

if _is_cuda():
requirements = _read_requirements("requirements-cuda.txt")
elif _is_hip():
with open(get_path("requirements-rocm.txt")) as f:
requirements = f.read().strip().split("\n")
requirements = _read_requirements("requirements-rocm.txt")
elif _is_neuron():
with open(get_path("requirements-neuron.txt")) as f:
requirements = f.read().strip().split("\n")
requirements = _read_requirements("requirements-neuron.txt")
elif _is_cpu():
with open(get_path("requirements-cpu.txt")) as f:
requirements = f.read().strip().split("\n")
requirements = _read_requirements("requirements-cpu.txt")
else:
raise ValueError(
"Unsupported platform, please use CUDA, ROCM or Neuron.")

"Unsupported platform, please use CUDA, ROCm, Neuron, or CPU.")
return requirements
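The `_read_requirements` helper in this revision removes entries from `requirements` while iterating over the same list. With one `-r` line per file that works, but the removal shifts the list under the iterator, so a second `-r` line directly after the first would be skipped and left in the result unexpanded. A minimal sketch of a variant that builds a new list instead, assuming a standalone function that takes a `Path` rather than the `get_path` helper in `setup.py`:

```python
from pathlib import Path
from typing import List


def read_requirements(path: Path) -> List[str]:
    """Return the lines of `path`, expanding `-r other.txt` includes in place.

    Illustrative sketch; not the code from this PR.
    """
    resolved: List[str] = []
    for line in path.read_text().strip().split("\n"):
        if line.startswith("-r "):
            # Resolve the include relative to the including file and recurse.
            resolved += read_requirements(path.parent / line.split()[1])
        else:
            resolved.append(line)
    return resolved
```

Because nothing is removed during iteration, consecutive `-r` lines are all expanded, and included entries land where the `-r` line stood rather than at the end of the list.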

