Upstream merge 25 01 27 #391
Merged
gshtras merged 109 commits into main from upstream_merge_25_01_27 on Jan 28, 2025
+6,363 −1,987
Commits
Commits on Jan 21, 2025
[Kernel] optimize moe_align_block_size for cuda graph and large num_experts (e.g. DeepSeek-V3) (vllm-project#12222) (see the alignment sketch below)
[Documentation][AMD] Add information about prebuilt ROCm vLLM docker for perf validation purpose (vllm-project#12281)
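As context for the moe_align_block_size commit above: the op groups the token slots produced by top-k routing by expert and pads each expert's group up to a multiple of the kernel block size, so the fused MoE matmul can run over fixed-size blocks. Below is a minimal CPU-side sketch of that bookkeeping, assuming the upstream inputs (topk_ids, num_experts, block_size); the function and variable names are illustrative, and the real kernel does this work on-device. #12222 is about making this step fast for large expert counts and for CUDA graph capture.

```python
import torch

def moe_align_block_size_ref(topk_ids: torch.Tensor,
                             num_experts: int,
                             block_size: int):
    """CPU reference sketch: group token slots by routed expert and pad
    each group to a multiple of block_size (illustrative, not the kernel)."""
    flat = topk_ids.flatten()
    # A stable sort puts all slots routed to the same expert together.
    _, order = torch.sort(flat, stable=True)
    counts = torch.bincount(flat, minlength=num_experts)
    # Round each expert's slot count up to a multiple of block_size.
    padded = (counts + block_size - 1) // block_size * block_size
    # Padding slots carry a sentinel id one past the last real slot.
    sentinel = flat.numel()
    sorted_token_ids = torch.full((int(padded.sum()),), sentinel,
                                  dtype=torch.long)
    # One expert id per block of block_size slots.
    expert_ids = torch.repeat_interleave(torch.arange(num_experts),
                                         padded // block_size)
    src = dst = 0
    for e in range(num_experts):
        n = int(counts[e])
        sorted_token_ids[dst:dst + n] = order[src:src + n]
        src += n
        dst += int(padded[e])
    return sorted_token_ids, expert_ids
```

In the actual kernel the output buffers are allocated at worst-case size up front, so their shapes do not depend on a particular batch's routing; that is what keeps the op CUDA-graph friendly.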
Commits on Jan 23, 2025
[AMD][Quantization] Add TritonScaledMMLinearKernel since int8 is broken for AMD (vllm-project#12282) (see the scaled-matmul sketch below)
[BugFix] Fix parameter names and process_after_weight_loading for W4A16 MoE Group Act Order (vllm-project#11528)
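As context for the TritonScaledMMLinearKernel commit above: a scaled-MM linear layer multiplies int8-quantized activations and weights with int32 accumulation, then dequantizes with the two quantization scales; a Triton kernel fuses the same math tile by tile, giving ROCm an int8 path independent of kernels that are unavailable there. The sketch below shows only the reference semantics; the names are illustrative, not vLLM's API.

```python
import torch

def scaled_mm_ref(a_q: torch.Tensor,      # int8 activations, [M, K]
                  b_q: torch.Tensor,      # int8 weights, [K, N]
                  scale_a: torch.Tensor,  # per-tensor scalar or [M, 1]
                  scale_b: torch.Tensor,  # per-tensor scalar or [1, N]
                  out_dtype: torch.dtype = torch.float16) -> torch.Tensor:
    """Reference semantics of a scaled int8 matmul. A Triton kernel
    computes the same result one tile at a time, keeping the int32
    accumulator in registers."""
    # Accumulate in int32 so int8 products cannot overflow.
    acc = a_q.to(torch.int32) @ b_q.to(torch.int32)
    # Dequantize: each element picks up the (broadcast) product of its
    # row's activation scale and its column's weight scale.
    return (acc.to(torch.float32) * scale_a * scale_b).to(out_dtype)

# Example with per-tensor scales:
a = torch.randint(-128, 127, (4, 8), dtype=torch.int8)
b = torch.randint(-128, 127, (8, 3), dtype=torch.int8)
out = scaled_mm_ref(a, b, torch.tensor(0.02), torch.tensor(0.01))
```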
Commits on Jan 24, 2025
[Bugfix][Kernel] FA3 Fix - RuntimeError: This flash attention build only supports pack_gqa (for build size reasons). (vllm-project#12405)
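As context for the FA3 commit above: in grouped-query attention several query heads share one KV head, and FlashAttention-3's pack_gqa mode handles that grouping inside the kernel instead of requiring replicated K/V heads. The sketch below shows only the reference semantics in plain PyTorch (names illustrative); the fix in #12405 addresses the RuntimeError raised when the attention call does not match what a size-reduced FA3 build supports.

```python
import torch
import torch.nn.functional as F

def gqa_ref(q: torch.Tensor,   # [batch, n_q_heads, seq, head_dim]
            k: torch.Tensor,   # [batch, n_kv_heads, seq, head_dim]
            v: torch.Tensor) -> torch.Tensor:
    """Reference grouped-query attention: replicate each KV head across
    its group of query heads. A packed-GQA kernel computes the same
    result without materializing the replicated K/V tensors."""
    assert q.shape[1] % k.shape[1] == 0, "n_q_heads must divide evenly"
    group = q.shape[1] // k.shape[1]  # query heads per KV head
    k = k.repeat_interleave(group, dim=1)
    v = v.repeat_interleave(group, dim=1)
    return F.scaled_dot_product_attention(q, k, v)
```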