
Add an option to use dummy weights #33

Merged
merged 1 commit into main on Apr 9, 2023
Conversation

WoosukKwon
Collaborator

No description provided.
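The PR title suggests an option to initialize a model with dummy (random) weights instead of loading a real checkpoint, which is useful for profiling and benchmarking without downloading large weight files. A minimal sketch of the idea in plain Python; the helper name `make_dummy_weights` and the `shapes` layout are illustrative assumptions, not vLLM's actual API:

```python
import random

def make_dummy_weights(shapes, low=-1e-3, high=1e-3):
    # Hypothetical helper: for each named layer shape, generate a
    # matrix of small random values (as nested lists) instead of
    # reading values from a checkpoint on disk.
    weights = {}
    for name, (rows, cols) in shapes.items():
        weights[name] = [
            [random.uniform(low, high) for _ in range(cols)]
            for _ in range(rows)
        ]
    return weights

# Illustrative layer shapes for a tiny attention block.
shapes = {"wq": (8, 8), "wk": (8, 8)}
w = make_dummy_weights(shapes)
```

Keeping the values small (here in `[-1e-3, 1e-3]`) avoids numerical overflow in forward passes while still exercising the same memory footprint and compute path as real weights.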

@WoosukKwon WoosukKwon merged commit ee88a7e into main Apr 9, 2023
@WoosukKwon WoosukKwon deleted the dummy branch April 9, 2023 06:36
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024
tianyil1 pushed a commit to tianyil1/vllm that referenced this pull request Jun 5, 2024
* Bucketing/Warmup WIP

* Cleanup

* Revert "Fix model_output_idx on HPU (vllm-project#27)"

This reverts commit 90dfa92.

* Rework selected_token_indices fix to also work with block_size padding

* Simple prompt attention POC

* Remove cumsum

* MQA/GQA support for simple prompt_attention

* Cleanup

* Fix typo

* Restore profiling runs
dllehr-amd pushed a commit to dllehr-amd/vllm that referenced this pull request Jul 22, 2024
…ernel tuning script for rocm.

Merge pull request vllm-project#33 - tuned moe configs v2
bigPYJ1151 pushed a commit to bigPYJ1151/vllm that referenced this pull request Jul 31, 2024
@alixiaodi alixiaodi mentioned this pull request Aug 2, 2024