Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement prefix sharing #37

Closed
wants to merge 4 commits into from
Closed

Implement prefix sharing #37

wants to merge 4 commits into from

Conversation

WoosukKwon
Copy link
Collaborator

No description provided.

@WoosukKwon WoosukKwon closed this Jun 8, 2023
@WoosukKwon WoosukKwon deleted the prefix branch June 18, 2023 07:24
tianyil1 pushed a commit to tianyil1/vllm that referenced this pull request Jun 5, 2024
joerunde added a commit to joerunde/vllm that referenced this pull request Jun 17, 2024
These are some changes we've been adding onto the last couple releases.
Would be nice to have on main for the next round, and hopefully this is
all moot once we hop over to the ODH fork anyway

---------

Signed-off-by: Travis Johnson <[email protected]>
Signed-off-by: Joe Runde <[email protected]>
Co-authored-by: Travis Johnson <[email protected]>
yukavio pushed a commit to yukavio/vllm that referenced this pull request Jul 3, 2024
Tested by making sure magic_wand was uninstalled and this code for a
dense model runs fine:
```python
from vllm import LLM, SamplingParams
model = LLM("nm-testing/opt-125m-pruned2.4", enforce_eager=True)
```

Then testing with a sparse model run:
```python
from vllm import LLM, SamplingParams
model = LLM("nm-testing/opt-125m-pruned2.4", sparsity="sparse_w16a16", enforce_eager=True)
```
output:
```
...
  File "/home/michael/code/neuralmagic-vllm/vllm/model_executor/weight_utils.py", line 93, in get_sparse_config
    from vllm.model_executor.layers.sparsity import get_sparsity_config
  File "/home/michael/code/neuralmagic-vllm/vllm/model_executor/layers/sparsity/__init__.py", line 6, in <module>
    raise ValueError(
ValueError: magic_wand is not available and required for sparsity support. Please install it with `pip install magic_wand`
```
yukavio pushed a commit to yukavio/vllm that referenced this pull request Jul 3, 2024
Tested by making sure magic_wand was uninstalled and this code for a
dense model runs fine:
```python
from vllm import LLM, SamplingParams
model = LLM("nm-testing/opt-125m-pruned2.4", enforce_eager=True)
```

Then testing with a sparse model run:
```python
from vllm import LLM, SamplingParams
model = LLM("nm-testing/opt-125m-pruned2.4", sparsity="sparse_w16a16", enforce_eager=True)
```
output:
```
...
  File "/home/michael/code/neuralmagic-vllm/vllm/model_executor/weight_utils.py", line 93, in get_sparse_config
    from vllm.model_executor.layers.sparsity import get_sparsity_config
  File "/home/michael/code/neuralmagic-vllm/vllm/model_executor/layers/sparsity/__init__.py", line 6, in <module>
    raise ValueError(
ValueError: magic_wand is not available and required for sparsity support. Please install it with `pip install magic_wand`
```
bigPYJ1151 pushed a commit to bigPYJ1151/vllm that referenced this pull request Jul 30, 2024
@alixiaodi alixiaodi mentioned this pull request Aug 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant