
Add cache watermark to avoid frequent cache eviction #11

Merged — 5 commits merged into main on Mar 29, 2023
Conversation

@WoosukKwon (Collaborator) commented Mar 27, 2023

This PR implements a watermark mechanism to prevent frequent preemption.

If we admit new sequences until the GPU KV cache becomes completely full, preemptions are highly likely to occur within the next few steps, since running sequences keep growing and will immediately need more blocks. Instead, we can reserve a small portion of the cache as headroom and refrain from using the entire cache space when admitting new sequences. This helps us avoid that inefficiency.
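The idea can be sketched as a simple admission check in the block allocator: a new sequence is admitted only if, after allocation, the number of free blocks would stay above a reserved watermark. The class and method names below are a hypothetical minimal sketch, not the exact code from this PR.

```python
class BlockAllocator:
    """Minimal sketch of a GPU KV-cache block allocator with a watermark.

    Names and structure are illustrative assumptions, not vLLM's actual API.
    """

    def __init__(self, num_gpu_blocks: int, watermark: float = 0.01) -> None:
        self.num_gpu_blocks = num_gpu_blocks
        self.num_free_blocks = num_gpu_blocks
        # Reserve a small fraction of the cache as headroom so that
        # admitting a new sequence never fills the cache completely.
        self.watermark_blocks = int(watermark * num_gpu_blocks)

    def can_allocate(self, num_required_blocks: int) -> bool:
        # Admit a *new* sequence only if at least `watermark_blocks`
        # blocks would remain free after the allocation. Already-running
        # sequences that merely grow are not subject to this check, so
        # they can consume the reserved headroom instead of triggering
        # immediate preemption.
        return (self.num_free_blocks - num_required_blocks
                >= self.watermark_blocks)

    def allocate(self, num_blocks: int) -> None:
        assert num_blocks <= self.num_free_blocks
        self.num_free_blocks -= num_blocks
```

For example, with 1,000 GPU blocks and the default 1% watermark, 10 blocks stay reserved: a new sequence needing 990 blocks is admitted, but one needing 991 is not, even though the blocks physically exist.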

@WoosukKwon WoosukKwon requested a review from zhuohan123 March 28, 2023 08:16
@WoosukKwon WoosukKwon changed the title Add cache watermark to avoid frequent preemptions Add cache watermark to avoid frequent cache eviction Mar 29, 2023
@WoosukKwon (Collaborator, Author) commented:
@zhuohan123 I'm merging this PR since it does not conflict with any other PR and it (slightly) improves system performance.

@WoosukKwon WoosukKwon merged commit 64e0e38 into main Mar 29, 2023
@WoosukKwon WoosukKwon deleted the watermark branch March 29, 2023 23:38
bigPYJ1151 pushed a commit to bigPYJ1151/vllm that referenced this pull request Sep 12, 2023
* add pos_encoding impl

* add benchmark and add open mp parallel
xiangyuT pushed a commit to xiangyuT/vllm that referenced this pull request Oct 25, 2023
* Comments done above worker

* format

* fixed missing arguments

* fix

* format
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024
luo-cheng2021 pushed a commit to luo-cheng2021/vllm that referenced this pull request Mar 25, 2024
ykim362 pushed a commit to ykim362/vllm that referenced this pull request Jun 17, 2024
@alixiaodi alixiaodi mentioned this pull request Aug 2, 2024
zeroorhero pushed a commit to zeroorhero/vllm that referenced this pull request Sep 23, 2024
Xaenalt pushed a commit to Xaenalt/vllm that referenced this pull request Jan 15, 2025