[Core] Optimize block_manager_v2 vs block_manager_v1 (to make V2 default) #5602
Merged
cadedaniel merged 48 commits into vllm-project:main from neuralmagic:block_manager_v2_perf on Jul 2, 2024
007b32d
Optimize block_manager_v2 so it becomes the default
alexm-redhat ea94e85
cleanups
alexm-redhat e21c410
refactor code so that only free() is used
alexm-redhat b5872d2
prefix_caching: refactor self._blocks to tracked blocks
alexm-redhat 54d76ba
format
alexm-redhat 0aecdb2
cpu bug fix
alexm-redhat d649055
fixes
alexm-redhat 92550b0
fixes
alexm-redhat 4100268
fix immutable promotion
alexm-redhat 8812380
fix last access bug
alexm-redhat 5d12f4f
format
alexm-redhat ae3dde4
revert offline_inference.py
alexm-redhat 4f29bff
fixes
alexm-redhat 8f8fb66
Cade review comments for fixing append_token_ids
alexm-redhat 7a52d34
cleanup pass
alexm-redhat 6a2b897
sync
alexm-redhat 73c13be
fix test_block_table.py
alexm-redhat 0bbc049
fix swap_in/swap_out to actually replace the blocks
alexm-redhat c9c7070
fixes
alexm-redhat fd802b0
fix block table tests
alexm-redhat 5a48c1e
ping
alexm-redhat 1ffa4bf
tmp towards cow/promo back in block
alexm-redhat 38e8e21
sync
alexm-redhat 994e972
works
alexm-redhat ad83158
format
alexm-redhat a594be3
sync
alexm-redhat e72cd50
review fixes from Cade
alexm-redhat 87d14e2
cleanup
alexm-redhat 80e6ab1
test fixes
alexm-redhat 6dcd304
fix sequence prompt/output token ids to be updated properly with the …
alexm-redhat 6c410d8
fix a block_table bug
alexm-redhat f97cffa
fix the num_token_ids bug by separating num_token_ids and num_tokens_…
alexm-redhat b74d834
Refactor back token_ids based on Cade comments.
alexm-redhat 179542b
use tuples for seq_data prompt/output token_ids
alexm-redhat 7c0ce65
sync
alexm-redhat 4dd957e
fix
alexm-redhat 325226f
fix tests
alexm-redhat 29e9683
fix tests
alexm-redhat c36f353
add Antoni's idea for improving caching of computed block ids by usin…
alexm-redhat d0b2ef9
Based on Cade comment, refactored the seq last_access and cached comp…
alexm-redhat bd65468
cleanup
alexm-redhat 3064208
Cade's comments
alexm-redhat 2236d5e
fix test
alexm-redhat 4ea6938
fix fork_seq
alexm-redhat 82b31e8
ping
alexm-redhat 3f1c2a1
ping2
alexm-redhat 2ff442d
Cade's comments
alexm-redhat 3322f8c
more Cade commants
alexm-redhat
TODO(cade): see why this API is no longer required