[FlashInfer] Upgrade to 0.2.0 #11194
This function can collect `per_layer_parameter` for every layer and then simply assert that the results are all the same.

You can remember the `vllm_config` here by calling `get_current_vllm_config()`.

`vllm_config.compilation_config.static_forward_context` is a dict mapping each layer prefix to its attention layer. You can collect the sliding window, etc. from there; there is no need to iterate over the model's submodules.
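
To make the suggestion concrete, here is a rough sketch of the "collect everything, then assert it matches" approach. It does not import vLLM: `FakeAttentionLayer`, `PerLayerParameters`, and both helper functions are hypothetical stand-ins, and the attribute names (`sliding_window`, `logits_soft_cap`, `scale`) are assumptions about what an attention layer exposes. The plain dict stands in for `static_forward_context`.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class PerLayerParameters:
    """Hypothetical bundle of settings that must agree across layers."""
    sliding_window: Optional[int]
    logits_soft_cap: Optional[float]
    scale: float


@dataclass
class FakeAttentionLayer:
    """Stand-in for an attention layer; attribute names are assumptions."""
    sliding_window: Optional[int]
    logits_soft_cap: Optional[float]
    scale: float


def collect_per_layer_parameters(
    forward_context: Dict[str, FakeAttentionLayer],
) -> Dict[str, PerLayerParameters]:
    """Read the parameters of every layer from a prefix -> layer dict."""
    return {
        prefix: PerLayerParameters(
            sliding_window=layer.sliding_window,
            logits_soft_cap=layer.logits_soft_cap,
            scale=layer.scale,
        )
        for prefix, layer in forward_context.items()
    }


def infer_global_parameters(
    per_layer: Dict[str, PerLayerParameters],
) -> PerLayerParameters:
    """Assert that all layers agree and return the shared parameters."""
    assert per_layer, "no attention layers found"
    first_prefix, first = next(iter(per_layer.items()))
    for prefix, params in per_layer.items():
        assert params == first, (
            f"layer {prefix!r} differs from layer {first_prefix!r}: "
            f"{params} != {first}"
        )
    return first


# Stand-in for vllm_config.compilation_config.static_forward_context.
context = {
    "model.layers.0.self_attn": FakeAttentionLayer(4096, None, 0.125),
    "model.layers.1.self_attn": FakeAttentionLayer(4096, None, 0.125),
}
shared = infer_global_parameters(collect_per_layer_parameters(context))
```

Splitting collection from the assertion keeps the consistency check in one place, so a mismatch is reported with the prefixes of the two layers that disagree.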