[Doc] Consistent naming of attention backends #9498
Signed-off-by: Thomas Parnell <[email protected]>
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these:
Thanks for unifying that
Please take a look at the CI failure.
Head branch was pushed to by a user without write access
@DarkLight1337 CI issues are fixed now.
Signed-off-by: Thomas Parnell <[email protected]> Signed-off-by: charlifu <[email protected]>
Signed-off-by: Thomas Parnell <[email protected]> Signed-off-by: Vinay Damodaran <[email protected]>
Signed-off-by: Thomas Parnell <[email protected]> Signed-off-by: Alvant <[email protected]>
Signed-off-by: Thomas Parnell <[email protected]> Signed-off-by: Amit Garg <[email protected]>
Signed-off-by: Thomas Parnell <[email protected]> Signed-off-by: qishuai <[email protected]>
Signed-off-by: Thomas Parnell <[email protected]> Signed-off-by: Sumit Dubey <[email protected]>
Signed-off-by: Thomas Parnell <[email protected]> Signed-off-by: Maxime Fournioux <[email protected]>
Signed-off-by: Thomas Parnell <[email protected]> Signed-off-by: Tyler Michael Smith <[email protected]>
Right now if you try to enable an unsupported feature (e.g., multi-step with xformers) you get a message like:
I find this confusing because the names given do not match how the environment variable needs to be set in order to enable the corresponding feature (e.g., `rocm-flash-attn` vs. `ROCM_FLASH`). The actual names are all-caps and defined by this enum. Those values do not match the string "names" defined in each attention backend class.

This PR fixes this, so that the list of names suggested to the user contains only strings that will actually work.
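To illustrate the idea, here is a minimal sketch rather than the actual vLLM code. The `Backend` enum, the `RocmFlashBackend` class, and both helper functions are hypothetical stand-ins; only the names `ROCM_FLASH` and `rocm-flash-attn` come from the description above. It shows how a name reported by a backend class can be checked against the enum so that any suggestion shown to the user is a value the environment variable would accept.

```python
import enum


class Backend(enum.Enum):
    # Hypothetical stand-in for the enum mentioned above. Only ROCM_FLASH
    # is named in the description; the other members are illustrative.
    ROCM_FLASH = enum.auto()
    XFORMERS = enum.auto()
    FLASH_ATTN = enum.auto()


class RocmFlashBackend:
    # Hypothetical backend class whose reported name matches the enum
    # member instead of a separate lowercase string.
    @staticmethod
    def get_name() -> str:
        return "ROCM_FLASH"


def is_valid_backend_name(name: str) -> bool:
    """Return True if `name` is exactly the name of an enum member."""
    return name in Backend.__members__


def suggested_backends(names: list[str]) -> str:
    """Join backend names for a message, rejecting names that would not work."""
    invalid = [n for n in names if not is_valid_backend_name(n)]
    if invalid:
        raise ValueError(f"not valid backend names: {invalid}")
    return ", ".join(names)
```

With this sketch, `is_valid_backend_name("ROCM_FLASH")` is true while `is_valid_backend_name("rocm-flash-attn")` is false, which is the mismatch the PR removes.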