Implement H2O for long context inference on summarization tasks #411
Thanks a lot @Kyriection for the PR!! just added some quick initial thoughts on the PR. will go deeper on the second round.
added some inline comments
Thanks very much @Kyriection for the PR! looking forward to our collabs in next phases.
This PR adds an implementation of the H2O algorithm for efficient long-context inference with Llama models.
The current implementation is based on Hugging Face transformers and is tested on summarization tasks, including XSUM and CNN-DailyMail.
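For reviewers unfamiliar with the idea, here is a minimal sketch of the eviction rule H2O is generally described as using: keep a fixed KV-cache budget made up of the most recent tokens plus the "heavy hitter" tokens with the largest accumulated attention scores. This is not the code in this PR. The function name `h2o_keep_indices` and the parameters `heavy_budget` and `recent_budget` are hypothetical, and the sketch works on a plain list of scores rather than on real attention tensors.

```python
from typing import List, Sequence


def h2o_keep_indices(
    acc_scores: Sequence[float],
    heavy_budget: int,
    recent_budget: int,
) -> List[int]:
    """Return the cache positions to keep, in ascending order.

    acc_scores[i] is assumed to be the attention mass that position i has
    received so far, summed over decoding steps (hypothetical input).
    """
    n = len(acc_scores)
    # Nothing to evict while the cache still fits in the budget.
    if n <= heavy_budget + recent_budget:
        return list(range(n))

    # Always keep the most recent `recent_budget` positions.
    recent_start = n - recent_budget
    recent = list(range(recent_start, n))

    # Among the older positions, keep the highest-scoring ones.
    older = range(recent_start)
    heavy = sorted(older, key=lambda i: acc_scores[i], reverse=True)[:heavy_budget]

    return sorted(heavy) + recent


if __name__ == "__main__":
    scores = [0.9, 0.1, 0.5, 0.05, 0.2, 0.3]
    print(h2o_keep_indices(scores, heavy_budget=2, recent_budget=2))
```

In a real model this selection would typically run per layer and per attention head, with the scores updated after every decoding step; the sketch leaves that out to keep the rule itself visible.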