Limit position embeddings in inference #1598
Conversation
…rom (huggingface#57) model code. Co-authored-by: Adam Stachowicz <[email protected]>
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
LGTM.
@bhargaveede Can you also address the unresolved comments in #1501 please? At least move parallel_state.py to optimum/habana/distributed. Using more recent versions of Llama for CI can be done after the release, but moving parallel_state.py should be done before the release; otherwise it will be a breaking change in the release after, and we should avoid that. It's just a matter of moving one file and updating a couple of imports.
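To illustrate the kind of change involved, here is a minimal sketch of the import update; the exact previous location of parallel_state.py is an assumption, only the target path optimum/habana/distributed is stated in the comment above.

```python
# Hypothetical before/after import for the file move.

# Before (assumed: parallel_state.py lived alongside the example scripts):
# import parallel_state

# After (parallel_state.py moved into the package, per the review comment):
from optimum.habana.distributed import parallel_state
```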
Moving the max_position_embeddings truncation out of the model code, as it interferes with training.
Moved it to text-generation/utils.py, which is a better place for this inference-only logic.
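As a rough illustration of the idea, here is a minimal, self-contained sketch of limiting generation length by the model's max_position_embeddings at inference time rather than inside the model code. The helper name and signature are hypothetical, not the actual utils.py code.

```python
def clamp_generation_length(input_length: int, max_new_tokens: int,
                            max_position_embeddings: int) -> int:
    """Return a max_new_tokens value that keeps total positions in range.

    Hypothetical helper: the total sequence length (prompt + generated
    tokens) must not exceed the number of position embeddings the model
    was configured with, so the token budget is clamped accordingly.
    """
    budget = max_position_embeddings - input_length
    return max(0, min(max_new_tokens, budget))

# Example: a 4096-position model with a 4000-token prompt can generate
# at most 96 new tokens, even if 200 were requested.
assert clamp_generation_length(4000, 200, 4096) == 96
```

Doing this clamping in the generation utilities keeps the limit out of the training path, which is the motivation stated above.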
Fixes # (issue)
Before submitting