LED #9278
Conversation
return outputs
# Copied from transformers.models.longformer.modeling_longformer.LongformerSelfAttention with Longformer->LEDEncoder
@ibeltagy this line ensures that the whole class stays exactly the same as the corresponding LongformerSelfAttention class; otherwise the tests will throw an error. This is our safety check to make sure that the original class (which in other libraries would simply be imported and reused here) cannot change without this class being updated accordingly as well.
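For context, a minimal sketch of what such a consistency check could look like is shown below. This is an assumption-laden illustration, not the repository's actual check tooling, and the class name LEDEncoderSelfAttention is inferred from the Longformer->LEDEncoder replacement rule in the comment above.

```python
# Minimal sketch (assumed, not the repository's actual check utility) of how a
# "# Copied from ... with Longformer->LEDEncoder" comment can be enforced:
# take both class sources, apply the declared name replacement, and compare.
# The class name LEDEncoderSelfAttention is assumed from the replacement rule.
import inspect

from transformers.models.led.modeling_led import LEDEncoderSelfAttention
from transformers.models.longformer.modeling_longformer import LongformerSelfAttention


def check_led_attention_is_a_faithful_copy():
    original_source = inspect.getsource(LongformerSelfAttention)
    copied_source = inspect.getsource(LEDEncoderSelfAttention)
    # Apply the substitution declared in the "# Copied from" comment.
    expected_source = original_source.replace("Longformer", "LEDEncoder")
    if expected_source != copied_source:
        raise ValueError(
            "LEDEncoderSelfAttention has diverged from LongformerSelfAttention; "
            "update the copy or drop the '# Copied from' comment."
        )


check_led_attention_is_a_faithful_copy()
```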
Awesome work!!! Just left a few tiny comments.
Great job implementing this! Happy to hear the model templates made your life easier, too.
Left a few nits, but LGTM!
PS: That 16384 embedding size is incredible!
Looks great to me! Thanks for adding this new model!
@patrickvonplaten when you have time, can you fix the conflicts and apply the same updates that were merged in Longformer to LED? Thanks!
What does this PR do?
Adds LongformerEncoderDecoder (LED) from @ibeltagy - see: https://github.com/allenai/longformer#longformer
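As a rough illustration of what the new model enables, here is a hedged usage sketch for long-document summarization. The checkpoint name allenai/led-base-16384 and the generation settings are illustrative assumptions, not part of this PR.

```python
# Hedged usage sketch: summarizing a long document with LED.
# Checkpoint name and generation settings are illustrative assumptions.
import torch
from transformers import LEDForConditionalGeneration, LEDTokenizer

tokenizer = LEDTokenizer.from_pretrained("allenai/led-base-16384")
model = LEDForConditionalGeneration.from_pretrained("allenai/led-base-16384")

long_document = " ".join(["A very long scientific article."] * 2000)
inputs = tokenizer(long_document, return_tensors="pt", truncation=True, max_length=16384)

# LED uses Longformer-style local attention in the encoder; for summarization,
# global attention is usually placed on the first token.
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

summary_ids = model.generate(
    inputs["input_ids"],
    global_attention_mask=global_attention_mask,
    num_beams=4,
    max_length=256,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```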
Todo:
src/transformers/models/bart/modeling_bart.py, line 131 (at 88ef889)
Add LEDIntegrationTests in tests/test_modeling_led.py (a hedged test sketch follows directly below).
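A hedged sketch of the kind of check such an integration test might contain. The class name, checkpoint, and assertion here are illustrative and are not the actual content of tests/test_modeling_led.py.

```python
# Illustrative sketch only; not the actual LEDIntegrationTests implementation.
import unittest

import torch

from transformers import LEDForConditionalGeneration, LEDTokenizer


class LEDIntegrationSketch(unittest.TestCase):
    def test_forward_logits_shape(self):
        model = LEDForConditionalGeneration.from_pretrained("allenai/led-base-16384")
        tokenizer = LEDTokenizer.from_pretrained("allenai/led-base-16384")

        inputs = tokenizer(["A long input document."], return_tensors="pt")
        labels = tokenizer(["A short summary."], return_tensors="pt").input_ids

        with torch.no_grad():
            outputs = model(
                input_ids=inputs.input_ids,
                attention_mask=inputs.attention_mask,
                labels=labels,
            )

        # The LM head should produce one distribution over the vocabulary per label token.
        self.assertEqual(
            outputs.logits.shape, (1, labels.shape[1], model.config.vocab_size)
        )
```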
TODO after PR is merged:
Add # Copied from .... statements from Bart and Longformer (this probably requires the Bart refactor to be merged before).

Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.