
LED #9278
Merged
merged 35 commits into huggingface:master on Jan 5, 2021

Conversation

@patrickvonplaten (Contributor) commented on Dec 23, 2020:

What does this PR do?

Adds LongformerEncoderDecoder (LED) from @ibeltagy - see: https://github.com/allenai/longformer#longformer

Todo:

  • Important: the position embeddings have to be cut to correctly convert original Bart-like checkpoints to LED. The reason is that Bart uses a position-embedding hack in which the embedding indices 0 and 1 are never used, resulting in an embedding matrix of length 1026 instead of 1024; see the comment in Bart's code:
    # Bart is set up so that if padding_idx is specified then offset the embedding ids by 2
    All LED checkpoints are hence cut to remove this hack, as shown in the snippet below (a short sketch of the offset itself follows the snippet):
import torch
from transformers import LEDForConditionalGeneration

model = LEDForConditionalGeneration.from_pretrained("./led-base-16384")
# drop the two unused rows (indices 0 and 1) inherited from Bart's offset hack
model.model.encoder.embed_positions.weight = torch.nn.Parameter(model.model.encoder.embed_positions.weight[2:, :])
model.model.decoder.embed_positions.weight = torch.nn.Parameter(model.model.decoder.embed_positions.weight[2:, :])
model.save_pretrained("./led-base-16384")
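
For context, here is a minimal sketch of the offset that produces the 1026-row matrix. This is an illustration, not Bart's actual implementation in transformers; the class name OffsetPositionalEmbedding is made up for this example.

import torch
import torch.nn as nn

# Illustration only: a Bart-style learned positional embedding that reserves
# rows 0 and 1 of the weight matrix, so a 1024-position model ends up with
# 1026 rows.
class OffsetPositionalEmbedding(nn.Embedding):
    def __init__(self, num_positions, embedding_dim, offset=2):
        super().__init__(num_positions + offset, embedding_dim)
        self.offset = offset

    def forward(self, position_ids):
        # shift every position index past the two reserved rows
        return super().forward(position_ids + self.offset)

LED does not use this offset, which is why the conversion above slices off weight[2:, :].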

TODO after PR is merged:

  • Correctly add # Copied from ... statements from Bart and Longformer (this probably requires the Bart refactor to be merged first)
  • Open an issue regarding the problems with the TF save_model test
  • Correct the templates: delete the unnecessary test for TF Bart; add gradient checkpointing by default in PT

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@patrickvonplaten changed the title from LED to [WIP]LED on Dec 23, 2020
@patrickvonplaten changed the title from [WIP]LED to [WIP] LED on Dec 23, 2020
return outputs


# Copied from transformers.models.longformer.modeling_longformer.LongformerSelfAttention with Longformer->LEDEncoder
@patrickvonplaten (Contributor, Author) commented on this line:

@ibeltagy this line ensures that the whole class has to be exactly the same as the corresponding class in LongformerSelfAttention or else the tests would throw an error. This is our safety check to make sure that the original class (which in other libs would just be imported and re-used here) cannot change without this class being changed accordingly as well.
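
For illustration, here is a minimal sketch of how such a check could be enforced. This is not the actual transformers consistency script; check_copy and its signature are hypothetical. The idea is to take the source of the original class, apply the Longformer->LEDEncoder substitution from the marker, and require the copy to match it character for character.

import inspect

# Hypothetical helper, for illustration only: verify that copied_cls is an
# exact copy of original_cls after applying the name replacements declared in
# a "# Copied from ... with Old->New" marker.
def check_copy(original_cls, copied_cls, replacements):
    expected = inspect.getsource(original_cls)
    for old, new in replacements.items():
        expected = expected.replace(old, new)
    if inspect.getsource(copied_cls) != expected:
        raise ValueError(
            f"{copied_cls.__name__} has diverged from {original_cls.__name__}"
        )

# e.g. check_copy(LongformerSelfAttention, LEDEncoderSelfAttention,
#                 {"Longformer": "LEDEncoder"})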

@patrickvonplaten changed the title from [WIP] LED to LED on Dec 27, 2020
@jplu (Contributor) left a comment:

Awesome work!!! Just left a few tiny comments.

@LysandreJik (Member) left a comment:

Great job implementing this! Happy to hear the model templates made your life easier, too.

Left a few nits, but LGTM!

PS: That 16384 embedding size is incredible!

@sgugger (Collaborator) left a comment:

Looks great to me! Thanks for adding this new model!

@jplu (Contributor) commented on Jan 5, 2021:

@patrickvonplaten when you have time, can you fix the conflicts and apply to LED the same updates that were merged into Longformer? Thanks!

@patrickvonplaten merged commit 189387e into huggingface:master on Jan 5, 2021
@patrickvonplaten deleted the Led branch on January 5, 2021 at 12:14