
Missing head_mask and decoder_head_mask arguments in encoder-decoder models #9814

Closed
stancld opened this issue Jan 26, 2021 · 0 comments · Fixed by #9819, #9856, #9964 or #9988


stancld commented Jan 26, 2021

🚀 Feature request

Following PRs #9569, #9634, and #9639, there remain other encoder-decoder models that either do not support the head_mask and decoder_head_mask input arguments at all, or accept only a single head_mask argument used for head masking in both the encoder and the decoder. It would therefore be nice to make this feature uniform across all encoder-decoder models. For context, here is a minimal sketch of the target call pattern, using PyTorch BART (which already accepts both arguments after #9569); the mask values below are purely illustrative:
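
```python
import torch
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

inputs = tokenizer("Masking attention heads is fun.", return_tensors="pt")

cfg = model.config
# head_mask / decoder_head_mask have shape (num_layers, num_heads);
# 1.0 keeps a head active, 0.0 masks it out.
head_mask = torch.ones(cfg.encoder_layers, cfg.encoder_attention_heads)
decoder_head_mask = torch.ones(cfg.decoder_layers, cfg.decoder_attention_heads)
decoder_head_mask[:, 0] = 0.0  # e.g. mask the first head of every decoder layer

outputs = model(**inputs, head_mask=head_mask, decoder_head_mask=decoder_head_mask)
```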


Models:

| Model | PyTorch | TensorFlow | PR | Copy dependency |
| --- | --- | --- | --- | --- |
| BERTGeneration | ☑️ | ✖️ | - | - |
| EncoderDecoderModel | ☑️ | ✖️ | - | - |
| FSMT | ☑️ | ✖️ | #9819 | - |
| LED | ☑️ | ☑️ | PT - #9856 ; TF - #9988 | - |
| ProphetNet | ☑️ | ✖️ | #9964 | - |
| Longformer | ☑️ | ☑️ | PT - #9856; TF - #9988 | LED |
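
As a quick (hypothetical, not part of the issue itself) way to audit which model classes already expose the arguments in their forward() signature:

```python
import inspect
from transformers import FSMTForConditionalGeneration, ProphetNetForConditionalGeneration

for cls in (FSMTForConditionalGeneration, ProphetNetForConditionalGeneration):
    params = inspect.signature(cls.forward).parameters
    # Prints the class name and whether each argument is accepted.
    print(cls.__name__,
          "head_mask" in params,
          "decoder_head_mask" in params)
```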

Your contribution

I'm happy to add this feature over the following days, for both the PyTorch and TensorFlow models (likely in several shorter PRs so as not to create a single large, overwhelming one).


Reviewers: @patrickvonplaten, @jplu, @sgugger, @LysandreJik, @stas00.
