
[t5] Fix negative kl issue #262

Merged · 3 commits merged into huggingface:main on Apr 14, 2023

Conversation

younesbelkada (Contributor) commented Mar 29, 2023

Fixes #256

This PR fixes issues related to negative KL in the T5 sentiment example. The first fix concerns the sentiment script, which had been incorrectly ported.

Before this PR, the tokenizer's padding side was always hardcoded to left in _batched_generate. For encoder-decoder models, it should instead be left at the native padding_side (i.e. right for T5), since padding is applied to the encoder tokens and these models were trained with that specific padding side. I think the culprit is the way the relative positional attention bias is computed, which does not account for the starting position when padding_side=left: https://github.com/huggingface/transformers/blob/main/src/transformers/models/t5/modeling_t5.py#L436-L451

Thus, forcing padding_side=left for encoder-decoder models should probably be avoided, as most encoder-decoder models pad the encoder input on the right (to be verified).
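A minimal sketch of the padding-side handling after this fix (the attribute names mirror PPOTrainer, but this is an illustration of the idea rather than the exact diff in _batched_generate):

    # Only decoder-only models are forced to left padding; encoder-decoder
    # models keep the tokenizer's native padding side (right for T5).
    padding_side_default = self.tokenizer.padding_side
    if not self.is_encoder_decoder:
        # Decoder-only models must be left-padded so that generation
        # continues directly after the last real query token.
        self.tokenizer.padding_side = "left"
    # ... tokenize the queries and call generate() here ...
    self.tokenizer.padding_side = padding_side_default  # restore the native side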

To illustrate the fix:

You can see that using padding_side=left led to an unstable KL, whereas the proposed fix seems to lead to a smoother KL:

[Screenshot from 2023-03-29: KL curves with padding_side=left vs. the proposed fix]

cc @lvwerra

HuggingFaceDocBuilderDev commented Mar 29, 2023

The documentation is not available anymore as the PR was closed or merged.

if not self.is_encoder_decoder:
    # decoder-only: strip the left padding ((1 - mask).sum() counts pad tokens)
    output = generation[(1 - mask).sum() :]  # remove padding
else:
    # encoder-decoder: the decoder output contains no query padding
    output = generation
A Member commented on the snippet above:

maybe we could remove the special token here?

younesbelkada (Contributor, Author) replied:

I am not sure this is correct, as the token is removed right after, here: https://github.com/lvwerra/trl/blob/ed87942a47f26d15e823ca7674737be02e48cc0a/trl/trainer/ppo_trainer.py#L832
I also made a run with generation[1:], and the KL becomes negative: https://wandb.ai/younesbelkada/trl/runs/vjbydeqv - so I think we shouldn't remove the special token here.

Another commenter replied:

Isn't it suspicious that such a slight change breaks training? If not, why is that expected? I'm asking because I'm having trouble myself with my T5 model and negative KL divergence, even on the v0.4.1 release.

younesbelkada requested a review from lvwerra on Mar 30, 2023
younesbelkada merged commit 160d0c9 into huggingface:main on Apr 14, 2023
younesbelkada deleted the fix-t5-neg-kl branch on Apr 14, 2023
avacaondata commented:

With other encoder-decoder models, such as MarianMT models (BART architecture), I'm still experiencing negative KL; in fact, it becomes increasingly negative: -5, -19, -24.

Opdoop (Contributor) commented Apr 23, 2023

I'm experiencing the negative KL warning with Alpaca-7B on the sentiment script.

younesbelkada (Contributor, Author) replied:

Can you try non-batched generation, as suggested in #256 (comment), and let us know if that works?
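For reference, a non-batched loop in the style of the stock sentiment example looks roughly like this (ppo_trainer, query_tensors, and output_length_sampler are the names used in the TRL examples; treat it as a sketch):

    # Generate one response per query so no padding is involved at all.
    response_tensors = []
    for query in query_tensors:
        gen_len = output_length_sampler()  # sample a target response length
        generation_kwargs["max_new_tokens"] = gen_len
        response = ppo_trainer.generate(query, **generation_kwargs)
        # keep only the newly generated tokens, not the echoed query
        response_tensors.append(response.squeeze()[-gen_len:])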

lzy37ld commented Apr 26, 2023

In gpt2_sentiment.py, I tried modifying the generation_kwargs a bit, e.g. setting top_p to 0.9, and got a warning that my KL fell negative, possibly because the generation kwargs are not set correctly.

I wonder: are there any specific restrictions on the generation kwargs during PPO?

Gentle ping: @younesbelkada
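For context, the stock gpt2_sentiment.py samples from the full, untruncated distribution; restricting it (e.g. top_p=0.9) means the sampled tokens no longer follow the model's own distribution, which can bias the KL estimate negative. The settings used in the TRL sentiment example at the time looked like this:

    # Reference generation settings: everything that would truncate the
    # sampling distribution is disabled.
    generation_kwargs = {
        "min_length": -1,                        # do not force a minimum length
        "top_k": 0.0,                            # no top-k truncation
        "top_p": 1.0,                            # no nucleus (top-p) truncation
        "do_sample": True,                       # sample rather than decode greedily
        "pad_token_id": tokenizer.eos_token_id,  # GPT-2 has no pad token
    }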

Merging this pull request closes: t5-sentiment example collapses on master (#256)