fix padding in dpo trainer #1284

pacman100 · 2024-01-27T19:49:00Z

What does this PR do?

The padding value in the DPO trainer is wrongly set to 0 even when a tokenizer is passed and that tokenizer uses a different token ID for padding. This happens because the conditional checks whether the padding value is None, but since the default is 0, it is never None and the tokenizer's pad token ID is never used.
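To illustrate the failure mode described above, here is a minimal, self-contained sketch. The function names `resolve_padding_buggy` and `resolve_padding_fixed` and the stand-in tokenizer object are hypothetical and are not taken from the TRL code base; the "fixed" variant shows one plausible way to resolve the default and is not necessarily the exact change made in this PR.

```python
from types import SimpleNamespace
from typing import Optional


def resolve_padding_buggy(padding_value: int = 0, tokenizer=None) -> int:
    """Hypothetical reproduction of the bug: the default is 0, never None."""
    # This branch is unreachable when the caller relies on the default,
    # because the default value 0 is not None.
    if padding_value is None and tokenizer is not None:
        return tokenizer.pad_token_id
    return padding_value


def resolve_padding_fixed(padding_value: Optional[int] = None, tokenizer=None) -> int:
    """Hypothetical fix: default to None so the tokenizer can supply the value."""
    # An explicit value from the caller always wins.
    if padding_value is not None:
        return padding_value
    # Otherwise fall back to the tokenizer's pad token ID, if it has one.
    if tokenizer is not None and tokenizer.pad_token_id is not None:
        return tokenizer.pad_token_id
    # Last resort when neither source provides a value.
    return 0


# Stand-in for a tokenizer whose pad token ID is not 0 (assumed value).
tok = SimpleNamespace(pad_token_id=32000)

buggy = resolve_padding_buggy(tokenizer=tok)
fixed = resolve_padding_fixed(tokenizer=tok)
print(buggy, fixed)
```

With the buggy resolver the tokenizer's pad token ID is silently ignored, while the `None` default lets it take effect unless the caller overrides it explicitly.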

pacman100 · 2024-01-27T19:51:44Z

HuggingFaceDocBuilderDev · 2024-01-27T19:53:04Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

younesbelkada

Thanks, in principle this looks good. I would love a second look from @kashif though - let me know what you think, and once that's good for you, feel free to merge!

kashif · 2024-01-30T07:24:28Z

great catch thanks!

fix padding in dpo trainer

b7aa293

younesbelkada approved these changes Jan 29, 2024

younesbelkada requested a review from kashif January 30, 2024 01:59

kashif approved these changes Jan 30, 2024

kashif merged commit 9186710 into huggingface:main Jan 30, 2024
6 of 9 checks passed

lapp0 pushed a commit to lapp0/trl that referenced this pull request May 10, 2024

fix padding in dpo trainer (huggingface#1284)

12f1605
