Fix PPOTrainer README example #1441

nikihowe · 2024-03-19T02:41:35Z

The current PPOTrainer example in the main README fails to run for two reasons: 1) the pad token, and 2) the relationship between batch size, mini-batch size, and gradient accumulation steps.

This PR addresses these two issues :).

Fixes: #1442
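To make the second point concrete, here is a minimal sketch of the kind of consistency check involved. It assumes the rule is that the batch size must be a whole multiple of `mini_batch_size * gradient_accumulation_steps`; that rule is my reading of the constraint, not something stated in this thread. The helper `check_ppo_batch_sizes` is hypothetical and not part of TRL.

```python
# Hypothetical helper, NOT part of TRL. It illustrates the assumed rule that
# batch_size must be a whole multiple of
# mini_batch_size * gradient_accumulation_steps.
def check_ppo_batch_sizes(batch_size, mini_batch_size, gradient_accumulation_steps=1):
    """Return the number of optimizer steps per PPO batch, or raise ValueError."""
    if batch_size <= 0 or mini_batch_size <= 0 or gradient_accumulation_steps <= 0:
        raise ValueError("all sizes must be positive integers")

    # Samples consumed before each optimizer step.
    backward_batch_size = mini_batch_size * gradient_accumulation_steps

    if batch_size % backward_batch_size != 0:
        raise ValueError(
            f"batch_size={batch_size} is not a multiple of "
            f"mini_batch_size * gradient_accumulation_steps={backward_batch_size}"
        )
    return batch_size // backward_batch_size


# A batch size of 1 only passes if the mini-batch size is lowered to match.
print(check_ppo_batch_sizes(batch_size=1, mini_batch_size=1))  # 1
```

Under this assumption, shrinking only the batch size while leaving a larger mini-batch size in place is rejected. For the pad-token half, a common workaround for tokenizers that ship without a pad token is to reuse the end-of-sequence token (`tokenizer.pad_token = tokenizer.eos_token`); whether this PR does exactly that is not stated here.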

younesbelkada

Thanks for fixing the PPOTrainer README example! 🚀

HuggingFaceDocBuilderDev · 2024-03-19T09:21:07Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

* Fix example
* Delete newline

nikihowe added 2 commits March 18, 2024 22:39

Fix example

3de4677

Delete newline

82e1849

younesbelkada approved these changes Mar 19, 2024


younesbelkada merged commit abc7301 into huggingface:main Mar 19, 2024
1 check passed

nikihowe mentioned this pull request Mar 20, 2024

This PPOTrainer example code fails to run #1442

Closed

lapp0 pushed a commit to lapp0/trl that referenced this pull request May 10, 2024

Fix PPOTrainer README example (huggingface#1441)

1a9e71f

* Fix example
* Delete newline
