adds early stopping #238

edbeeching · 2023-03-21T15:06:57Z

Adds early stopping to the PPO loop. Fixes #232

I used a KL threshold of 0.1 because I observed an initial spike of 0.2 before we hit instabilities with gpt2-xl earlier this week.

Note that RL4LM uses a value of 0.5.

My only other concern is gradient accumulation. I am not sure whether it is better to zero the gradients as soon as we see a spike in KL or to leave them as is. I have zeroed them for now, but it would be great to have feedback on this.
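To make the gradient-accumulation question concrete, here is a minimal sketch in plain Python. It is not the code from this PR: `run_ppo_epoch`, `target_kl`, and `accumulation_steps` are hypothetical names, and gradients and KL estimates are modelled as plain floats so the control flow is easy to follow. It illustrates the "zero them" option, where partially accumulated gradients are discarded once the KL estimate exceeds the threshold.

```python
from typing import Iterable, List, Tuple


def run_ppo_epoch(
    minibatches: Iterable[Tuple[float, float]],
    target_kl: float = 0.1,       # hypothetical threshold name; 0.1 is the value discussed above
    accumulation_steps: int = 2,  # hypothetical: optimizer step every N minibatches
) -> Tuple[List[float], bool]:
    """Toy PPO inner loop with KL-based early stopping.

    Each minibatch is a (gradient, policy_kl) pair of floats standing in
    for real tensors. Returns the list of applied (accumulated) updates
    and a flag saying whether the loop stopped early.
    """
    accumulated = 0.0
    applied: List[float] = []
    for step, (grad, policy_kl) in enumerate(minibatches, start=1):
        if policy_kl > target_kl:
            # KL spike: discard any partially accumulated gradient
            # (the "zero the gradients" choice) and leave the loop.
            accumulated = 0.0
            return applied, True
        accumulated += grad
        if step % accumulation_steps == 0:
            # Stand-in for optimizer.step() followed by zeroing gradients.
            applied.append(accumulated)
            accumulated = 0.0
    return applied, False


if __name__ == "__main__":
    batches = [(1.0, 0.01), (2.0, 0.02), (3.0, 0.03), (4.0, 0.2), (5.0, 0.01)]
    updates, stopped = run_ppo_epoch(batches)
    print(updates, stopped)
```

In this toy run the third minibatch's gradient is accumulated but never applied, because the spike on the fourth minibatch clears it. The alternative raised above, leaving the gradients as is, would carry that partial gradient into the next optimizer step.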

HuggingFaceDocBuilderDev · 2023-03-21T15:11:23Z

The documentation is not available anymore as the PR was closed or merged.

lvwerra

Generally looks nice and clean to me! My main worry is that this breaks logging: if we stop early, the logged values might not all have the same shape. Did you double-check that? If not, could you run a quick test?

trl/trainer/ppo_config.py

edbeeching · 2023-03-22T13:32:58Z

Good point. I ran a benchmark with early stopping enabled and a threshold of 0.001 here, and the logging seems to work OK.
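As an illustration of the shape concern, the sketch below shows one way per-step statistics can be reduced so that the logged record has the same keys whether or not the loop stopped early. This is an assumption about how such a check could look, not the trainer's actual logging code; `aggregate_stats` and the stat keys are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def aggregate_stats(per_step_stats: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each logged key over however many steps actually ran.

    Reducing to one scalar per key means the output does not depend on
    the number of steps, so an early-stopped run logs the same keys as
    a full run.
    """
    if not per_step_stats:
        # No step completed (e.g. stopped on the very first minibatch).
        return {}
    keys = per_step_stats[0].keys()
    return {key: mean(step[key] for step in per_step_stats) for key in keys}


if __name__ == "__main__":
    full_run = [{"policykl": 0.01, "loss": 1.0}, {"policykl": 0.03, "loss": 3.0}]
    early_stopped_run = full_run[:1]  # loop ended after the first step
    print(aggregate_stats(full_run))
    print(aggregate_stats(early_stopped_run))
```

Setting a very low threshold, as in the benchmark above, forces the early-stop path on almost every batch, which is a convenient way to exercise this kind of check.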

edbeeching requested a review from lvwerra March 21, 2023 15:06

lvwerra reviewed Mar 22, 2023

edbeeching added 5 commits March 22, 2023 14:33

adds early stopping

65c0be2

zero opt grad

69db3b8

style

11d4b0d

Fixed typo in early stopping property description

2fcc33b

Auto stash before rebase of "origin/main"

368044f

edbeeching force-pushed the early-stopping branch from f4a1e3c to 368044f Compare March 22, 2023 13:34

lvwerra merged commit 1620da3 into main Mar 23, 2023

lvwerra deleted the early-stopping branch March 23, 2023 14:24

GauravVirmani mentioned this pull request Mar 23, 2023

PPO config __init__ is bloated #241

Merged
