[KTO] fix interleaving, reporting, hanging bugs #1499
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@kawine are you disabling WANDB on your side because of some local issue?
just removed the disabling of wandb
@kashif I reverted the changes to the tests -- are we good to merge?
@PhilipMay apologies! Yes, it's my bad, I believe... let me see how to attribute Clara properly.
@kashif I am mostly fine with the version implemented here now.
* add warning for imbalanced data
* update documentation
* update script commands to be same as in dpo
* use batch_size KL examples and batch_size target examples to calculate batch_size losses
* fix deepspeed issue
* speed up forward with no_grad for KL
* add some removed metrics
* Update trl/trainer/kto_trainer.py
* Update trl/trainer/kto_trainer.py
* Update trl/trainer/kto_trainer.py: add reference to paper (Co-authored-by: lewtun <[email protected]>)
* Update trl/trainer/kto_trainer.py (Co-authored-by: Kashif Rasul <[email protected]>)
* Update trl/trainer/kto_trainer.py (Co-authored-by: Kashif Rasul <[email protected]>)
* Update trl/trainer/kto_trainer.py (Co-authored-by: Kashif Rasul <[email protected]>)
* Update trl/trainer/kto_trainer.py (Co-authored-by: Kashif Rasul <[email protected]>)
* Update trl/trainer/kto_trainer.py (Co-authored-by: Kashif Rasul <[email protected]>)
* Update trl/trainer/kto_trainer.py (Co-authored-by: Kashif Rasul <[email protected]>)
* Update trl/trainer/kto_trainer.py (Co-authored-by: Kashif Rasul <[email protected]>)
* Update trl/trainer/kto_trainer.py (Co-authored-by: Kashif Rasul <[email protected]>)
* Update trl/trainer/kto_trainer.py (Co-authored-by: Kashif Rasul <[email protected]>)
* Update trl/trainer/kto_trainer.py (Co-authored-by: Kashif Rasul <[email protected]>)
* Update trl/trainer/kto_trainer.py (Co-authored-by: Kashif Rasul <[email protected]>)
* add more detailed comments
* convert assert to ValueError
* Update kto_trainer.py
* precommit formatting
* remove nans in metrics by gathering across machines
* fix formatting
* fix choice of mismatched examples for KL term
* describe weights
* fix hanging issue in distributed training
* linting
* move metrics to cpu
* Update trl/trainer/kto_trainer.py (Co-authored-by: lewtun <[email protected]>)
* Update trl/trainer/kto_trainer.py
* Update trl/trainer/kto_trainer.py
* fix tokenization error: lack of bos
* change user warning for weight hyperparams
* minor update to docs
* reshape attention mask
* reformat
* add test for bos/eos tokens
* move dependency location
* Update tests/test_kto_trainer.py
* don't report nan metrics
* don't report nan metrics and remove data interleaving
* fix bugs in calculating metrics
* no need to gather KL term
* minor changes
* use nanmean for losses
* remove disabling of wandb
* revert changes

---------

Co-authored-by: Clara Luise Pohland <[email protected]>
Co-authored-by: Kashif Rasul <[email protected]>
Co-authored-by: lewtun <[email protected]>
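Two of the items above, "speed up forward with no_grad for KL" and "use nanmean for losses", boil down to a small pattern. The sketch below is illustrative only and is not the `KTOTrainer` code: the helper names (`sequence_logps`, `kto_losses`), the batch dictionaries, and the restriction to the desirable-example branch of the KTO loss are assumptions made for the example.

```python
import torch


def sequence_logps(model, input_ids, attention_mask, labels):
    """Sum of log-probs of the completion tokens (labels != -100).

    Illustrative helper; the real trainer has its own forward/log-prob code.
    """
    logits = model(input_ids=input_ids, attention_mask=attention_mask).logits
    # Shift so that logits at position t predict the token at position t + 1.
    logits, labels = logits[:, :-1], labels[:, 1:]
    mask = labels != -100
    per_token = (
        torch.log_softmax(logits, dim=-1)
        .gather(2, labels.clamp(min=0).unsqueeze(-1))
        .squeeze(-1)
    )
    return (per_token * mask).sum(-1)


def kto_losses(policy, reference, target_batch, kl_batch, beta=0.1):
    """Desirable-example branch only; target_batch / kl_batch are dicts with
    input_ids, attention_mask, labels (hypothetical inputs)."""
    policy_logps = sequence_logps(policy, **target_batch)  # needs gradients

    with torch.no_grad():
        ref_logps = sequence_logps(reference, **target_batch)
        # The mismatched (KL) completions only provide a reference point, so
        # their forward passes can skip autograd entirely, which is what
        # "speed up forward with no_grad for KL" refers to.
        kl = (
            sequence_logps(policy, **kl_batch)
            - sequence_logps(reference, **kl_batch)
        ).mean().clamp(min=0)

    rewards = beta * (policy_logps - ref_logps)
    losses = 1 - torch.sigmoid(rewards - beta * kl)

    # nanmean ignores NaN entries (e.g. a step that contributes no examples of
    # this type) instead of propagating NaN into the optimizer step and the
    # logged metrics ("use nanmean for losses", "don't report nan metrics").
    return torch.nanmean(losses)
```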
@PhilipMay apologies for the confusion! I had an earlier PR #1476 that had to be closed because it conflicted with the proposed changes in #1472, but #1472 had some separate issues with logging, which is why I created this PR (#1499). Users were actively blocked because the code was hanging, so fixing that ASAP was a high priority. Clara's contributions are very much appreciated, and I'm happy to defer to you folks on crediting everyone properly.
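For context on the hang referenced here (the "remove nans in metrics by gathering across machines" and "fix hanging issue in distributed training" items in the changelog): the usual failure mode is that a collective op such as `gather` only runs on the ranks that happen to have examples of a given type, and the remaining ranks block forever. Below is a minimal sketch of the matched-collective pattern, assuming `accelerate`; `gathered_nanmean` is a hypothetical helper, not the trainer's actual API.

```python
import torch
from accelerate import Accelerator


def gathered_nanmean(accelerator: Accelerator, local_value: torch.Tensor) -> float:
    """Every rank contributes exactly one scalar, NaN when it has nothing to
    report, so the gather is matched on all processes and cannot hang; the
    NaN placeholders are then ignored by nanmean when producing the logged
    metric."""
    gathered = accelerator.gather(local_value.detach().float().reshape(1))
    return torch.nanmean(gathered).cpu().item()


# Usage sketch: a rank with no "chosen" examples still calls the helper, just
# with a NaN placeholder, so no other rank is left waiting on the collective.
accelerator = Accelerator()
chosen_rewards = torch.tensor([], device=accelerator.device)  # empty on this rank
local = (
    chosen_rewards.mean()
    if chosen_rewards.numel()
    else torch.tensor(float("nan"), device=accelerator.device)
)
metric = gathered_nanmean(accelerator, local)
```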
cc @kashif @lewtun