Added error check to RLOO, PPOv2, OnlineDPO that ref_policy and policy have different identities #2057
Merged
qgallouedec merged 9 commits into huggingface:main from RylanSchaeffer:main on Sep 17, 2024
+119 -1
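The kind of check the title describes can be sketched as follows. This is a minimal illustration, not the code from this PR: `DummyModel` and `check_distinct_models` are hypothetical names, and the error message wording is an assumption. The idea is that if `ref_policy` and `policy` are the very same Python object, any update to the policy also changes the reference, so a comparison between the two (for example a KL penalty) would presumably carry no signal.

```python
import copy


class DummyModel:
    """Hypothetical stand-in for a model object, for illustration only."""

    def __init__(self):
        self.weights = [0.1, 0.2, 0.3]


def check_distinct_models(policy, ref_policy):
    """Raise if both arguments refer to the very same Python object."""
    # `is` compares object identity, not parameter values, so two
    # separate models with equal weights still pass this check.
    if ref_policy is policy:
        raise ValueError(
            "`policy` and `ref_policy` cannot be the same object; "
            "pass an independent copy as the reference model."
        )


policy = DummyModel()
ref_policy = copy.deepcopy(policy)  # independent copy with equal weights
check_distinct_models(policy, ref_policy)  # distinct objects: no error

try:
    check_distinct_models(policy, policy)  # same object: should raise
    same_object_rejected = False
except ValueError:
    same_object_rejected = True
```

Checking identity rather than equality is a deliberate choice in this sketch: a reference model normally starts with the same weights as the policy, so an equality test would reject valid setups.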
Commits
Commits on Sep 11, 2024
Commits on Sep 13, 2024
Commits on Sep 17, 2024