Added error check to RLOO, PPOv2, OnlineDPO that ref_policy and policy have different identities #4236
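
As a rough illustration of the kind of check the title describes, here is a minimal sketch. It assumes the check is a plain object-identity comparison that raises a `ValueError`. The helper name `check_distinct_policies`, the error message, and the `DummyModel` stand-in are hypothetical and not taken from the PR; the actual trainers may implement this differently.

```python
def check_distinct_policies(policy, ref_policy):
    """Raise if `policy` and `ref_policy` are the very same object.

    Hypothetical helper for illustration only; not the PR's actual code.
    """
    # `is` compares object identity, not parameter values, so two separately
    # loaded copies of the same checkpoint pass this check.
    if policy is ref_policy:
        raise ValueError(
            "`policy` and `ref_policy` cannot be the same object. "
            "Pass a separate copy of the model as `ref_policy`."
        )


class DummyModel:
    """Stand-in for a model class (assumption, used only for this demo)."""


policy = DummyModel()
check_distinct_policies(policy, DummyModel())  # distinct objects: passes

try:
    check_distinct_policies(policy, policy)  # same object: raises
except ValueError as err:
    print(f"rejected: {err}")
```

The point of such a guard is that if both arguments refer to one object, updating the policy also updates the reference, so the reference no longer serves as a fixed baseline.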
