
[core] Officially Support Reward Modeling #303

Merged: 23 commits merged into huggingface:main from the add-reward-trainer branch on Apr 26, 2023

Conversation

younesbelkada (Contributor) commented Apr 14, 2023

What does this PR do?

With reward modeling being an important piece of the PPO algorithm, it would be cool to support an "official" RewardTrainer in trl.

The RewardTrainer simply inherits from transformers.Trainer, but with some constraints. Users are responsible for creating a paired dataset that contains input_ids_j, input_ids_k, attention_mask_j, and attention_mask_k if they want to use the default RewardDataCollatorWithPadding data collator.
I also propose adding the possibility to create the PEFT model under the hood if a user passes a PeftConfig to the Trainer.
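
For illustration, a rough end-to-end sketch of that workflow (the column names come from the description above; the peft_config keyword, the model choice, and the training arguments are only assumptions about how the final API might look):

    # Illustrative sketch only; keyword names such as `peft_config` are assumptions.
    from datasets import Dataset
    from peft import LoraConfig
    from transformers import AutoModelForSequenceClassification, AutoTokenizer, TrainingArguments
    from trl import RewardTrainer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    tokenizer.pad_token = tokenizer.eos_token
    model = AutoModelForSequenceClassification.from_pretrained("gpt2", num_labels=1)
    model.config.pad_token_id = tokenizer.pad_token_id

    # One paired example; "_j" is assumed to hold the preferred completion, "_k" the rejected one.
    chosen = tokenizer("A polite, helpful answer.")
    rejected = tokenizer("A dismissive answer.")
    train_dataset = Dataset.from_dict(
        {
            "input_ids_j": [chosen["input_ids"]],
            "attention_mask_j": [chosen["attention_mask"]],
            "input_ids_k": [rejected["input_ids"]],
            "attention_mask_k": [rejected["attention_mask"]],
        }
    )

    trainer = RewardTrainer(
        model=model,
        # remove_unused_columns=False keeps the paired columns that the data collator needs.
        args=TrainingArguments(output_dir="reward_model", remove_unused_columns=False),
        tokenizer=tokenizer,
        train_dataset=train_dataset,
        peft_config=LoraConfig(task_type="SEQ_CLS"),  # PEFT model created under the hood
    )
    trainer.train()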

This PR adds a first version of it, along with tests and documentation.

TODO: update the README & reward_trainer.mdx file

cc @lvwerra

- add working version
- add all possible tests
- add docs
HuggingFaceDocBuilderDev commented Apr 14, 2023

The documentation is not available anymore as the PR was closed or merged.

younesbelkada requested a review from lvwerra on April 14, 2023 10:57
lvwerra (Member) left a comment

Generally looks really good and clean to me. Left a few comments to try to make it a bit more user-friendly.

lvwerra (Member) commented Apr 17, 2023

Maybe @lewtun would also be interested in taking a look to see if there is feedback from the H4 team.

younesbelkada requested a review from lvwerra on April 17, 2023 16:58
lewtun (Member) left a comment

Awesome feature @younesbelkada 🔥

I left some tiny nits and a feature request to compute accuracy by default :)


## Using the `RewardTrainer`

After standardizing your dataset, you can use the `RewardTrainer` as a classic Hugging Face Trainer.
Member

Maybe explain what format the raw dataset should have here? E.g. you could use samples from the StackExchange or Anthropic datasets (https://huggingface.co/datasets/Anthropic/hh-rlhf) as a guide.
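
For reference, a rough sketch (not from the PR) of one way to map Anthropic/hh-rlhf, whose rows expose raw `chosen` and `rejected` strings, into the paired columns expected by RewardDataCollatorWithPadding:

    # Illustrative preprocessing sketch; the column naming follows the PR description.
    from datasets import load_dataset
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    tokenizer.pad_token = tokenizer.eos_token

    def to_paired_features(examples):
        chosen = tokenizer(examples["chosen"], truncation=True)
        rejected = tokenizer(examples["rejected"], truncation=True)
        return {
            "input_ids_j": chosen["input_ids"],
            "attention_mask_j": chosen["attention_mask"],
            "input_ids_k": rejected["input_ids"],
            "attention_mask_k": rejected["attention_mask"],
        }

    dataset = load_dataset("Anthropic/hh-rlhf", split="train")
    dataset = dataset.map(to_paired_features, batched=True, remove_columns=dataset.column_names)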

younesbelkada (author)

I added a few lines in 4bcd96e, but I'm not sure what I said is 100% correct; I'd love a second look here!

younesbelkada (author) commented Apr 19, 2023

Will also add an example now. EDIT: added it.

from peft import get_peft_model


class RewardTrainer(Trainer):
Member

Since accuracy is the most common metric for evaluating reward models, would it make sense to provide it as a default in compute_metrics? E.g. something like this should work:

    import numpy as np
    import evaluate  # the extra dependency discussed below

    accuracy = evaluate.load("accuracy")

    def compute_metrics(eval_pred):
        predictions, _ = eval_pred
        # Here, predictions is rewards_chosen and rewards_rejected.
        # We want to see how often rewards_chosen > rewards_rejected.
        predictions = np.argmax(predictions, axis=0)
        labels = np.zeros(predictions.shape)
        return accuracy.compute(predictions=predictions, references=labels)

younesbelkada (author) commented Apr 18, 2023

This would add evaluate as an additional dependency to the library; we could also have it as an optional dependency, similar to peft! For me it's totally fine to have it as a core dependency, but I want to hear @lvwerra's opinion to make sure we are aligned on this.

Member

Ah true, I think having it as an optional dep would be the way to go (unless evaluate is already so light that its deps are covered by the trl core deps).

Member

Accuracy is not a very hard metric, maybe we can just build it from scratch here :)
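
For instance, a dependency-free version could mirror the logic of the snippet above (a sketch, not the PR's final code):

    import numpy as np

    def compute_accuracy(eval_pred):
        predictions, _ = eval_pred
        # predictions stacks rewards_chosen and rewards_rejected; the pair is
        # "correct" when rewards_chosen > rewards_rejected, i.e. argmax == 0.
        predictions = np.argmax(predictions, axis=0)
        return {"accuracy": float(np.mean(predictions == 0))}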

lvwerra (Member) left a comment

Just a few small comments, otherwise this is good to go!

Comment on lines 85 to 86
compute_metrics (`Callable[[transformers.EvalPrediction], Dict]`):
The metrics to use for evaluation.
Member

Default is accuracy.

eval_dataset,
tokenizer,
model_init,
compute_metrics,
Member

I am not sure this works: if we overwrite the class method with our own metric, the compute_metrics in the parent class is never used, no?

What about defining a compute_accuracy function outside the class and passing it as the default when compute_metrics from the init is None?

younesbelkada (author)

Sounds like a great plan!
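
A minimal sketch of that plan (illustrative only; the exact signature in the merged code may differ):

    import numpy as np
    from transformers import Trainer

    def compute_accuracy(eval_pred):
        # Module-level default metric, along the lines sketched earlier.
        predictions, _ = eval_pred
        predictions = np.argmax(predictions, axis=0)
        return {"accuracy": float(np.mean(predictions == 0))}

    class RewardTrainer(Trainer):
        def __init__(self, *args, compute_metrics=None, **kwargs):
            # Only fall back to the default when the user passes nothing.
            if compute_metrics is None:
                compute_metrics = compute_accuracy
            super().__init__(*args, compute_metrics=compute_metrics, **kwargs)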

younesbelkada requested a review from lvwerra on April 26, 2023 09:42
younesbelkada merged commit 3cfe194 into huggingface:main on Apr 26, 2023
younesbelkada deleted the add-reward-trainer branch on April 26, 2023 09:52