diff --git a/README.md b/README.md
index 5b03d56a5d..4fa0849ca2 100644
--- a/README.md
+++ b/README.md
@@ -3,7 +3,7 @@
 
 ## What is it?
-With `trl` you can train transformer language models with Proximal Policy Optimization (PPO). The library is built on top of the [`transformer`](https://github.com/huggingface/transformers) library by 🤗 Hugging Face. Therefore, pre-trained language models can be directly loaded via `transformers`. At this point only decoder architectures such as GPT2 are implemented.
+With `trl` you can train transformer language models with Proximal Policy Optimization (PPO). The library is built on top of the [`transformers`](https://github.com/huggingface/transformers) library by 🤗 Hugging Face. Therefore, pre-trained language models can be directly loaded via `transformers`. At this point only decoder architectures such as GPT2 are implemented.
 
 **Highlights:**
 - PPOTrainer: A PPO trainer for language models that just needs (query, response, reward) triplets to optimise the language model.