reward-models

Here are 7 public repositories matching this topic...

RLHFlow / RLHF-Reward-Modeling

Recipes for training reward models for RLHF.

llm rlhf reward-models llama3

Updated Jan 22, 2025
Python

jackaduma / Vicuna-LoRA-RLHF-PyTorch

A full pipeline to fine-tune the Vicuna LLM with LoRA and RLHF on consumer hardware. An implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the Vicuna architecture. Basically ChatGPT, but with Vicuna.

pytorch llama gpt lora finetune ppo peft vicuna llm chatgpt rlhf reward-models vicuna-7b

Updated May 20, 2024
Python

jackaduma / ChatGLM-LoRA-RLHF-PyTorch

A full pipeline to fine-tune the ChatGLM LLM with LoRA and RLHF on consumer hardware. An implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the ChatGLM architecture. Basically ChatGPT, but with ChatGLM.

pytorch llama gpt lora finetune ppo peft deepspeed llm chatgpt rlhf reward-models chatglm chatglm-6b

Updated Apr 28, 2023
Python

ExplainableML / ReNO

[NeurIPS 2024] ReNO: Enhancing One-step Text-to-Image Models through Reward-based Noise Optimization

text-to-image text-to-image-generation stable-diffusion reward-models

Updated Jan 27, 2025
Python

jackaduma / Alpaca-LoRA-RLHF-PyTorch

A full pipeline to fine-tune the Alpaca LLM with LoRA and RLHF on consumer hardware. An implementation of RLHF (Reinforcement Learning from Human Feedback) on top of the Alpaca architecture. Basically ChatGPT, but with Alpaca.