generated from fastai/nbdev_template
Issues: huggingface/trl
[Tracking issue] Integrate native liger-kernel losses
#2495
opened Dec 17, 2024 by
qgallouedec
Open
4
[Tracking issue] Wrong loss scaling when accumulating gradient
#2617
opened Jan 23, 2025 by
qgallouedec
Open
Issues list
Training with grpo required 20 mins for single step
✨ enhancement
New feature or request
#2751
opened Feb 3, 2025 by
imrankh46
GRPOTrainer with Deepspeed: Getting device mismatch error
🐛 bug
Something isn't working
🚀 deepspeed
Related to deepspeed
🏋 GRPO
Related to GRPO
#2745
opened Feb 3, 2025 by
3rdAT
5 tasks done
GRPOTrainer should offer separate LLMs for advantage estimation and training
✨ enhancement
New feature or request
🏋 GRPO
Related to GRPO
#2744
opened Feb 2, 2025 by
hallerite
feat(GRPOTrainer): reward_func return None to skip
✨ enhancement
New feature or request
🏋 GRPO
Related to GRPO
#2737
opened Feb 2, 2025 by
ctjlewis
PLZ make padding_free for DataCollatorForChatML.
✨ enhancement
New feature or request
🏋 GKD
Related to GKD
🙋 help from community wanted
Open invitation for community members to contribute
#2736
opened Feb 2, 2025 by
YooSungHyun
SFTvsRL SFT Memorizes, RL Generalizes
✨ enhancement
New feature or request
#2735
opened Feb 2, 2025 by
NickyDark1
GRPO Trainer supports VLMs
✨ enhancement
New feature or request
🏋 GRPO
Related to GRPO
#2734
opened Feb 2, 2025 by
sunildkumar
GKD Example why do not use labels?
🏋 GKD
Related to GKD
❓ question
Seeking clarification or more information
#2732
opened Feb 2, 2025 by
YooSungHyun
5 tasks done
Latest TRL code = significantly worse rewards for GRPO training
🐛 bug
Something isn't working
🏋 GRPO
Related to GRPO
#2731
opened Feb 2, 2025 by
abacaj
5 tasks done
OOM for 7B model on A100 80Gb
🐛 bug
Something isn't working
#2719
opened Jan 31, 2025 by
JohnConnor123
5 tasks done
AttributeError: 'AutoModelForCausalLMWithValueHead' object has no attribute 'base_model_prefix'
🐛 bug
Something isn't working
⚡ PEFT
Related to PEFT
🏋 PPO
Related to PPO
#2718
opened Jan 31, 2025 by
Tarak200
GRPO for RL on agent trajectories
🏋 GRPO
Related to GRPO
🏋 Reward
Related to Reward modelling
#2715
opened Jan 31, 2025 by
korbinian-hoermann
Isn't the reward *minimized* when len(completion)==20 if this is the reward function?
🏋 Reward
Related to Reward modelling
#2714
opened Jan 31, 2025 by
cfpark00
GRPO with tool calling
🏋 GRPO
Related to GRPO
🏋 Reward
Related to Reward modelling
#2712
opened Jan 31, 2025 by
accupham
3 tasks
LoRA 'trainable params: 0'
🐛 bug
Something isn't working
⚡ PEFT
Related to PEFT
#2711
opened Jan 31, 2025 by
shannonruxin
Examples in training VDPO on llava1.6
🏋 DPO
Related to DPO
✨ enhancement
New feature or request
#2710
opened Jan 31, 2025 by
lucasjinreal
PPOTrainer + LoRA and Continued Training
⏳ needs more info
Additional information or clarification is required to proceed
⚡ PEFT
Related to PEFT
🏋 PPO
Related to PPO
#2707
opened Jan 30, 2025 by
kooryan
Multi-GPU sampling for vLLM in GRPO Trainer
✨ enhancement
New feature or request
🏋 GRPO
Related to GRPO
#2706
opened Jan 30, 2025 by
nch0w
GRPO: Why does loss start at 0 for first K steps and then increase over time?
🏋 GRPO
Related to GRPO
❓ question
Seeking clarification or more information
#2703
opened Jan 30, 2025 by
arnavgarg1
5 tasks done
Exposing GenerationConfig in the GRPO Trainer
✨ enhancement
New feature or request
🏋 GRPO
Related to GRPO
#2702
opened Jan 30, 2025 by
Superskyyy
Allow pretokenized dataset in GRPO Trainer
✨ enhancement
New feature or request
🏋 GRPO
Related to GRPO
#2701
opened Jan 30, 2025 by
Superskyyy
GRPO VLLM does not work with Lora
🏋 GRPO
Related to GRPO
⚡ PEFT
Related to PEFT
#2698
opened Jan 30, 2025 by
gagan3012
5 tasks done
I cannot launch PPOTrainning script with accelerate launch
⚡accelerate
Related to accelerate
⚡ PEFT
Related to PEFT
🏋 PPO
Related to PPO
#2696
opened Jan 30, 2025 by
daehuikim
5 tasks done