Seed is not applied for DPO recipes #2335

Closed
bogdansalyp opened this issue Feb 3, 2025 · 3 comments · Fixed by #2367
Labels: bug (Something isn't working), triaged (This issue has been assigned an owner and appropriate label)

Comments

@bogdansalyp (Contributor) commented Feb 3, 2025

TL;DR

Launching the same config twice with seed: 42 results in two different loss curves.

[Screenshot: loss curves diverging across two runs with seed: 42]

Affected recipes

full_dpo_distributed - seed is not set

(The full DPO recipe was added in #2275.)

[Screenshot: full_dpo_distributed loss curves differing across runs]

lora_dpo_distributed - seed is not set

[Screenshot: lora_dpo_distributed loss curves differing across runs]

Unaffected recipes

full_finetune_distributed - works fine

[Screenshot: full_finetune_distributed loss curves matching across runs]

lora_finetune_distributed - works fine

[Screenshot: lora_finetune_distributed loss curves matching across runs]
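
For reference, the unaffected recipes apply the seed once at recipe init. Below is a minimal sketch of that pattern, assuming torchtune's training.set_seed helper; the class name is hypothetical, and this illustrates the pattern rather than reproducing the recipes' exact code:

```python
from torchtune import training


class LoRADPORecipeDistributed:  # hypothetical name, for illustration only
    def __init__(self, cfg) -> None:
        # The unaffected finetune recipes seed the RNGs once here; the DPO
        # recipes skip this call, so a configured seed never takes effect.
        self.seed = training.set_seed(seed=cfg.seed)
```

Adding the equivalent call to the two DPO recipes is what would make seed: 42 meaningful there.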
@acisseJZhong (Contributor) commented:

Hi @bogdansalyp, could you share more about the run information, e.g. the config and run command?

@bogdansalyp (Contributor, Author) commented Feb 3, 2025

> Hi @bogdansalyp, could you share more about the run information, e.g. the config and run command?

Yes, it's just the standard llama3_1/8B_lora_dpo.yaml config, but with seed: 42 instead of seed: null.
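
For concreteness, the change relative to the stock config looks like this (illustrative excerpt; everything else in llama3_1/8B_lora_dpo.yaml is left at its defaults):

```yaml
# llama3_1/8B_lora_dpo.yaml (excerpt)
seed: 42  # was: seed: null
```

No run command was given in the thread; a typical invocation would be something like `tune run --nproc_per_node 2 lora_dpo_distributed --config llama3_1/8B_lora_dpo`, with the GPU count being an assumption here.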

I haven't tested other recipes yet.

Update: I've added the recipes I've tried to the description.

@bogdansalyp changed the title from "Seed is not applied to some recipes" to "Seed is not applied for DPO recipes" on Feb 3, 2025
@joecummings added the bug and triaged labels on Feb 4, 2025
@ebsmothers (Contributor) commented:

@bogdansalyp agree this looks a bit weird. One observation, though: the y-axis range in some of the plots is really small, so I wonder whether you also see variation of ~1e-3 in the unaffected recipes' losses (exact numerical parity is not achievable with bf16). For debugging, I would suggest inspecting two things: (1) are the model weights the same across runs? (2) are the samples seen the same across runs? If both (1) and (2) hold, my guess would be that it's just accumulated numerical error. For (1) in the LoRA DPO recipe especially: we don't load LoRA weights, so those are randomly initialized, and it's definitely worth checking whether they're identical across runs. (It's also possible that there's another source of randomness I haven't yet accounted for.)
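
A minimal sketch of checks (1) and (2), assuming each run dumps its initial model state_dict and first-batch token ids to files (the run_a/run_b paths and file names below are hypothetical):

```python
import torch

# (1) Are the model weights (including the randomly initialized LoRA params)
# identical across the two runs?
sd_a = torch.load("run_a/initial_state.pt", map_location="cpu")
sd_b = torch.load("run_b/initial_state.pt", map_location="cpu")
mismatched = [k for k in sd_a if not torch.equal(sd_a[k], sd_b[k])]
print("mismatched params:", mismatched or "none")

# (2) Are the samples seen the same? Save batch["tokens"] for the first few
# steps inside each run, then compare the dumps.
tokens_a = torch.load("run_a/first_batch_tokens.pt")
tokens_b = torch.load("run_b/first_batch_tokens.pt")
print("identical first batch:", torch.equal(tokens_a, tokens_b))
```

If the LoRA A/B matrices show up in the mismatch list, the missing seeding at recipe init is the likely culprit; if only the batches differ, the dataloader/sampler seeding is the place to look.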
