🤖 Properly unwrap torch.compile-ed models in GRPO #2750
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
That's a very good point, thanks for pointing it out! Quick follow-up for my knowledge: I could not find it, because to the best of my knowledge we could do something like
Force-pushed from 071b10a to 302640a
@qgallouedec I rebased this, so the merge conflict should be resolved. Thanks!
Thanks @winglian, I've added a test, and made sure that it's also compatible with reward models. Can be merged once the CI is green :)
Force-pushed from eaf5370 to 9684437
What does this PR do?
When using torch.compile, there is one more wrapper layer to unwrap before we can send the state dict to vLLM.
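As a minimal sketch of the extra layer in question: `torch.compile` wraps an `nn.Module` in an `OptimizedModule` that keeps the original module on a `_orig_mod` attribute, so the compiled object's state-dict keys no longer line up with the plain model's. The `getattr` pattern below is one common, hedged way to unwrap it (the exact unwrapping done in this PR may differ):

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 4)
compiled = torch.compile(model)

# torch.compile returns an OptimizedModule wrapper; the original
# module lives on its `_orig_mod` attribute. Unwrap defensively so
# this also works if the model was never compiled.
unwrapped = getattr(compiled, "_orig_mod", compiled)

assert unwrapped is model
# `unwrapped.state_dict()` now has the plain parameter names
# (e.g. "weight", "bias") expected by the weight-loading side.
```

Note the fallback to `compiled` itself: uncompiled models pass through unchanged, so the same code path can be used whether or not `torch.compile` was enabled.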
Fixes # (issue)
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.