🤖 Properly unwrap torch.compile-ed models in GRPO #2750

winglian · 2025-02-03T07:31:51Z

What does this PR do?

When using torch.compile, there is one more layer to unwrap before we can send the state dict to vLLM.
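For context, a minimal, dependency-free sketch of the extra layer (an illustration, not this PR's actual diff). It assumes the wrapper returned by torch.compile keeps the original module under the `_orig_mod` attribute and that the wrapper's state dict keys carry an `_orig_mod.` prefix. The helper names `unwrap_compiled` and `strip_compile_prefix` are hypothetical.

```python
from typing import Any, Dict

# Attribute under which torch.compile's wrapper stores the original module;
# the same string, followed by a dot, prefixes the wrapper's state dict keys.
COMPILE_ATTR = "_orig_mod"
COMPILE_PREFIX = COMPILE_ATTR + "."


def unwrap_compiled(model: Any) -> Any:
    """Return the original module if `model` looks like a compiled wrapper.

    Hypothetical helper: objects without `_orig_mod` are returned unchanged.
    """
    return getattr(model, COMPILE_ATTR, model)


def strip_compile_prefix(state_dict: Dict[str, Any]) -> Dict[str, Any]:
    """Drop a leading `_orig_mod.` from every key that has it.

    Hypothetical helper: keys without the prefix are kept as they are.
    """
    return {
        (key[len(COMPILE_PREFIX):] if key.startswith(COMPILE_PREFIX) else key): value
        for key, value in state_dict.items()
    }
```

With a real model, calling `unwrap_compiled(model).state_dict()` would be expected to give parameter names without the prefix, which is the form a separate inference engine loads.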

Fixes # (issue)

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline,
Pull Request section?
Was this discussed/approved via a GitHub issue? Please add a link
to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the
documentation guidelines.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

HuggingFaceDocBuilderDev · 2025-02-03T09:06:27Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

shirinyamani · 2025-02-04T00:32:17Z

That's a very good point, thanks for pointing it out! Quick follow-up for my own understanding: I could not find torch.compile() in the main grpo_trainer. Could you help me understand which part of the code enables a memory-hierarchy-aware optimization like torch.compile?

As far as I know, we could do something like model = torch.compile(model), but of course we would have to make sure it is compatible with the rest of the computation.

@winglian @kashif

winglian · 2025-02-04T13:50:55Z

The base TrainingArguments from transformers includes a torch_compile option, so you can simply set that on GRPOConfig.
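A minimal config sketch of that suggestion, assuming `trl` is installed and that `GRPOConfig` accepts the `torch_compile` field it inherits from `TrainingArguments`. The `output_dir` value is a placeholder, not something taken from this PR.

```python
from trl import GRPOConfig

# `torch_compile` comes from transformers.TrainingArguments, which
# GRPOConfig subclasses. `output_dir` is a placeholder path.
training_args = GRPOConfig(
    output_dir="grpo-compiled",
    torch_compile=True,
)
```

The trainer itself never needs an explicit torch.compile() call in this setup, which would explain why none shows up in grpo_trainer.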

winglian · 2025-02-04T13:51:50Z

@qgallouedec I rebased this so the merge conflict should be resolved. thanks!

winglian · 2025-02-04T16:50:49Z

I could also move this into the unwrap_model_for_generation function, but I'm not 100% sure about the DeepSpeed behavior.

qgallouedec

Thanks @winglian, I've added a test, and made sure that it's also compatible with reward models. Can be merged once the CI is green :)

kashif approved these changes Feb 3, 2025

properly unwrap torch.compile-ed models with GRPO

302640a

winglian force-pushed the grpo-torch-compile branch from 071b10a to 302640a Compare February 4, 2025 13:48

kashif approved these changes Feb 4, 2025

add test and compat with reward models

9d571bf

qgallouedec approved these changes Feb 4, 2025

qgallouedec changed the title ~~properly unwrap torch.compile-ed models with GRPO~~ 🤖 Properly unwrap torch.compile-ed models in GRPO Feb 4, 2025

qgallouedec and others added 5 commits February 4, 2025 20:15

ignore test windows

eaf5370

properly unwrap torch.compile-ed models with GRPO

c9eef5b

add test and compat with reward models

3f6ce31

ignore test windows

cf30719

chore: lint

9684437

winglian force-pushed the grpo-torch-compile branch from eaf5370 to 9684437 Compare February 4, 2025 20:38

qgallouedec and others added 3 commits February 4, 2025 20:45

Merge branch 'grpo-torch-compile' of https://github.com/winglian/trl …

9667148

…into pr/winglian/2750

style

9154b02

Merge branch 'main' into grpo-torch-compile

2fb6951

qgallouedec merged commit bd946f9 into huggingface:main Feb 4, 2025
