add create_reference_model #61
Conversation
Thanks a bunch for working on this 🔥 Can't wait to try it on the trainer! Left a few minor comments :D
""" | ||
|
||
parameter_names = [n for n, _ in model.named_parameters()] | ||
ref_model = deepcopy(model) |
Do you think this can blow up the memory when manipulating very large models?
I was thinking of leveraging the accelerate.init_empty_weights context manager to initialize an empty model and populate it step by step. This could add a bit of boilerplate, so maybe let's leave it for a follow-up PR.
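For context, a minimal sketch of that idea (not what this PR implements; it assumes accelerate's init_empty_weights and set_module_tensor_to_device utilities, and uses a GPT-2 checkpoint purely for illustration):

```python
from accelerate import init_empty_weights
from accelerate.utils import set_module_tensor_to_device
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")

with init_empty_weights():
    # parameters are created on the "meta" device, so no real memory is allocated here
    ref_model = AutoModelForCausalLM.from_config(model.config)

# populate the empty reference model one tensor at a time
for name, param in model.named_parameters():
    set_module_tensor_to_device(ref_model, name, "cpu", value=param.detach().clone())

# buffers would need the same treatment
for name, buf in model.named_buffers():
    set_module_tensor_to_device(ref_model, name, "cpu", value=buf.clone())
```

Populating step by step would also make it straightforward to skip (and later share) layers rather than copying everything.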
Sounds good - let's see how it works and maybe adapt it to init_empty_weights in a follow-up :)
Co-authored-by: Younes Belkada <[email protected]>
Nice feature @lvwerra 🚀 ! I left some nits and a question about how the indexing of share_layers is counted :)
trl/models/modeling_base.py (Outdated)
ref_model = deepcopy(model)

# if no layers are shared, return copy of model
if share_layers is None:
Do we need to extend this logic to handle the case when share_layers=0, or does the indexing start with 0?
indeed indexing starts at 0 so this should be fine
The documentation is not available anymore as the PR was closed or merged.
This PR adds the create_reference_model function. It can be used to create a static reference model from an existing model, and the reference model can also share layers with the original model, as sketched below.
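A hedged usage sketch (the import paths and the share_layers keyword follow the diff in this PR and may differ in the merged API; GPT-2 is just an example checkpoint):

```python
from trl import AutoModelForCausalLMWithValueHead
from trl.models.modeling_base import create_reference_model

model = AutoModelForCausalLMWithValueHead.from_pretrained("gpt2")

# full static copy: every parameter is duplicated and frozen (requires_grad=False)
ref_model = create_reference_model(model)

# share the first few layers between the active and the reference model
# (keyword name taken from the diff above; assumed to mean "share the first 3 blocks")
ref_model_shared = create_reference_model(model, share_layers=3)
```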
In that case the first three layers are frozen for both models and the remaining layers can be updated in the active model.
The layers are identified via string matching of their names. This works for GPT2/BLOOM/OPT/GPT-Neo. If a custom pattern is necessary, one could use the pattern keyword.
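For an architecture that doesn't match the built-in names, a hypothetical call could look like the following (both the backbone.blocks module path and the pattern format, a string with a {layer} placeholder, are assumptions for illustration):

```python
# hypothetical custom model whose blocks live under "backbone.blocks.<i>..."
ref_model = create_reference_model(
    model,
    share_layers=3,
    pattern="backbone.blocks.{layer}",  # assumed: format string with a {layer} placeholder
)
```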