
[core] officially support SFT (Supervised Finetuning) #323

Merged: 33 commits merged into huggingface:main from add-sft-trainer on May 3, 2023

Conversation

younesbelkada (Contributor) commented Apr 26, 2023

What does this PR do?

This PR introduces the SFTTrainer class, a handy and easy-to-use class for supervised fine-tuning of models on instruction-based datasets.
The API is simple to use, yet modular enough for advanced users who want to customize their training.

You just need to pass a model id, and optionally a PeftConfig to train adapters only. Advanced users can also pass from_pretrained kwargs directly to SFTTrainer, for example to load the model in 8-bit mode.

The PR also introduces ConstantLengthDataset, a handy class for creating instruction-based datasets. Just pass a tokenizer, a dataset, and a function that specifies the formatting you want, and you should be good to go.
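
Before the quickstart below, a minimal sketch of that ConstantLengthDataset flow. The formatting function is hypothetical, and the keyword names (formatting_func, infinite, num_of_sequences, chars_per_token) are assumed to match the arguments discussed in this PR:

from datasets import load_dataset
from transformers import AutoTokenizer
from trl import ConstantLengthDataset

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-350m")
dataset = load_dataset("imdb", split="train")

# Hypothetical formatting function: maps each example to the text to train on.
def formatting_func(example):
    return f"Review: {example['text']}"

train_dataset = ConstantLengthDataset(
    tokenizer,
    dataset,
    formatting_func=formatting_func,
    infinite=False,         # yield the dataset once instead of cycling forever
    num_of_sequences=1024,  # buffer size used when packing sequences
    chars_per_token=3.6,    # rough characters-per-token estimate for the buffer
)

The resulting dataset yields packed, constant-length token sequences that can then be passed to SFTTrainer as train_dataset.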

Quickstart:

from transformers import AutoModelForCausalLM
from datasets import load_dataset
from trl import SFTTrainer

dataset = load_dataset("imdb", split="train")

model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")

trainer = SFTTrainer(
    model,
    train_dataset=dataset,
    dataset_text_field="text",
)

trainer.train()
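
For the adapter path mentioned above, a minimal sketch, assuming peft's LoraConfig; the peft_config keyword matches the signature shown later in this PR's diff, and the LoRA hyperparameters are only illustrative:

from datasets import load_dataset
from peft import LoraConfig
from trl import SFTTrainer

dataset = load_dataset("imdb", split="train")

# Only the LoRA adapter weights are trained; the base model stays frozen.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

trainer = SFTTrainer(
    "facebook/opt-350m",  # a model id is enough, SFTTrainer loads it internally
    train_dataset=dataset,
    dataset_text_field="text",
    peft_config=peft_config,
)
trainer.train()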

cc @lvwerra

HuggingFaceDocBuilderDev commented Apr 26, 2023

The documentation is not available anymore as the PR was closed or merged.

younesbelkada marked this pull request as ready for review April 26, 2023 15:34
younesbelkada requested review from lvwerra and lewtun April 26, 2023 15:34
lvwerra (Member) left a comment


Getting into great shape! Left a few comments.

peft_config: Optional[Dict] = None,
dataset_text_field: Optional[str] = None,
packing: Optional[bool] = True,
dataset_kwargs: Optional[Dict] = {},
Member

i would avoid kwargs fields as much as possible. what values can be passed here? can't have them as separate kwargs?

Contributor Author

The dataset_kwargs correspond to the kwargs of ConstantLengthDataset; there are 6 optional ones. I think we should move them to proper kwargs, since we can always modify that class, but for the prepare_int8_training kwargs I would maybe keep them as they are.

Contributor Author

Or maybe we should educate users to create their models outside the trainer if they want full control over that function, and remove prepare_int8_training_kwargs. Wdyt?


def _prepare_non_packed_dataloader(self, tokenizer, dataset, dataset_text_field, data_collator, max_seq_len):
# tokenize the dataset
dataset = dataset.map(
Member

I think you can just tokenize the dataset so you have the input ids. I would not pad at all (the collator will do this, and if a batch only has shorter elements it will be faster).

Then you can just pass the tokenized dataset along with the data collator (data_collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)).

We did something similar here: https://huggingface.co/learn/nlp-course/chapter7/6?fw=pt

Then we could maybe just call the function _tokenize_dataset.
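
A minimal sketch of that suggestion, using the standard datasets and transformers APIs (the truncation length is arbitrary):

from datasets import load_dataset
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-350m")
dataset = load_dataset("imdb", split="train")

# Tokenize only; no padding here, the collator pads each batch dynamically.
tokenized = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=dataset.column_names,
)

# mlm=False means causal LM: labels are a copy of input_ids, padding is masked with -100.
data_collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)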

Contributor Author

Thanks for the great pointer!

younesbelkada requested a review from lvwerra April 28, 2023 14:08
lewtun (Member) left a comment


Thanks for adding this sweet feature @younesbelkada 🔥 !

I've left a few questions and suggestions for things that could help improve user understanding. I'll let the core maintainer approve this :)

Whether to use an infinite dataset or not. Defaults to `False`.
num_of_sequences (`Optional[int]`):
The number of sequences to use for the `ConstantLengthDataset`. Defaults to `1024`.
chars_per_token (`Optional[float]`):
Member

Does trl provide a helper method for this? If yes, it would be nice to see a small example in the docs of how this dataset works.

Contributor Author

I added a pointer to an example that uses this in 91b2643.

num_of_sequences: Optional[int] = 1024,
chars_per_token: Optional[float] = 3.6,
prepare_in_int8_kwargs: Optional[Dict] = {},
**pretrained_kwargs,
Member

I am slightly opposed to passing kwargs like that. I think if people want to use something other than the default they should just load the model outside; it's just one additional line of code for them. We, on the other hand, then need to worry about those kwargs: e.g. if someone has a typo in one of them they will get a weird error because it's passed to the model.

What do you think?
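
As a sketch of the "load the model outside" path (assuming bitsandbytes and accelerate are installed; prepare_model_for_int8_training comes from peft, and a LoRA config is needed because the 8-bit base weights themselves stay frozen):

from datasets import load_dataset
from peft import LoraConfig, prepare_model_for_int8_training
from transformers import AutoModelForCausalLM
from trl import SFTTrainer

# Full control over from_pretrained: one extra line instead of forwarding kwargs.
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",
    load_in_8bit=True,
    device_map="auto",
)
model = prepare_model_for_int8_training(model)

dataset = load_dataset("imdb", split="train")
trainer = SFTTrainer(
    model,
    train_dataset=dataset,
    dataset_text_field="text",
    peft_config=LoraConfig(task_type="CAUSAL_LM"),
)
trainer.train()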

Contributor Author

Yeah totally aligned on this!

younesbelkada requested a review from lvwerra May 3, 2023 08:29
younesbelkada merged commit c60fd91 into huggingface:main May 3, 2023
younesbelkada deleted the add-sft-trainer branch May 3, 2023 08:42
@tigerinus

The IMDB dataset consists of sentences labelled 1 or 0, indicating whether each sentence is positive or negative feedback.

The question is: does SFTTrainer train against the labels or against the content of each sentence?

lvwerra (Member) commented Aug 10, 2023

No, the SFTTrainer only trains on the text with the causal language modeling objective.
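
To make that concrete, a small sketch: with the causal language modeling collator the IMDB label column plays no role, and the labels used for the loss are simply a copy of the input ids.

from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-350m")
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

batch = collator([tokenizer("This movie was great!")])
print(batch["input_ids"])  # token ids of the review text
print(batch["labels"])     # same ids: every token is a prediction target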

@hezhiyang2000

Is this class suitable for training an instruction-following LLM? I reviewed the code and couldn't find any label handling that avoids computing the loss on the instruction and prompt tokens.

lvwerra (Member) commented Sep 4, 2023
