fix add_special_tokens issue for data with template #1509

edixiong · 2024-04-06T00:17:12Z

Fix the add_special_tokens issue as is mentioned in this issue: #1400

younesbelkada

Thanks for the fix ! Can you slightly elaborate on why this should be the right fix? 🙏
cc @philschmid FYI

edixiong · 2024-04-09T02:39:53Z

Hi @younesbelkada , in line 267, we check if there is a template. If true, dataset_formatting.py will call apply_chat_template and the special token will be added there. In sft_trainer.py, the input will be tokenized and the special token will be added again if we do not explicitly set that add_special_tokens to False. Another workaround is to warn the user to manually set add_special_tokens to False (similar to @philschmid did).

HuggingFaceDocBuilderDev · 2024-04-12T07:49:57Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

younesbelkada

Thanks !

fix add_special_tokens issue for data with template

41edac8

younesbelkada reviewed Apr 8, 2024

View reviewed changes

younesbelkada approved these changes Apr 22, 2024

View reviewed changes

younesbelkada merged commit abc0584 into huggingface:main Apr 22, 2024
9 checks passed

kashif pushed a commit to kashif/trl that referenced this pull request Apr 23, 2024

fix add_special_tokens issue for data with template (huggingface#1509)

3bb5fca

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix add_special_tokens issue for data with template #1509

fix add_special_tokens issue for data with template #1509

edixiong commented Apr 6, 2024 •

edited by younesbelkada

Loading

younesbelkada left a comment

edixiong commented Apr 9, 2024

HuggingFaceDocBuilderDev commented Apr 12, 2024

younesbelkada left a comment

fix add_special_tokens issue for data with template #1509

fix add_special_tokens issue for data with template #1509

Conversation

edixiong commented Apr 6, 2024 • edited by younesbelkada Loading

younesbelkada left a comment

Choose a reason for hiding this comment

edixiong commented Apr 9, 2024

HuggingFaceDocBuilderDev commented Apr 12, 2024

younesbelkada left a comment

Choose a reason for hiding this comment

edixiong commented Apr 6, 2024 •

edited by younesbelkada

Loading