
Unexpected warning of SlowTokenizer #29237

Closed
2 of 4 tasks
hiyouga opened this issue Feb 23, 2024 · 5 comments

hiyouga commented Feb 23, 2024

System Info

  • transformers version: 4.38.1
  • Platform: Linux-5.15.0-91-generic-x86_64-with-glibc2.35
  • Python version: 3.10.10
  • Huggingface_hub version: 0.19.4
  • Safetensors version: 0.4.1
  • Accelerate version: 0.26.1
  • Tokenizers version: 0.15.2
  • PyTorch version (GPU?): 2.1.1+cu121 (True)

Who can help?

@ArthurZucker @younesbelkada

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

import transformers
from transformers import AutoTokenizer
transformers.utils.logging.set_verbosity(transformers.logging.INFO)
transformers.utils.logging.enable_default_handler()
transformers.utils.logging.enable_explicit_format()
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf", use_fast=False)
# [INFO|tokenization_utils_base.py:2044] 2024-02-23 17:26:31,129 >> loading file tokenizer.model
# [INFO|tokenization_utils_base.py:2044] 2024-02-23 17:26:31,130 >> loading file added_tokens.json
# [INFO|tokenization_utils_base.py:2044] 2024-02-23 17:26:31,130 >> loading file special_tokens_map.json
# [INFO|tokenization_utils_base.py:2044] 2024-02-23 17:26:31,130 >> loading file tokenizer_config.json
# [INFO|tokenization_utils_base.py:2044] 2024-02-23 17:26:31,130 >> loading file tokenizer.json
tokenizer.encode("hello")
# [WARNING|tokenization_utils.py:562] 2024-02-23 17:26:41,845 >> Keyword arguments {'add_special_tokens': False} not recognized.
# [1, 22172]
tokenizer.encode("hello", add_special_tokens=False)
# [WARNING|tokenization_utils.py:562] 2024-02-23 17:26:58,903 >> Keyword arguments {'add_special_tokens': False} not recognized.
# [22172]
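For illustration, here is a minimal, self-contained sketch of the kwargs-forwarding pattern that can produce this kind of spurious warning. The `tokenize` function and its kwargs below are hypothetical stand-ins, not the actual transformers implementation: the idea is that a step checks for leftover keyword arguments before a later step (such as `encode`) has had the chance to consume `add_special_tokens`, so the warning fires even though the argument is valid.

```python
import warnings

def tokenize(text, **kwargs):
    # Consume only the kwargs this step actually understands
    # (hypothetical example kwarg).
    kwargs.pop("split_special_tokens", None)
    # Anything left over is reported -- including add_special_tokens,
    # which a later stage would have handled.
    if kwargs:
        warnings.warn(f"Keyword arguments {kwargs} not recognized.")
    return text.split()

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    tokens = tokenize("hello world", add_special_tokens=False)

print(tokens)       # ['hello', 'world']
print(len(caught))  # 1 -- the spurious warning fired anyway
```

The argument is handled correctly downstream; only the early leftover-kwargs check is wrong, which matches the observed behavior where `encode` still returns the right token ids.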

Expected behavior

The slow tokenizer should not emit these redundant warnings, matching the behavior of the fast tokenizer.
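Until the fix lands, one possible stopgap is to filter the message at the logging layer. This is a hedged workaround sketch, assuming transformers routes its warnings through the standard `logging` module under the `transformers` logger namespace; the filter drops only records containing "not recognized" and keeps everything else.

```python
import logging

class DropKwargsWarning(logging.Filter):
    def filter(self, record):
        # Return False to suppress the record, True to keep it.
        return "not recognized" not in record.getMessage()

# Attach to the transformers logger namespace so other INFO/WARNING
# messages (e.g. the "loading file ..." lines) still come through.
logging.getLogger("transformers").addFilter(DropKwargsWarning())
```

This silences the specific spurious message without lowering the overall verbosity the way `set_verbosity_error()` would.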

@ArthurZucker
Collaborator

Thanks! That is indeed an issue and should be fixed! I tracked it down but could not reproduce it 🤗 Thanks

@StevenTang1998
Contributor

Same issue here.

@hiyouga
Contributor Author

hiyouga commented Feb 25, 2024

@ArthurZucker This issue is a duplicate of #29237 and can be fixed in #29278


This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

hiyouga closed this as completed Mar 25, 2024
@ArthurZucker
Collaborator

ArthurZucker commented Mar 25, 2024

For context, #29346 fixed this, thanks @hiyouga for your initial PR
