
Image eats up way too many tokens #2923

Open
2 of 4 tasks
aymeric-roucher opened this issue Jan 17, 2025 · 1 comment
Comments

@aymeric-roucher

System Info

Using Inference Endpoint here: https://endpoints.huggingface.co/m-ric/endpoints/qwen2-72b-instruct-psj
ghcr.io/huggingface/text-generation-inference:3.0.1

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

Here's what I'm trying to run:

import base64
import os

from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()

client = OpenAI(
    base_url="https://lmqbs8965pj40e01.us-east-1.aws.endpoints.huggingface.cloud/v1",
    api_key=os.getenv("HF_TOKEN"),
)

# Inline the screenshot as a base64-encoded data URL
with open("./screenshot.png", "rb") as img_file:
    base64_image = base64.b64encode(img_file.read()).decode("utf-8")

client.chat.completions.create(
    model="a",  # model name is not used by the endpoint
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's on this screenshot?"},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{base64_image}"},
                },
            ],
        }
    ],
)

The image is not big, here it is:

[screenshot attached]

I get this error:

huggingface_hub.errors.HfHubHTTPError: 422 Client Error: Unprocessable Entity for url: https://lmqbs8965pj40e01.us-east-1.aws.endpoints.huggingface.cloud/v1/chat/completions (Request ID: 9kQ8on)

Input validation error: `inputs` tokens + `max_new_tokens` must be <= 32768. Given: 96721 `inputs` tokens and 0 `max_new_tokens`

It seems like my image was expanded into an enormous number of input tokens, even though the original is only roughly 1000×1000 pixels.
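One possible explanation (an assumption on my part, not confirmed): if the validator tokenizes the raw base64 data URL instead of substituting the image's patch tokens, the count would blow up, since base64 inflates the payload by a factor of 4/3 and each character would cost on the order of a token. A back-of-envelope sketch:

```python
import base64
import math


def base64_len(n_bytes: int) -> int:
    """Length of the base64 encoding of n_bytes raw bytes (with padding)."""
    return math.ceil(n_bytes / 3) * 4


# A ~70 KB PNG (plausible for a 1000x1000 screenshot) becomes ~96k
# base64 characters -- the same order of magnitude as the 96721
# "inputs" tokens reported in the error message.
print(base64_len(72_500))

# Sanity check against an actual base64 encoding:
assert base64_len(72_500) == len(base64.b64encode(b"\x00" * 72_500))
```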

Expected behavior

I'd expect the uploaded image to be <1k tokens instead of ~100k tokens.

Other APIs (OpenAI, Anthropic) handle the same image fine, so I'm wondering: do they do some image-downscaling pre-processing, or is this a bug on TGI's side?
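As a client-side mitigation, downscaling and re-encoding the image before building the data URL should at least shrink the payload. A minimal sketch using Pillow (the function name, size limit, and JPEG settings are my own choices, not anything TGI prescribes):

```python
import base64
import io

from PIL import Image  # pip install Pillow


def downscaled_data_url(path: str, max_side: int = 1024) -> str:
    """Downscale an image so its longest side is <= max_side, then
    re-encode it as a base64 data URL (JPEG, to shrink the payload)."""
    img = Image.open(path).convert("RGB")
    img.thumbnail((max_side, max_side))  # in place, preserves aspect ratio
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=85)
    b64 = base64.b64encode(buf.getvalue()).decode("utf-8")
    return f"data:image/jpeg;base64,{b64}"
```

The returned string can be passed directly as the `image_url` `url` field in the reproduction script above.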

@sanbindal1990

I am also facing a similar issue, and it looks to me like TGI's validation logic counts tokens incorrectly for inline images: https://github.com/huggingface/text-generation-inference/blob/main/integration-tests/conftest.py#L668

Is there any workaround for this? The images I am passing lead to 100k+ tokens.
