
[Bug]: InternVL2.5 broken after migration to new multi-modal input #13171

Closed

xfalcox opened this issue Feb 12, 2025 · 4 comments
Labels
bug Something isn't working

Comments

xfalcox commented Feb 12, 2025

Your current environment

docker + RTX 4090

🐛 Describe the bug

When running

docker run --gpus all -p 8080:8000 --ipc=host vllm/vllm-openai:v0.7.0 --model OpenGVLab/InternVL2_5-8B

A request like

curl http://localhost:8080/v1/chat/completions \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "OpenGVLab/InternVL2_5-8B",
    "messages": [
      {
        "role": "system",
        "content": "You are a bot specializing in image captioning."
      },
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "What is in this image?"},
          {"type": "image_url", "image_url": {"url": "https://preview.redd.it/f58v4g8mwh551.jpg?auto=webp&s=8d987c7cfceb1feadb2925dfeabf15d050347543"}}
        ]
      }
    ],
    "stream": false,
    "max_tokens": 2000,
    "temperature": 0
  }'

works.
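For reference, the same request expressed with the openai Python client behaves identically. This is a sketch: the base URL assumes the Docker command above, and the placeholder API key assumes the server was started without --api-key (vLLM only enforces a key when one is configured).

# Sketch: the curl request above, via the openai Python client.
# base_url and the dummy api_key are assumptions based on the Docker
# command above; vLLM's OpenAI-compatible server ignores the key unless
# one was configured at startup.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="EMPTY")
response = client.chat.completions.create(
    model="OpenGVLab/InternVL2_5-8B",
    messages=[
        {"role": "system",
         "content": "You are a bot specializing in image captioning."},
        {"role": "user",
         "content": [
             {"type": "text", "text": "What is in this image?"},
             {"type": "image_url",
              "image_url": {"url": "https://preview.redd.it/f58v4g8mwh551.jpg?auto=webp&s=8d987c7cfceb1feadb2925dfeabf15d050347543"}},
         ]},
    ],
    stream=False,
    max_tokens=2000,
    temperature=0,
)
print(response.choices[0].message.content)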

But after the last update, which includes #12553, this now happens:

docker run --gpus all -p 8080:8000 --ipc=host vllm/vllm-openai --model OpenGVLab/InternVL2_5-8B  --limit-mm-per-prompt image=2
curl http://localhost:8080/v1/chat/completions \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "OpenGVLab/InternVL2_5-8B",
    "messages": [
      {
        "role": "system",
        "content": "You are a bot specializing in image captioning."
      },
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "What is in this image?"},
          {"type": "image_url", "image_url": {"url": "https://preview.redd.it/f58v4g8mwh551.jpg?auto=webp&s=8d987c7cfceb1feadb2925dfeabf15d050347543"}}
        ]
      }
    ],
    "stream": false,
    "max_tokens": 2000,
    "temperature": 0
  }'

{"id":"chatcmpl-682ee87bb1974f3f81cdafc1a0275215","object":"chat.completion","created":1739380628,"model":"OpenGVLab/InternVL2_5-8B","choices":[{"index":0,"message":{"role":"assistant","reasoning_content":null,"content":"I can't assist with that.","tool_calls":[]},"logprobs":null,"finish_reason":"stop","stop_reason":92542}],"usage":{"prompt_tokens":3367,"total_tokens":3375,"completion_tokens":8,"prompt_tokens_details":null},"prompt_logprobs":null}

You can even see in the server logs that it failed to add the image:

INFO 02-12 09:14:46 logger.py:39] Received request chatcmpl-4e2720cd75064a9cb5975b3e4b4c211a: prompt: '<s><|im_start|>system\nYou are a bot specializing in image captioning.<|im_end|>\n<|im_start|>user\n<image>\nWhat is in this image?<|im_end|>\n<|im_start|>assistant\n', params: SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.0, top_p=1.0, top_k=-1, min_p=0.0, seed=None, stop=[], stop_token_ids=[], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=2000, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, guided_decoding=None), prompt_token_ids: None, lora_request: None, prompt_adapter_request: None.

xfalcox added the bug label on Feb 12, 2025
DarkLight1337 (Member) commented Feb 12, 2025

Can you show the error log?

INFO 02-12 09:14:46 logger.py:39] Received request chatcmpl-4e2720cd75064a9cb5975b3e4b4c211a: prompt: '<s><|im_start|>system\nYou are a bot specializing in image captioning.<|im_end|>\n<|im_start|>user\n<image>\nWhat is in this image?<|im_end|>\n<|im_start|>assistant\n', params: SamplingParams(n=1, presence_penalty=0.0, frequency_penalty=0.0, repetition_penalty=1.0, temperature=0.0, top_p=1.0, top_k=-1, min_p=0.0, seed=None, stop=[], stop_token_ids=[], bad_words=[], include_stop_str_in_output=False, ignore_eos=False, max_tokens=2000, min_tokens=0, logprobs=None, prompt_logprobs=None, skip_special_tokens=True, spaces_between_special_tokens=True, truncate_prompt_tokens=None, guided_decoding=None), prompt_token_ids: None, lora_request: None, prompt_adapter_request: None.

There is an <image> token here, which shows that the image is successfully included in the request.

xfalcox (Author) commented Feb 12, 2025

The only output I get from the model on the latest version is "I can't assist with that.", while it works fine on v0.7.0.

DarkLight1337 (Member) commented Feb 13, 2025

I'm unable to repro this issue on the latest code by copying and pasting your prompt into examples/offline_inference/vision_language.py.

Perhaps you need to set stop_token_ids as per this example.
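For reference, a minimal sketch of what that looks like with vLLM's offline API. The stop-token list mirrors the InternVL entry in the vision_language.py example, but treat the exact tokens and the local image path as assumptions; the prompt string is the one from the server log above.

# Sketch: setting stop_token_ids for InternVL2.5 with vLLM's offline API.
# Stop-token list and image path are assumptions; the prompt matches the
# template logged by the server above.
from PIL import Image
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

model_name = "OpenGVLab/InternVL2_5-8B"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# InternVL's chat template ends turns with tokens like <|im_end|>; passing
# their ids as stop_token_ids keeps generation from running past the answer.
stop_tokens = ["<|endoftext|>", "<|im_start|>", "<|im_end|>", "<|end|>"]
stop_token_ids = [tokenizer.convert_tokens_to_ids(t) for t in stop_tokens]

llm = LLM(model=model_name, trust_remote_code=True)
sampling_params = SamplingParams(temperature=0.0, max_tokens=2000,
                                 stop_token_ids=stop_token_ids)

# <image> marks where the image embeddings are injected into the prompt.
prompt = ("<|im_start|>system\nYou are a bot specializing in image "
          "captioning.<|im_end|>\n<|im_start|>user\n<image>\nWhat is in "
          "this image?<|im_end|>\n<|im_start|>assistant\n")

outputs = llm.generate(
    # "test.jpg" is a placeholder for a local copy of the image.
    {"prompt": prompt, "multi_modal_data": {"image": Image.open("test.jpg")}},
    sampling_params,
)
print(outputs[0].outputs[0].text)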

xfalcox (Author) commented Feb 13, 2025

Thanks @DarkLight1337.

It appears that the new way of ingesting the image changed the model's behavior quite a lot, so I'll need to adapt all the prompts I'm using.

Sorry for the false alarm.

xfalcox closed this as completed on Feb 13, 2025