
feat: improve qwen2-vl startup #2802

Merged
merged 5 commits into from
Jan 17, 2025
Conversation

drbh
Collaborator

@drbh drbh commented Dec 5, 2024

This PR resolves some small issues with qwen2-vl.

  1. Doubles the size of WARMUP_IMAGE_BASE64 from 20x20px to 40x40px (meets Qwen's minimum size requirement without a hacky fix).
  2. Removes the hacky fix that doubled the warmup image.
  3. Prefers tokenizing each request individually instead of the whole batch at once. This change allows r.truncate to be applied per request; previously it was not respected when one request in the batch was smaller than the others.
  4. Sets max_s to the maximum of max_s and the input size. This is required so that the rotary embedding creates self._cos_cached at the correct size relative to the position ids.
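Items 3 and 4 can be sketched together. This is a minimal illustration using a toy whitespace tokenizer; the function and variable names are illustrative, not TGI's actual internals:

```python
# Sketch: tokenize each request individually so its own `truncate`
# limit is respected, then clamp max_s to the longest input (so the
# rotary caches, e.g. self._cos_cached, cover every position id).
# The tokenizer here is a toy whitespace splitter, not the model's.

def tokenize(text):
    return text.split()

def prepare_batch(requests, max_s):
    batch_input_ids = []
    for text, truncate in requests:
        # per-request truncation: each request's own limit applies,
        # instead of one batch-wide truncation
        ids = tokenize(text)[:truncate]
        batch_input_ids.append(ids)
    # grow max_s if any input exceeds it, so position ids never
    # index past the rotary cos/sin caches
    longest = max(len(ids) for ids in batch_input_ids)
    return batch_input_ids, max(max_s, longest)

batch, max_s = prepare_batch([("a b c d e", 3), ("a b", 10)], max_s=2)
# batch -> [['a', 'b', 'c'], ['a', 'b']], max_s -> 3
```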

These changes resolve a startup issue reproducible with:

text-generation-launcher \
--model-id Qwen/Qwen2-VL-2B-Instruct \
--max-input-tokens 40 \
--max-batch-prefill-tokens 50 \
--max-total-tokens 51

(Note: the underlying issue triggers when max-input-tokens is less than max-batch-prefill-tokens.)
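A sketch of why these launcher flags trigger the bug, under the assumption described in item 4 above (the rotary cache was sized from max_s while the warmup batch was sized by the prefill budget; variable names are illustrative):

```python
# Sketch: the warmup batch is sized by --max-batch-prefill-tokens,
# but the rotary cos cache was sized by --max-input-tokens, so
# warmup position ids can index past the cache. Names are illustrative.
max_input_tokens = 40
max_batch_prefill_tokens = 50

cos_cache_len = max_input_tokens          # before the fix
warmup_positions = range(max_batch_prefill_tokens)
overflow = [p for p in warmup_positions if p >= cos_cache_len]
# positions 40..49 would index past the cache -> startup failure

# the fix: clamp the cache size to the largest input seen
cos_cache_len = max(cos_cache_len, max_batch_prefill_tokens)
assert all(p < cos_cache_len for p in warmup_positions)
```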

@Narsil
Collaborator

Narsil commented Dec 9, 2024

  1. from 20x20px to 40x40px (meets qwens minimal requirement without hacky fix)

I do not understand why we should impose anything on the user for the images. If 20px x 20px is not supported, we should either:

  • Rescale the image seamlessly and correctly infer on it, or
  • Reject the image with a proper error message.
    Users shouldn't have to know anything about the model's internals; 20x20px should be OK imho.
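The two options above can be sketched as follows. This is a minimal illustration, not TGI's or Qwen2-VL's actual preprocessing; MIN_SIDE is a hypothetical constant, not the model's real minimum:

```python
# Sketch of the two options: upscale an undersized image to a model
# minimum, or reject it with a clear error. MIN_SIDE is hypothetical.
MIN_SIDE = 28

def validate_or_resize(width, height, allow_resize=True):
    if width >= MIN_SIDE and height >= MIN_SIDE:
        return width, height
    if not allow_resize:
        raise ValueError(
            f"image {width}x{height}px is below the model minimum "
            f"of {MIN_SIDE}x{MIN_SIDE}px"
        )
    # scale up so the shorter side meets the minimum,
    # preserving the aspect ratio
    scale = MIN_SIDE / min(width, height)
    return round(width * scale), round(height * scale)

validate_or_resize(20, 20)  # -> (28, 28)
```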

@drbh drbh force-pushed the improve-qwen2-vl-warmup branch 2 times, most recently from 32a9564 to a3049f1 Compare December 9, 2024 21:32
@drbh drbh requested a review from Narsil December 16, 2024 15:54
@drbh drbh force-pushed the improve-qwen2-vl-warmup branch from a3049f1 to d671f6e Compare January 7, 2025 22:35
@drbh drbh changed the title feat: tokenize each request individually and increase warmup image size feat: improve qwen2-vl startup Jan 8, 2025
@drbh drbh force-pushed the improve-qwen2-vl-warmup branch from d671f6e to 320b520 Compare January 13, 2025 18:50
@drbh drbh force-pushed the improve-qwen2-vl-warmup branch from 35b528e to bd59f96 Compare January 16, 2025 16:04
@drbh
Collaborator Author

drbh commented Jan 17, 2025

Optimistically merging this PR: all tests pass, comments have been addressed, and this image has been tested/deployed in production, where it fixes a bug when starting TGI with qwen2-vl.

Will watch for regressions and roll back if needed

@drbh drbh merged commit eecca27 into main Jan 17, 2025
14 checks passed
@drbh drbh deleted the improve-qwen2-vl-warmup branch January 17, 2025 16:50
drbh added a commit that referenced this pull request Jan 17, 2025
drbh added a commit that referenced this pull request Jan 17, 2025
Revert "feat: improve qwen2-vl startup  (#2802)"

This reverts commit eecca27.