[Frontend] Support suffix in completions API (fill-in-the-middle, FIM) #9522

njhill · 2024-10-19T00:54:02Z

Adds support for the suffix parameter in the completions API, for code models that have been trained with an infilling task.

Handle model-specific FIM encoding rules in a similar way to how we're handling different tool parsers.

For example, this can be enabled for Codestral by adding the command-line argument --fim mistral (along with --tokenizer-mode mistral in this case).
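As a rough illustration of what a client call might look like with this change, the sketch below builds a completions request that carries both a prompt (the text before the gap) and a suffix (the text after it). The helper name build_fim_request, the local server address, and the max_tokens value are assumptions made for illustration; the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_fim_request(base_url: str, model: str, prefix: str, suffix: str,
                      max_tokens: int = 64) -> urllib.request.Request:
    """Build (but do not send) a completions request with a `suffix` field.

    Hypothetical helper for illustration; not part of vLLM.
    """
    payload = {
        "model": model,
        "prompt": prefix,    # text before the gap to fill
        "suffix": suffix,    # text after the gap to fill
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_fim_request(
    base_url="http://localhost:8000",        # assumed local server address
    model="mistralai/Codestral-22B-v0.1",
    prefix="def add(a, b):\n    ",
    suffix="\n\nprint(add(1, 2))\n",
)
# To send it against a running server: urllib.request.urlopen(req)
```

The model would then be expected to generate only the missing middle section, for example the function body in this case.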

njhill · 2024-10-19T04:28:11Z

Planning to simplify this, would be better for it to work more like chat templates than tool parsers.
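To make the "more like chat templates" idea concrete, here is a minimal sketch of a template-driven approach, where each model family gets a format string rather than a parser class. Everything in it is hypothetical: the FIM_TEMPLATES table, the template names, the render_fim_prompt helper, and the sentinel strings are illustrative assumptions and not the implementation in this PR. A real implementation would likely work on token IDs rather than plain strings.

```python
# Hypothetical template table; names and sentinel strings are assumptions.
FIM_TEMPLATES = {
    # Prefix-suffix-middle layout with StarCoder-style sentinels (assumed).
    "psm_example": "<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>",
    # Suffix-first layout (sentinel strings assumed for illustration).
    "spm_example": "[SUFFIX]{suffix}[PREFIX]{prefix}",
}


def render_fim_prompt(template_name: str, prefix: str, suffix: str) -> str:
    """Combine prefix and suffix into one prompt using a named template."""
    try:
        template = FIM_TEMPLATES[template_name]
    except KeyError:
        known = ", ".join(sorted(FIM_TEMPLATES))
        raise ValueError(
            f"unknown FIM template {template_name!r}; known: {known}"
        ) from None
    # str.format only parses the template, so braces inside the
    # prefix or suffix text are inserted verbatim.
    return template.format(prefix=prefix, suffix=suffix)


prompt = render_fim_prompt("psm_example", "def f(x):\n    ", "\n    return y\n")
```

With this shape, supporting a new model family would mean adding one entry to the table rather than writing a new class.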

mgoin · 2024-10-20T14:07:48Z

This is exciting! Looking forward to review

njhill · 2024-10-23T22:59:08Z

FYI @patrickvonplaten if you'd like to check the Mistral part. Is there a small Mistral model on the HF hub that we could use for this in tests?

patrickvonplaten · 2024-10-24T10:51:55Z

I guess codestral might be too big? https://huggingface.co/mistralai/Codestral-22B-v0.1

The 7B-v0.3 should also work reasonably well for FIM though: https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3

njhill · 2024-10-24T23:40:53Z

Thanks @patrickvonplaten, yes I was hoping for something smaller than codestral 22B ... will try with the 7B thanks!

mergify · 2024-11-15T00:45:02Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @njhill.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

thomasbs17 · 2025-02-06T20:45:26Z

Hi, this would be a great addition to continue.dev.
I was wondering if there was any update on reviewing/merging this PR?

gadikar-deshaw · 2025-02-12T16:13:10Z

Hi, following this as well.
This feature would be a great addition. Is there any plan to merge this PR?

Thanks!

njhill · 2025-02-12T16:26:13Z

@thomasbs17 @gadikar-deshaw apologies, I've been diverted from this by other priorities but will try to resurrect it when I can.

Any help would be welcome - apart from bringing the existing changes up to date I think the only thing remaining was adding tests.

thomasbs17 · 2025-02-12T19:03:48Z

> @thomasbs17 @gadikar-deshaw apologies, I've been diverted from this by other priorities but will try to resurrect it when I can.
>
> Any help would be welcome - apart from bringing the existing changes up to date I think the only thing remaining was adding tests.

Sounds good, thanks! I'll try to look into adding the tests

CZH-THU · 2025-02-13T03:27:33Z

Hi, when can this PR be merged into main?

thomasbs17 · 2025-02-15T18:37:40Z

@njhill I've worked on bringing your branch up to speed with main. PR is here

njhill · 2025-02-17T21:06:51Z

Thank you @thomasbs17! I'll try to look at this today.

njhill · 2025-02-18T17:42:46Z

Thanks @thomasbs17! I have merged your main-merge, and another commit to fix the linting errors.

[Frontend] Support suffix in completions API (fill-in-the-middle)

f57746a

Handle model-specific FIM encoding rules in a similar way to how we're handling different tool parsers.

njhill force-pushed the fim branch 2 times, most recently from 0b1e762 to 37135ae, on October 22, 2024 00:10

vllm-project deleted a comment from github-actions bot Oct 22, 2024

njhill force-pushed the fim branch 2 times, most recently from c4886c3 to 4a2e7e9, on October 22, 2024 00:55

Simplify registration

72b2eb1

njhill force-pushed the fim branch from 4a2e7e9 to 72b2eb1 on October 22, 2024 00:56

njhill added 2 commits October 22, 2024 12:10

Fixes

2cd52d6

Mistral fixes

ad877c0

njhill mentioned this pull request Oct 22, 2024

Support suffix for fill-in-the-middle opendatahub-io/vllm-tgis-adapter#172

Draft

cjackal mentioned this pull request Nov 4, 2024

[Feature]: do you plan to support "suffix" of "v1/completions" #9976

Open

1 task

Merge remote-tracking branch 'refs/remotes/origin/main' into fim

9078416

# Conflicts:
#   vllm/entrypoints/openai/api_server.py
#   vllm/entrypoints/openai/serving_completion.py
#   vllm/entrypoints/openai/serving_embedding.py
#   vllm/entrypoints/openai/serving_engine.py

mergify bot added the frontend label Nov 5, 2024

mergify bot added the needs-rebase label Nov 15, 2024

thomasbs17 added 2 commits February 15, 2025 18:21

merging with main / resolving conflicts

910dccc

static analysis warning

64eacab

mgoin changed the title ~~[Frontend] Support suffix in completions API (fill-in-the-middle)~~ [Frontend] Support suffix in completions API (fill-in-the-middle, FIM) Feb 17, 2025

Merge with main from thomasbs17

99b3de7

Co-authored-by: Thomas Bouamoud <[email protected]>

mergify bot removed the needs-rebase label Feb 18, 2025

Fix linting

24c7d0f

Signed-off-by: Nick Hill <[email protected]>

Fix mypy

d5d97d8

Signed-off-by: Nick Hill <[email protected]>
