[Feature] Support llguidance for constrained decoding #3298
Conversation
Hi @Ying1123 @merrymercy @zhyncs |
ok I'll help take a look asap. Thanks for your contribution! |
thanks @zhyncs, really appreciate it |
Hi @zhyncs, any chance we could get a brief review here? We'd like to deploy guidance + sglang for some of our users, and hopefully also deliver benefits to the sglang community! Just some pointers on things you'd like to see changed or improved would help us make sure we're working in the right spirit 🙂 |
@Harsha-Nori Ah Sorry for the delayed response. I've been busy lately. We will review soon. Thank you for your understanding! BTW @JC1DA, can you help resolve the conflicts? Thanks! |
@JC1DA Could you fix the conflicts first? |
hi @zhyncs @zhaochenyang20, fixed :) thanks for taking a look |
@JC1DA Hey. Why should we keep a fixed x-grammar version? We will use x-grammar as the default backend in the next PR. |
@zhaochenyang20 this PR doesn't seem to change xgrammar version used, was the comment for a different PR? |
thanks @zhaochenyang20, just fixed the recent conflict |
Important dependencies like transformers should not be fixed in this PR. Should we use a fixed transformers version?
I merged it from the main branch; we didn't add transformers. We only added the llguidance dependency. @zhaochenyang20 Can you help rerun the workflows? I just reran pre-commit on pyproject.toml |
@shuaills there is no cache; the grammars are compiled every time. It takes ~2ms on average on a large JSON schema test suite, with a p99.9 of under 40ms (single-threaded); see https://github.com/guidance-ai/jsonschemabench/tree/main/maskbench |
Why don't we need a cache here? Can you share some insights? Thanks. @mmoskal |
@shuaills there is some pre-computation, but it's per-tokenizer and per grammar type (in this case JSON), not per grammar; anyway, it's not very heavy. There are more details here: https://github.com/guidance-ai/llguidance/blob/main/docs/optimizations.md |
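To make the caching question concrete, here is a minimal sketch of what a per-grammar compilation cache could look like if one were added on the sglang side. The `compile_grammar` function below is a hypothetical stand-in, not llguidance's actual API; per the discussion above, per-grammar compilation is cheap (~2ms on average), so such a cache may not be worth the extra complexity.

```python
from functools import lru_cache
from typing import Any, Tuple


def compile_grammar(tokenizer_id: str, grammar_type: str, grammar_spec: str) -> Any:
    """Hypothetical stand-in for the per-grammar compilation step.

    In llguidance the heavier, reusable pre-computation is per-tokenizer and
    per grammar type, while compiling an individual grammar is cheap.
    """
    return (tokenizer_id, grammar_type, grammar_spec)  # placeholder object


@lru_cache(maxsize=256)
def get_compiled_grammar(tokenizer_id: str, grammar_type: str, grammar_spec: str) -> Any:
    # Cache keyed on (tokenizer, grammar type, grammar text); reuse avoids
    # recompiling when the same schema is requested repeatedly.
    return compile_grammar(tokenizer_id, grammar_type, grammar_spec)
```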
LGTM @zhaochenyang20
Sure, I just did the rebase |
@shuaills Shuai, if it looks good to you, give an approval. I will merge it today. |
Hi @zhaochenyang20, are we able to merge now, or is there anything I should do to help? |
Motivation
This pull request integrates the llguidance backend to extend sglang's guided decoding capabilities.
The llguidance backend supports regex, JSON schema, and grammars (Lark or EBNF).
We have just released a large JSON Schema benchmark and a paper. Of particular interest might be the isolated mask-generation benchmarks comparing LLGuidance, Outlines, XGrammar, and llama.cpp grammars.
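For illustration, here is a minimal client-side sketch of requesting JSON-schema-guided generation from a running sglang server. It assumes the server was launched with the llguidance backend enabled (e.g. a flag along the lines of `--grammar-backend llguidance`, per this PR) and that the native `/generate` endpoint accepts `json_schema` in `sampling_params`, as with the existing guided-decoding backends; treat the exact flag and parameter names as assumptions rather than the PR's final interface.

```python
import json

import requests

# Schema the constrained decoder should enforce on the model output.
schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
    "required": ["name", "age"],
}

resp = requests.post(
    "http://localhost:30000/generate",
    json={
        "text": "Extract the person as JSON: Alice is 31 years old.\n",
        "sampling_params": {
            "max_new_tokens": 64,
            "temperature": 0,
            # Passed as a JSON string, matching sglang's existing guided-decoding usage.
            "json_schema": json.dumps(schema),
        },
    },
)
print(resp.json()["text"])
```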
Modifications
Checklist