Skip to content

[Feature] Support llguidance for constrained decoding #3298

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Feb 26, 2025

Conversation

JC1DA
Copy link
Contributor

@JC1DA JC1DA commented Feb 4, 2025

Motivation

This pull request integrates llguidance backend to extend guided decoding capabilities for sglang
llguidance backend supports regex, json and grammar (lark or ebnf)

We have just released a large JSON Schema benchmark and a paper. Of particular interest might be isolated mask-generation benchmarks - comparing LLGuidance, Outlines, XGrammar and llama.cpp grammars.

image

Modifications

  • Added llguidance constrained backend to support masked decoding & json/ebnf/regex testcases.
  • We have plan to further support fast-forward decoding in the near future

Checklist

@JC1DA
Copy link
Contributor Author

JC1DA commented Feb 10, 2025

Hi @Ying1123 @merrymercy @zhyncs
Can you guys help take a look at this PR?
We'd love to bring more choices for constraint decoding to sglang community :)

@zhyncs
Copy link
Member

zhyncs commented Feb 10, 2025

ok I'll help take a look asap. Thanks for your contribution!

@zhyncs zhyncs self-assigned this Feb 10, 2025
@JC1DA
Copy link
Contributor Author

JC1DA commented Feb 10, 2025

ok I'll help take a look asap. Thanks for your contribution!

thanks @zhyncs, really appreciate it

@Harsha-Nori
Copy link

Harsha-Nori commented Feb 16, 2025

Hi @zhyncs, any chance we could get a brief review here? We'd like to deploy guidance + sglang for some of our users, and hopefully also deliver benefits to the sglang community! Just some pointers on things you'd like to see changed or improved would help us make sure we're working in the right spirit 🙂

@zhyncs
Copy link
Member

zhyncs commented Feb 16, 2025

Hi @zhyncs, any chance we could get a brief review here? We'd like to deploy guidance + sglang for some of our users, and hopefully also deliver benefits to the sglang community! Just some pointers on things you'd like to see changed or improved would help us make sure we're working in the right spirit 🙂

@Harsha-Nori Ah Sorry for the delayed response. I've been busy lately. We will review soon. Thank you for your understanding! BTW @JC1DA, can you help resolve the conflicts? Thanks!

@zhaochenyang20
Copy link
Collaborator

@JC1DA @zhyncs sure. love to help!

@zhaochenyang20
Copy link
Collaborator

@JC1DA Could you fix the conflicts first?

@JC1DA
Copy link
Contributor Author

JC1DA commented Feb 16, 2025

@JC1DA Could you fix the conflicts first?

hi @zhyncs @zhaochenyang20, fixed :) thanks for taking a look

@zhaochenyang20
Copy link
Collaborator

@JC1DA Hey. Why should we keep a fixed x-grammar verison? We will use x-grammar as default backend in the next PR.

@mmoskal
Copy link
Contributor

mmoskal commented Feb 17, 2025

@zhaochenyang20 this PR doesn't seem to change xgrammar version used, was the comment for a different PR?

@zhaochenyang20
Copy link
Collaborator

zhaochenyang20 commented Feb 17, 2025

@mmoskal @JC1DA I see the change. Sorry for misunderstanding. cc @shuaills for review. Thanks!

@JC1DA
Copy link
Contributor Author

JC1DA commented Feb 17, 2025

@mmoskal @JC1DA I see the change. Sorry for misunderstanding. cc @shuaills for review. Thanks!

thanks @zhaochenyang20, just fixed the recent conflict

Copy link
Collaborator

@zhaochenyang20 zhaochenyang20 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important dependies like transformers should not be fixed in this PR. Should we use a fixed transformers verison?

@JC1DA
Copy link
Contributor Author

JC1DA commented Feb 17, 2025

Important dependies like transformers should not be fixed in this PR. Should we use a fixed transformers verison?

I merged it from the main branch, we didn't add transformers in. We only added llguidance dependency.

@zhaochenyang20 Can you help rerun the workflows. Just reran pre-commit on pyproject.toml

@mmoskal
Copy link
Contributor

mmoskal commented Feb 25, 2025

@shuaills there is no cache; the grammars are compiled every time; it takes ~2ms on avarage on a large JSON schema test suite with p99.9 of under 40ms (single-threaded), see https://github.com/guidance-ai/jsonschemabench/tree/main/maskbench

@JC1DA JC1DA force-pushed the integrate_guidance branch from 00999da to 3bf08ad Compare February 25, 2025 19:11
@shuaills
Copy link
Collaborator

@shuaills there is no cache; the grammars are compiled every time; it takes ~2ms on avarage on a large JSON schema test suite with p99.9 of under 40ms (single-threaded), see https://github.com/guidance-ai/jsonschemabench/tree/main/maskbench

Why we don't need a cache here? Can you share some insights, thanks. @mmoskal

@mmoskal
Copy link
Contributor

mmoskal commented Feb 25, 2025

@shuaills there is some pre-computation, but it's per-tokenizer and let's call it grammar type (in this case JSON) not per grammar; anyway it's not very heavy

there are more details here https://github.com/guidance-ai/llguidance/blob/main/docs/optimizations.md

Copy link
Collaborator

@shuaills shuaills left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JC1DA
Copy link
Contributor Author

JC1DA commented Feb 25, 2025

Can you also rebase the pr?

Sure, I just did the rebase

@JC1DA JC1DA requested a review from zhaochenyang20 February 25, 2025 23:36
@zhaochenyang20
Copy link
Collaborator

@shuaills shuai., if you feel good. Give an approval. I will merge it today.

@JC1DA
Copy link
Contributor Author

JC1DA commented Feb 26, 2025

@shuaills shuai., if you feel good. Give an approval. I will merge it today.

Hi @zhaochenyang20 , are we able to merge now or is they anything I should do to help?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants