Issues: huggingface/text-generation-inference
CUDA Out of memory when using the benchmarking tool with batch size greater than 1 (#2952, opened Jan 24, 2025 by mborisov-bi)
Serverless Inference API OpenAI /v1/chat/completions route broken (#2946, opened Jan 23, 2025 by pelikhan)
RuntimeError: Cannot load 'awq' weight when running Qwen2-VL-72B-Instruct-AWQ model (#2944, opened Jan 23, 2025 by edesalve)
text-generation-inference:3.0.1 docker container timeout on image fetching from fastapi static files (#2930, opened Jan 21, 2025 by dinoelT)
Mangled generation for string sequences containing <space>'m with Llama 3.1 (#2927, opened Jan 20, 2025 by tomjorquera)
AttributeError: no attribute 'model' when using llava-next with lora-adapters (#2926, opened Jan 20, 2025 by derkleinejakob)
Does tgi support image resize for qwen2-vl pipeline? (#2920, opened Jan 16, 2025 by AHEADer)
CUDA: an illegal memory access was encountered with Mistral FP8 Marlin kernels on NVIDIA driver 535.216.01 (AWS Sagemaker Real-time Inference) (#2915, opened Jan 15, 2025 by dwyatte)
Slow when using response format with JSON schemas with 8+ optional properties (#2902, opened Jan 11, 2025 by TwirreM)
Support reponse_format: {"type": "json_object"} without any constrained schema (#2899, opened Jan 10, 2025 by lhoestq)
Automatic Calculation of Sequence Length in TGI v3 Leads to Unrealistic Values Before CUDA OOM (#2897, opened Jan 10, 2025 by biba10)
Prefill operation can be significantly slower in TGI v3 vs TGI v2 (#2896, opened Jan 10, 2025 by biba10)
[Qwen/Qwen2.5-14B-Instruct-GPTQ-Int8] Bad Responses with High Concurrent Requests (#2894, opened Jan 9, 2025 by michaelact)
make install-server does not have Apple MacOS Metal Framework (#2890, opened Jan 8, 2025 by qdrddr)
summarization using fine-tuned flan-t5 model in TGI outputs "generated text" instead of "summary_text" and outputs are completely different (#2889, opened Jan 7, 2025 by maiiabocharova)
Qwen2-VL failed to infer multiple images (Server error: upper bound and larger bound inconsistent with step sign) (#2888, opened Jan 7, 2025 by AHEADer)