Releases: opendatahub-io/vllm-tgis-adapter
0.2.4
Highlights
- Compatibility with vLLM v0.5.4
- Bump minimum vLLM requirement to v0.5.4
What's Changed
- remove dead code (vllm<=0.5.0.post1) by @dtrifiro in #65
- build(deps): bump ruff from 0.5.4 to 0.5.5 by @dependabot in #72
- pyproject: fix broken URLs by @dtrifiro in #76
- gha: add missing build dependencies by @dtrifiro in #68
- noxfile: make overridden vllm version install verbose by @dtrifiro in #78
- updates for vLLM==0.5.4 by @dtrifiro in #82
- fix merge_async_iterators usage for vLLM>0.5.4 by @dtrifiro in #86
- build(deps): bump hf-transfer from 0.1.6 to 0.1.8 by @dependabot in #73
- build(deps): bump flash-attn from 2.6.1 to 2.6.3 by @dependabot in #71
- pre-commit: bump deps by @dtrifiro in #67
- Extract request's trace context in GenerateStream() by @ronensc in #64
- Make ADD_SPECIAL_TOKENS true by default by @maxdebayser in #66
- build(deps): bump pytest from 8.2.2 to 8.3.2 by @dependabot in #70
- Fix stop_reason for secondary eos tokens by @njhill in #75
- ♻️ refactor LoRARequest based on upstream by @prashantgupta24 in #74
- build(deps): bump mypy from 1.10.1 to 1.11.1 by @dependabot in #77
- Fix length penalty logit processor getting NaN by @wallashss in #85
- Re-enable seed for pipeline parallel by @njhill in #69
- 🥅 Kill servers on engine death by @joerunde in #63
New Contributors
- @ronensc made their first contribution in #64
- @maxdebayser made their first contribution in #66
- @wallashss made their first contribution in #85
Full Changelog: 0.2.3...0.2.4
0.2.3
What's Changed
- build(deps): bump accelerate from 0.31.0 to 0.32.1 by @dependabot in #42
- build(deps): bump flash-attn from 2.5.9.post1 to 2.6.1 by @dependabot in #45
- Fix earlier LoRA tokenizer changes by @njhill in #53
- build(deps): bump ruff from 0.5.1 to 0.5.2 by @dependabot in #43
- build(deps): bump types-requests from 2.32.0.20240602 to 2.32.0.20240712 by @dependabot in #44
- Fix tokenize endpoint by @njhill in #54
- build(deps): bump ruff from 0.5.2 to 0.5.4 by @dependabot in #58
- vLLM 5.3+ support by @joerunde in #60
Full Changelog: 0.2.2...0.2.3
0.2.2
0.2.1
0.2.0
What's Changed
- improve custom vllm version testing by @dtrifiro in #33
- build(deps): bump mypy from 1.10.0 to 1.10.1 by @dependabot in #27
- build(deps): bump ruff from 0.4.9 to 0.5.1 by @dependabot in #34
- add support for prompt adapters by @prashantgupta24 in #21
- Add mapping for TGIS speculator args by @njhill in #39
Full Changelog: 0.1.3...0.2.0
0.1.3
0.1.2
What's Changed
- Apply interface change to duplicated code in the tgis layer by @prashantgupta24 in #16
- gha: add merge_group trigger by @dtrifiro in #18
- add tokenization with truncation, offset support (IBM #47) by @prashantgupta24 in #19
- Respect trace headers in grpc server (IBM #49) by @prashantgupta24 in #20
- build(deps): bump flash-attn from 2.5.3 to 2.5.9.post1 by @dependabot in #1
- build(deps): bump grpcio from 1.62.1 to 1.64.1 by @dependabot in #2
- build(deps): bump pytest from 8.2.0 to 8.2.2 by @dependabot in #3
- build(deps): bump accelerate from 0.28.0 to 0.31.0 by @dependabot in #17
- build(deps): bump grpcio-reflection from 1.62.1 to 1.64.1 by @dependabot in #4
Full Changelog: 0.1.1...0.1.2
0.1.1
0.1.0
What's Changed
- adapt for odh by @dtrifiro in #6
- build(deps): bump ruff from 0.4.3 to 0.4.9 by @dependabot in #8
- ✨ add ability to run REST with grpc server by @prashantgupta24 in #9
- concurrent http and grpc by @dtrifiro in #12
- integrate IBM vLLM changes by @dtrifiro in #14
New Contributors
- @dtrifiro made their first contribution in #6
- @dependabot made their first contribution in #8
- @prashantgupta24 made their first contribution in #9
Full Changelog: https://github.com/opendatahub-io/vllm-tgis-adapter/commits/0.1.0