Releases: BerriAI/litellm
v1.66.3.dev1
What's Changed
- [Feat] Unified Responses API - Add Azure Responses API support by @ishaan-jaff in #10116
- UI: Make columns resizable/hideable in Models table by @msabramo in #10119
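For the Azure Responses API support above, a minimal sketch using the unified Responses API in the Python SDK; the deployment name, endpoint, and `api_version` below are placeholders/assumptions, not values confirmed by the PR:

```python
import os
import litellm

# Minimal sketch: call an Azure deployment through the unified Responses API.
# "my-o4-mini" is a placeholder deployment name; api_version is assumed.
response = litellm.responses(
    model="azure/my-o4-mini",
    input="Summarize the LiteLLM proxy in one sentence.",
    api_base="https://my-resource.openai.azure.com",
    api_key=os.getenv("AZURE_API_KEY"),
    api_version="2025-03-01-preview",
)
print(response)
```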
Full Changelog: v1.66.2.dev1...v1.66.3.dev1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.66.3.dev1
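Once the container is up, any OpenAI-compatible client can talk to it. A minimal sketch, assuming a model named `gpt-4o` is configured on the proxy and `sk-1234` stands in for your virtual/master key:

```python
from openai import OpenAI

# Point the standard OpenAI client at the locally running LiteLLM proxy.
client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

resp = client.chat.completions.create(
    model="gpt-4o",  # any model name configured on the proxy
    messages=[{"role": "user", "content": "Hello from the LiteLLM proxy"}],
)
print(resp.choices[0].message.content)
```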
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 180.0 | 210.74489628810068 | 6.401988824678471 | 0.003341330284278951 | 1916 | 1 | 38.52582800004711 | 5506.760536000002 |
Aggregated | Passed ✅ | 180.0 | 210.74489628810068 | 6.401988824678471 | 0.003341330284278951 | 1916 | 1 | 38.52582800004711 | 5506.760536000002 |
v1.66.3-nightly
What's Changed
- Add aggregate spend by tag by @krrishdholakia in #10071
- Add OpenAI o3 & o4-mini by @PeterDaveHello in #10065
- Add new `/tag/daily/activity` endpoint + Add tag dashboard to UI by @krrishdholakia in #10073 (see the sketch after this list)
- Add team based usage dashboard at 1m+ spend logs (+ new `/team/daily/activity` API) by @krrishdholakia in #10081
- [Feat SSO] Add LiteLLM SCIM Integration for Team and User management by @ishaan-jaff in #10072
- Virtual Keys: Filter by key alias (#10035) by @ishaan-jaff in #10085
- Add new `/vertex_ai/discovery` route - enables calling AgentBuilder API routes by @krrishdholakia in #10084
- fix(o_series_transformation.py): correctly map o4 to openai o_series … by @krrishdholakia in #10079
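A minimal sketch of querying the new `/tag/daily/activity` endpoint on a locally running proxy; the query-parameter names (`start_date`, `end_date`) and the `sk-1234` admin key are assumptions for illustration, not confirmed from the PR:

```python
import requests

# Hypothetical query against the new per-tag spend endpoint on a local proxy.
resp = requests.get(
    "http://localhost:4000/tag/daily/activity",
    headers={"Authorization": "Bearer sk-1234"},  # proxy admin key (placeholder)
    params={"start_date": "2025-04-01", "end_date": "2025-04-15"},  # param names assumed
)
print(resp.json())
```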
Full Changelog: v1.66.2-nightly...v1.66.3-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.66.3-nightly
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Failed ❌ | 250.0 | 302.3290337319068 | 6.097097387542003 | 0.04679789661490572 | 1824 | 14 | 218.4401190000358 | 5459.562037000012 |
Aggregated | Failed ❌ | 250.0 | 302.3290337319068 | 6.097097387542003 | 0.04679789661490572 | 1824 | 14 | 218.4401190000358 | 5459.562037000012 |
v1.66.2.dev1
What's Changed
- Add aggregate spend by tag by @krrishdholakia in #10071
- Add OpenAI o3 & o4-mini by @PeterDaveHello in #10065
- Add new `/tag/daily/activity` endpoint + Add tag dashboard to UI by @krrishdholakia in #10073
- Add team based usage dashboard at 1m+ spend logs (+ new `/team/daily/activity` API) by @krrishdholakia in #10081
- [Feat SSO] Add LiteLLM SCIM Integration for Team and User management by @ishaan-jaff in #10072 (see the sketch after this list)
- Virtual Keys: Filter by key alias (#10035) by @ishaan-jaff in #10085
- Add new `/vertex_ai/discovery` route - enables calling AgentBuilder API routes by @krrishdholakia in #10084
- fix(o_series_transformation.py): correctly map o4 to openai o_series … by @krrishdholakia in #10079
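For the SCIM integration added in #10072, a hedged sketch of listing users on a local proxy; the `/scim/v2` base path and bearer-token auth follow standard SCIM conventions and are assumptions here rather than details taken from the PR:

```python
import requests

# Sketch: list users via the SCIM integration on a locally running proxy.
# Base path and auth scheme are assumed (standard SCIM conventions).
resp = requests.get(
    "http://localhost:4000/scim/v2/Users",
    headers={"Authorization": "Bearer sk-1234"},  # placeholder admin key
)
print(resp.status_code, resp.json())
```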
Full Changelog: v1.66.2-nightly...v1.66.2.dev1
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.66.2.dev1
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 200.0 | 242.7078390639904 | 6.1689738182726535 | 0.0 | 1844 | 0 | 181.44264199997906 | 6553.659710999966 |
Aggregated | Passed ✅ | 200.0 | 242.7078390639904 | 6.1689738182726535 | 0.0 | 1844 | 0 | 181.44264199997906 | 6553.659710999966 |
v1.66.2-nightly
What's Changed
- Fix azure tenant id check from env var + response_format check on api_version 2025+ by @krrishdholakia in #9993
- Add `/vllm` and `/mistral` passthrough endpoints by @krrishdholakia in #10002
- CI/CD fix mock tests by @ishaan-jaff in #10003
- Setting `litellm.modify_params` via environment variables by @Eoous in #9964 (see the sketch after this list)
- Support checking provider `/models` endpoints on proxy `/v1/models` endpoint by @krrishdholakia in #9958
- Update AWS bedrock regions by @Schnitzel in #9430
- Fix case where only system messages are passed to Gemini by @NolanTrem in #9992
- Revert "Fix case where only system messages are passed to Gemini" by @krrishdholakia in #10027
- chore(docs): Update logging.md by @mrlorentx in #10006
- build(deps): bump @babel/runtime from 7.23.9 to 7.27.0 in /ui/litellm-dashboard by @dependabot in #10001
- Fix typo: Entrata -> Entra in code by @msabramo in #9922
- Retain schema field ordering for google gemini and vertex by @adrianlyjak in #9828
- Revert "Retain schema field ordering for google gemini and vertex" by @krrishdholakia in #10038
- Add aggregate team based usage logging by @krrishdholakia in #10039
- [UI Polish] UI fixes for cache control injection settings by @ishaan-jaff in #10031
- [UI] Bug Fix - Show created_at and updated_at for Users Page by @ishaan-jaff in #10033
- [Feat - Cost Tracking improvement] Track prompt caching metrics in DailyUserSpendTransactions by @ishaan-jaff in #10029
- Fix gcs pub sub logging with env var GCS_PROJECT_ID by @krrishdholakia in #10042
- Add property ordering for vertex ai schema (#9828) + Fix combining multiple tool calls by @krrishdholakia in #10040
- [Docs] Auto prompt caching by @ishaan-jaff in #10044
- Add litellm call id passing to Aim guardrails on pre and post-hooks calls by @hxmichael in #10021
- /utils/token_counter: get model_info from deployment directly by @chaofuyang in #10047
- [Bug Fix] Azure Blob Storage fixes by @ishaan-jaff in #10059
- build(deps): bump http-proxy-middleware from 2.0.7 to 2.0.9 in /docs/my-website by @dependabot in #10064
- fix(stream_chunk_builder_utils.py): don't set index on modelresponse by @krrishdholakia in #10063
- fix(llm_http_handler.py): fix fake streaming by @krrishdholakia in #10061
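PR #9964 above enables toggling `litellm.modify_params` through the environment. A minimal sketch, assuming the flag is read from a variable named `LITELLM_MODIFY_PARAMS` (the variable name is an assumption for illustration):

```python
import os

# Assumed variable name for illustration; #9964 adds env-var control of this flag.
os.environ["LITELLM_MODIFY_PARAMS"] = "True"

import litellm

# Equivalent explicit toggle, long supported in code:
# litellm.modify_params = True
print(litellm.modify_params)
```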
New Contributors
- @Eoous made their first contribution in #9964
- @mrlorentx made their first contribution in #10006
- @hxmichael made their first contribution in #10021
- @chaofuyang made their first contribution in #10047
Full Changelog: v1.66.1-nightly...v1.66.2-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.66.2-nightly
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 190.0 | 244.45508035967268 | 6.136194497326665 | 0.0 | 1835 | 0 | 169.77143499997283 | 8723.871383000016 |
Aggregated | Passed ✅ | 190.0 | 244.45508035967268 | 6.136194497326665 | 0.0 | 1835 | 0 | 169.77143499997283 | 8723.871383000016 |
v1.66.1-nightly
What's Changed
- build(model_prices_and_context_window.json): add gpt-4.1 pricing by @krrishdholakia in #9990
- [Fixes/QA] For gpt-4.1 costs by @ishaan-jaff in #9991
- Fix cost for Phi-4-multimodal output tokens by @emerzon in #9880
- chore(docs): update ordering of logging & observability docs by @marcklingen in #9994
- Updated cohere v2 passthrough by @krrishdholakia in #9997
- [Feat] Add support for `cache_control_injection_points` for Anthropic API, Bedrock API by @ishaan-jaff in #9996 (see the sketch after this list)
- [UI] Allow setting prompt `cache_control_injection_points` by @ishaan-jaff in #10000
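A hedged sketch of using `cache_control_injection_points` with an Anthropic model; passing the setting directly as a completion kwarg and the `{"location": "message", "role": "system"}` shape are assumptions here, with the proxy config being the other place this can be set:

```python
import litellm

# Sketch: ask LiteLLM to inject cache_control onto the system message so
# Anthropic prompt caching can reuse a long, stable system prompt.
# Requires ANTHROPIC_API_KEY in the environment.
response = litellm.completion(
    model="anthropic/claude-3-5-sonnet-20240620",
    messages=[
        {"role": "system", "content": "You are a helpful assistant with a very long system prompt..."},
        {"role": "user", "content": "Hi"},
    ],
    cache_control_injection_points=[
        {"location": "message", "role": "system"},  # shape assumed for illustration
    ],
)
print(response.usage)
```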
Full Changelog: v1.66.0-nightly...v1.66.1-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.66.1-nightly
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 220.0 | 243.74385918230334 | 6.268015361621096 | 0.0 | 1876 | 0 | 197.45038600001408 | 3855.600032000012 |
Aggregated | Passed ✅ | 220.0 | 243.74385918230334 | 6.268015361621096 | 0.0 | 1876 | 0 | 197.45038600001408 | 3855.600032000012 |
v1.66.0-stable
What's Changed
- build(deps): bump @babel/runtime from 7.26.0 to 7.27.0 in /docs/my-website by @dependabot in #9934
- fix: correct the cost for 'gemini/gemini-2.5-pro-preview-03-25' by @n1lanjan in #9896
- Litellm add managed files db by @krrishdholakia in #9930
- [DB / Infra] Add new column team_member_permissions by @ishaan-jaff in #9941
- fix(factory.py): correct indentation for message index increment in ollama, This fixes bug #9822 by @djshaw01 in #9943
- fix(litellm_proxy_extras): add baselining db script by @krrishdholakia in #9942
- [Team Member permissions] - Fixes by @ishaan-jaff in #9945
- Litellm managed files docs by @krrishdholakia in #9948
- [v1.66.0-stable] Release notes by @ishaan-jaff in #9952
- [Docs] v1.66.0-stable fixes by @ishaan-jaff in #9953
- stable release note fixes by @ishaan-jaff in #9954
- Fix filtering litellm-dashboard keys for internal users + prevent flooding spend logs with admin endpoint errors by @krrishdholakia in #9955
- [UI QA checklist] by @ishaan-jaff in #9957
Full Changelog: v1.65.8-nightly...v1.66.0-stable
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.66.0-stable
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 250.0 | 282.9933559715544 | 5.995117478456652 | 0.0 | 1793 | 0 | 223.97943800001485 | 5176.803935999998 |
Aggregated | Passed ✅ | 250.0 | 282.9933559715544 | 5.995117478456652 | 0.0 | 1793 | 0 | 223.97943800001485 | 5176.803935999998 |
v1.66.0-nightly
What's Changed
- build(deps): bump @babel/runtime from 7.26.0 to 7.27.0 in /docs/my-website by @dependabot in #9934
- fix: correct the cost for 'gemini/gemini-2.5-pro-preview-03-25' by @n1lanjan in #9896
- Litellm add managed files db by @krrishdholakia in #9930
- [DB / Infra] Add new column team_member_permissions by @ishaan-jaff in #9941
- fix(factory.py): correct indentation for message index increment in ollama, This fixes bug #9822 by @djshaw01 in #9943
- fix(litellm_proxy_extras): add baselining db script by @krrishdholakia in #9942
- [Team Member permissions] - Fixes by @ishaan-jaff in #9945
- Litellm managed files docs by @krrishdholakia in #9948
- [v1.66.0-stable] Release notes by @ishaan-jaff in #9952
- [Docs] v1.66.0-stable fixes by @ishaan-jaff in #9953
- stable release note fixes by @ishaan-jaff in #9954
- Fix filtering litellm-dashboard keys for internal users + prevent flooding spend logs with admin endpoint errors by @krrishdholakia in #9955
- [UI QA checklist] by @ishaan-jaff in #9957
Full Changelog: v1.65.8-nightly...v1.66.0-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.66.0-nightly
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 230.0 | 252.49209995793416 | 6.279241190720279 | 0.0 | 1878 | 0 | 200.85592700002053 | 5135.250711999987 |
Aggregated | Passed ✅ | 230.0 | 252.49209995793416 | 6.279241190720279 | 0.0 | 1878 | 0 | 200.85592700002053 | 5135.250711999987 |
v1.65.8-nightly
What's Changed
- Revert avglogprobs change + Add azure/gpt-4o-realtime-audio cost tracking by @krrishdholakia in #9893
- Realtime API: Support 'base_model' cost tracking + show response in spend logs (if enabled) by @krrishdholakia in #9897
- Simplify calling gemini models w/ file id by @krrishdholakia in #9903
- feat: add extraEnvVars to the helm deployment by @mknet3 in #9292
- [Feat - UI] - Allow setting Default Team setting when LiteLLM SSO auto creates teams by @ishaan-jaff in #9918
- Fix typo: Entrata -> Entra in docs by @msabramo in #9921
- [Feat - PR1] Add xAI grok-3 models to LiteLLM by @ishaan-jaff in #9920
- [Feat - Team Member Permissions] - CRUD Endpoints for managing team member permissions by @ishaan-jaff in #9919
- [Feat] Add litellm.supports_reasoning() util to track if an llm supports reasoning by @ishaan-jaff in #9923
- [Feat] Add `reasoning_effort` support for `xai/grok-3-mini-beta` model family by @ishaan-jaff in #9932 (see the sketch after this list)
- [UI] Render Reasoning content, ttft, usage metrics on test key page by @ishaan-jaff in #9931
- [UI] - Add Managing Team Member permissions on UI by @ishaan-jaff in #9927
- [UI] Linting fixes by @ishaan-jaff in #9933
- Support CRUD endpoints for Managed Files by @krrishdholakia in #9924
- fix(databricks/common_utils.py): fix custom endpoint check by @krrishdholakia in #9925
- fix(transformation.py): correctly translate 'thinking' param for lite… by @krrishdholakia in #9904
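A minimal sketch combining the new `litellm.supports_reasoning()` util with `reasoning_effort` for the grok-3-mini family; the exact util signature is assumed from the PR title, and the call needs an xAI API key:

```python
import litellm

# Sketch: gate reasoning_effort usage on the new capability check.
# Requires XAI_API_KEY in the environment; util signature assumed.
if litellm.supports_reasoning(model="xai/grok-3-mini-beta"):
    response = litellm.completion(
        model="xai/grok-3-mini-beta",
        messages=[{"role": "user", "content": "What is 17 * 23?"}],
        reasoning_effort="low",  # grok-3-mini accepts "low" / "high"
    )
    print(response.choices[0].message.content)
```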
Full Changelog: v1.65.7-nightly...v1.65.8-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.8-nightly
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 220.0 | 248.0753682003237 | 6.194614175051195 | 0.0 | 1852 | 0 | 194.34754100001328 | 4413.887686999999 |
Aggregated | Passed ✅ | 220.0 | 248.0753682003237 | 6.194614175051195 | 0.0 | 1852 | 0 | 194.34754100001328 | 4413.887686999999 |
v1.65.7-nightly
What's Changed
- [Feat SSO] - Allow admins to set `default_team_params` to have default params for when litellm SSO creates default teams by @ishaan-jaff in #9895
- [Feat] Emit Key, Team Budget metrics on a cron job schedule by @ishaan-jaff in #9528
- [Bug Fix MSFT SSO] Use correct field for user email when using MSFT SSO by @ishaan-jaff in #9886
- [Docs] Tutorial using MSFT auto team assignment with LiteLLM by @ishaan-jaff in #9898
Full Changelog: v1.65.6-nightly...v1.65.7-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.7-nightly
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 240.0 | 261.69840098746454 | 6.131387078558505 | 0.0 | 1835 | 0 | 214.285206999989 | 3626.6518760000395 |
Aggregated | Passed ✅ | 240.0 | 261.69840098746454 | 6.131387078558505 | 0.0 | 1835 | 0 | 214.285206999989 | 3626.6518760000395 |
v1.65.6-nightly
What's Changed
- Fix anthropic prompt caching cost calc + trim logged message in db by @krrishdholakia in #9838
- feat(realtime/): add token tracking + log usage object in spend logs … by @krrishdholakia in #9843
- fix(cost_calculator.py): handle custom pricing at deployment level fo… by @krrishdholakia in #9855
Full Changelog: v1.65.5-nightly...v1.65.6-nightly
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.65.6-nightly
Don't want to maintain your internal proxy? Get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Passed ✅ | 190.0 | 209.99145276997868 | 6.188819872716192 | 0.0 | 1852 | 0 | 167.33176299999286 | 4428.401366999992 |
Aggregated | Passed ✅ | 190.0 | 209.99145276997868 | 6.188819872716192 | 0.0 | 1852 | 0 | 167.33176299999286 | 4428.401366999992 |