Cached / multimodal tokens are not passed through to Langfuse #8515

Open
hassiebp opened this issue Feb 13, 2025 · 0 comments
Labels
bug Something isn't working feb 2025 langfuse

Comments

@hassiebp

What happened?

Context

Langfuse has shipped V3 cost tracking, which allows tracking usage details and cost details by arbitrary usage keys beyond just input, output, and total. If OpenAI returns prompt_tokens_details.cached_tokens, Langfuse can infer costs when the cached tokens are provided in usage_details.input_cached_tokens for the Generation. See docs here

Issue
LiteLLM is currently using the soon-to-be-deprecated usage key and not the new usage_details and cost_details map. See here.

Desired behavior
Langfuse accepts the OpenAI schema for usage_details and cost_details, both server side and via its SDK interface. Langfuse flattens the prompt_tokens_details and completion_tokens_details provided by OpenAI by prefixing their keys with input_ and output_ respectively. Please pass the OpenAI usage object as generation.usage_details, and the corresponding costs as generation.cost_details.
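To illustrate the flattening described above, here is a rough sketch in plain Python. It is not Langfuse's or LiteLLM's actual code: `flatten_openai_usage` is a hypothetical helper, and the mapping of the top-level counts to `input`, `output`, and `total` is an assumption based on the key names mentioned in the context section. Only the `input_` / `output_` prefixing of the nested detail keys comes from the description above.

```python
from typing import Any, Dict


def flatten_openai_usage(usage: Dict[str, Any]) -> Dict[str, int]:
    """Hypothetical helper: flatten an OpenAI-style usage dict into flat keys."""
    flat: Dict[str, int] = {}

    # Assumption: top-level counts map onto the input/output/total keys.
    top_level = {
        "prompt_tokens": "input",
        "completion_tokens": "output",
        "total_tokens": "total",
    }
    for source_key, target_key in top_level.items():
        if usage.get(source_key) is not None:
            flat[target_key] = usage[source_key]

    # Nested detail objects are flattened by prefixing each key.
    nested = {
        "prompt_tokens_details": "input_",
        "completion_tokens_details": "output_",
    }
    for source_key, prefix in nested.items():
        for key, value in (usage.get(source_key) or {}).items():
            if value is not None:  # skip fields the provider left unset
                flat[f"{prefix}{key}"] = value

    return flat


# Example usage object shaped like an OpenAI response (values are made up).
example_usage = {
    "prompt_tokens": 1200,
    "completion_tokens": 80,
    "total_tokens": 1280,
    "prompt_tokens_details": {"cached_tokens": 1024, "audio_tokens": 0},
    "completion_tokens_details": {"reasoning_tokens": 0, "audio_tokens": None},
}

flat_usage = flatten_openai_usage(example_usage)
print(flat_usage)
```

With a mapping like this, `prompt_tokens_details.cached_tokens` ends up under `input_cached_tokens`, which is the key the context section says Langfuse needs in order to infer cached-token costs.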

Here's the Pydantic schema enforced by Langfuse for the OpenAI Usage schema

Relevant log output

Are you an ML Ops Team?

No

What LiteLLM version are you on?

v1.60.4

Twitter / LinkedIn details

https://www.linkedin.com/in/hassieb/

hassiebp added the bug label Feb 13, 2025