[Bug]: Invalid token/cost calculation #8514
Comments
Hi @mboret, what do your spend logs show for a Bedrock call made via Roo Code? And can you try bumping your LiteLLM version and see if that fixes it? cc: @ishaan-jaff
Going to begin work on this.
After doing some tests, I'm even more confused. I created a local test (Docker container) with claude-3.5-sonnet-v2 and configured my Roo Code to use it. I tested a simple question and a file edit, and the calculated prices look... correct. I'm using the same LiteLLM version as in production. In production, my deployment runs in a k8s cluster: 2 replicas, a Redis pod, and an RDS PostgreSQL database. One more thing: in production we are also connected to Azure to use OpenAI models, and the reported costs match between LiteLLM and Azure (mainly GPT-4o, used by other applications, not Roo Code).
Hey @mboret, what do your spend logs for Bedrock say in prod vs. the value received client-side?
@krrishdholakia how do I get the spend logs without all the other logs? We have many requests in production, and if I have to set verbose mode to True... too many logs are generated.
Hey @mboret, this is an API endpoint: https://litellm-api.up.railway.app/#/Budget%20%26%20Spend%20Tracking/view_spend_logs_spend_logs_get. You can also check this in your UI (in more recent versions): UI -> Logs tab.
Ah yes, sorry. So the output:
(I no longer have the initial virtual key, but this one uses the same configuration, and I still see the incorrect cost after requesting Claude Sonnet v2.)
That's what I'm asking you for: the spend log for that call via openwebui, so we can validate what the LiteLLM values are.
What happened?
Hi,
I've discovered a complete inconsistency between the cost reported by LiteLLM and the cost reported by AWS Bedrock when using the Claude 3.5 Sonnet V2 model. Based on my tests, the cost definition itself is correct: when performing a standard request (chat completion) and analyzing the request/response, the prompt token count looks coherent and the related price is correct according to AWS pricing.
My issue seems to occur when using Roo Code. I think the input token count from LiteLLM is incorrect.
Details of my test (with a new virtual key):
x-litellm-response-cost: 0.018414
completion_tokens: 1224
prompt_tokens: 18
In the UI: 0.018
Roo Code Input Tokens: 56.1k
Roo Code Output Tokens: 4.9k
Roo Code Cost: $0.2418
LiteLLM UI (total cost for my virtual key): $0.0915; minus my previous request's cost ($0.018) => $0.0735
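For what it's worth, the figures above can be cross-checked arithmetically. Assuming Bedrock's on-demand pricing for Claude 3.5 Sonnet of $3 per million input tokens and $15 per million output tokens (verify against current AWS pricing), both reported costs follow exactly from each side's own token counts:

```python
# Assumed Bedrock on-demand pricing for Claude 3.5 Sonnet:
# $3 / 1M input tokens, $15 / 1M output tokens (check current AWS pricing).
INPUT_PRICE = 3.00 / 1_000_000
OUTPUT_PRICE = 15.00 / 1_000_000

def bedrock_cost(prompt_tokens: int, completion_tokens: int) -> float:
    return prompt_tokens * INPUT_PRICE + completion_tokens * OUTPUT_PRICE

# LiteLLM's view of the request: 18 prompt + 1224 completion tokens
litellm_cost = bedrock_cost(18, 1224)       # 0.018414, matches x-litellm-response-cost

# Roo Code's view of the same call: 56.1k input + 4.9k output tokens
roocode_cost = bedrock_cost(56_100, 4_900)  # 0.2418, matches Roo Code's reported cost

print(round(litellm_cost, 6), round(roocode_cost, 4))
```

So the per-token pricing is applied consistently on both sides; the discrepancy is in the token counts themselves (18 vs. 56.1k input tokens), not in the cost formula.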
I don't know what Roo Code is doing, but I suppose the issue lies at the LiteLLM token-counting level.
For the current month, LiteLLM reports $26.74 for Claude 3.5 Sonnet V2, whereas AWS shows $770... I'm pretty sure the issue is related to Roo Code: previously no one was using this model, but multiple users have been on it since they started using Roo Code.
Relevant log output
Are you a ML Ops Team?
No
What LiteLLM version are you on?
v1.59.9
Twitter / LinkedIn details
No response