
[Bug]: Invalid token/cost calculation #8514

Open

mboret opened this issue Feb 13, 2025 · 8 comments

Labels: bug (Something isn't working), feb 2025, spend tracking

mboret commented Feb 13, 2025

What happened?

Hi,

I've discovered a significant discrepancy between the cost reported by LiteLLM and the cost reported by AWS Bedrock when using the Claude 3.5 Sonnet V2 model. Based on my tests, the cost definition itself is correct: for a standard request (chat completion), when I analyze the request/response, the prompt token count looks coherent and the computed price matches AWS pricing.

My issue seems to occur when using Roo Code: I think the input token count from LiteLLM is incorrect.

Details of my test (with a new virtual key):

  1. Standard request (e.g., the simple question "Explain to me in detail how decorators work in Python"):
    x-litellm-response-cost: 0.018414
    completion_tokens: 1224
    prompt_tokens: 18

    In the UI: 0.018

  2. With Roo Code, I asked it to improve a file:
    Roo Code Input Tokens: 56.1k
    Roo Code Output Tokens: 4.9k
    Roo Code Cost: $0.2418

    litellm UI (total cost for my virtual key): 0.0915; subtracting my previous request's cost (0.018) leaves $0.0735 for this call
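
These figures line up with on-demand Bedrock pricing for Claude 3.5 Sonnet of $3 per million input tokens and $15 per million output tokens (an assumed rate; pricing isn't quoted in the issue), and a quick check suggests the LiteLLM-side figure for the Roo Code call equals the output-token cost alone:

# Sanity check of the numbers above, assuming on-demand Bedrock pricing
# for Claude 3.5 Sonnet: $3 / 1M input tokens, $15 / 1M output tokens.
IN_PRICE = 3 / 1_000_000
OUT_PRICE = 15 / 1_000_000

def cost(prompt_tokens, completion_tokens):
    return prompt_tokens * IN_PRICE + completion_tokens * OUT_PRICE

print(round(cost(18, 1_224), 6))      # 0.018414 -> matches x-litellm-response-cost
print(round(cost(56_100, 4_900), 4))  # 0.2418   -> matches the Roo Code cost
print(round(4_900 * OUT_PRICE, 4))    # 0.0735   -> matches the litellm UI figure:
                                      # the output cost alone, as if the 56.1k
                                      # input tokens were never counted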

I don't know exactly what Roo Code is doing, but I suspect the issue lies in how litellm counts the input tokens.

For the current month, litellm is reporting $26.74 for Claude 3.5 Sonnet V2, whereas AWS shows $770... I'm pretty sure the issue is related to Roo Code: previously no one was using this model, but multiple users have been hitting it since they started using Roo Code.

Relevant log output

Are you a ML Ops Team?

No

What LiteLLM version are you on?

v1.59.9

Twitter / LinkedIn details

No response

krrishdholakia (Contributor) commented

Hi @mboret, what do your spend logs show for a Bedrock call made via Roo Code?

And can you try bumping your LiteLLM version to see if that fixes it?

cc: @ishaan-jaff

PradyMagal commented

Going to begin work on this.

mboret (Author) commented Feb 19, 2025

After doing some tests, I'm even more confused...

I've created a local test (Docker container) with claude-3.5-sonnet-v2 and configured my Roo Code to use it. I tested a simple question and a file edit. The calculated prices look... correct. I'm using the same LiteLLM version as in production.

In production, my deployment runs in a k8s cluster with 2 replicas, a Redis pod, and an RDS PostgreSQL database.

One more thing: in production we are also connected to Azure to use OpenAI models, and the reported costs match between LiteLLM and Azure (mainly GPT-4o, used by other applications, not Roo Code).
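
For reference, a LiteLLM proxy config for a local Bedrock test like this might look like the sketch below; the model alias, Bedrock model ID, and region are illustrative assumptions, not values taken from this issue:

model_list:
  - model_name: claude-3.5-sonnet-v2   # alias exposed to clients such as Roo Code (assumed)
    litellm_params:
      model: bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0   # assumed Bedrock model ID
      aws_region_name: us-east-1   # adjust to your region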

krrishdholakia (Contributor) commented

Hey @mboret, what do your spend logs for Bedrock say in prod vs. the value received client-side?

mboret (Author) commented Feb 19, 2025

@krrishdholakia, how do I get the spend log without all the other logs? We have many requests in production, and if I set verbose mode to True, too many logs are generated.

krrishdholakia (Contributor) commented

Hey @mboret, this is an API endpoint: https://litellm-api.up.railway.app/#/Budget%20%26%20Spend%20Tracking/view_spend_logs_spend_logs_get

You can also check this in the UI (in more recent versions): UI -> Logs tab.
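
For instance, a minimal sketch of querying that endpoint for a single virtual key might look like this (host, port, and both keys are placeholders, not values from this issue):

import requests

# Minimal sketch: fetch spend-log entries for one virtual key from the
# LiteLLM proxy's /spend/logs endpoint. Host, port, and both keys below
# are placeholders.
resp = requests.get(
    "http://localhost:4000/spend/logs",
    params={"api_key": "sk-<virtual-key>"},
    headers={"Authorization": "Bearer <LITELLM_MASTER_KEY>"},
)
resp.raise_for_status()
for entry in resp.json():
    # Each log entry should carry the model, token counts, and computed spend,
    # which is what to compare against the client-side (Roo Code) numbers.
    print(entry.get("model"), entry.get("prompt_tokens"),
          entry.get("completion_tokens"), entry.get("spend"))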

mboret (Author) commented Feb 19, 2025

Ah yes, sorry. So the output:

{
  "key": "sk-...MgeQ",
  "info": {
    "key_name": "sk-...MgeQ",
    "key_alias": "test-billing2",
    "soft_budget_cooldown": false,
    "spend": 0.07978800000000001,
    "expires": null,
    "models": [
      "all-team-models"
    ],
    "aliases": {},
    "config": {},
    "user_id": null,
    "team_id": "1106dfa3-5f96-4f52-b8b7-c766e307508f",
    "permissions": {},
    "max_parallel_requests": null,
    "metadata": {},
    "blocked": null,
    "tpm_limit": 10,
    "rpm_limit": 2000,
    "max_budget": 100,
    "budget_duration": "30d",
    "budget_reset_at": "2025-03-21T14:48:58.985000+00:00",
    "allowed_cache_controls": [],
    "model_spend": {},
    "model_max_budget": {},
    "budget_id": null,
    "created_at": "2025-02-19T14:48:58.988000+00:00",
    "updated_at": "2025-02-19T15:23:03.626000+00:00",
    "litellm_budget_table": null
  }
}

(I no longer have the initial virtual key, but this one has the same configuration, and I still see the incorrect cost after requesting Claude Sonnet V2.)

krrishdholakia (Contributor) commented

That's the /key/info output.

I'm asking you for the spend log for that call via openwebui, so we can validate what the LiteLLM values are.
