
[Bug]: Invalid token/cost calculation #8514

Open

mboret opened this issue Feb 13, 2025 · 8 comments

Labels: bug (Something isn't working), feb 2025, spend tracking

mboret commented Feb 13, 2025

What happened?

Hi,

I've discovered a significant discrepancy between the cost reported by LiteLLM and the cost reported by AWS Bedrock when using the Claude 3.5 Sonnet V2 model. Based on my tests, the cost definition itself is correct: for a standard request (chat completion), when I analyze the request/response, the prompt token count looks coherent and the computed price matches AWS pricing.

My issue seems to occur when using Roo Code: I think the input token count from LiteLLM is incorrect.

Details of my test (with a new virtual key):

  1. Standard request (e.g., the simple question "Explain to me in detail how decorators work in Python"):
    x-litellm-response-cost: 0.018414
    completion_tokens: 1224
    prompt_tokens: 18

    In the UI: 0.018

  2. With Roo Code, I asked it to improve a file:
    Roo Code Input Tokens: 56.1k
    Roo Code Output Tokens: 4.9k
    Roo Code Cost: $0.2418

    litellm UI (total cost for my virtual key): 0.0915; subtracting my previous request's cost (0.018) leaves $0.0735 for this call
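
These figures line up with on-demand Bedrock pricing for Claude 3.5 Sonnet of $3 per million input tokens and $15 per million output tokens (an assumed rate; pricing isn't quoted in the issue), and a quick check suggests the LiteLLM-side figure for the Roo Code call equals the output-token cost alone:

# Sanity check of the numbers above, assuming on-demand Bedrock pricing
# for Claude 3.5 Sonnet: $3 / 1M input tokens, $15 / 1M output tokens.
IN_PRICE = 3 / 1_000_000
OUT_PRICE = 15 / 1_000_000

def cost(prompt_tokens, completion_tokens):
    return prompt_tokens * IN_PRICE + completion_tokens * OUT_PRICE

print(round(cost(18, 1_224), 6))      # 0.018414 -> matches x-litellm-response-cost
print(round(cost(56_100, 4_900), 4))  # 0.2418   -> matches the Roo Code cost
print(round(4_900 * OUT_PRICE, 4))    # 0.0735   -> matches the litellm UI figure:
                                      # the output cost alone, as if the 56.1k
                                      # input tokens were never counted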

I don't know exactly what Roo Code is doing, but I suspect the issue lies in how litellm counts the input tokens.

For the current month, litellm is reporting $26.74 for Claude 3.5 Sonnet V2, whereas AWS shows $770... I'm pretty sure the issue is related to Roo Code: previously no one was using this model, but multiple users have been hitting it since they started using Roo Code.

Relevant log output

Are you a ML Ops Team?

No

What LiteLLM version are you on?

v1.59.9

Twitter / LinkedIn details

No response

krrishdholakia (Contributor) commented

Hi @mboret, what do your spend logs show for a Bedrock call made via Roo Code?

And can you try bumping your LiteLLM version to see if that fixes it?

cc: @ishaan-jaff

PradyMagal commented

Going to begin work on this.

mboret (Author) commented Feb 19, 2025

After doing some tests, I'm even more confused...

I've created a local test (Docker container) with claude-3.5-sonnet-v2 and configured my Roo Code to use it. I tested a simple question and a file edit. The calculated prices look... correct. I'm using the same LiteLLM version as in production.

In production, my deployment runs in a k8s cluster with 2 replicas, a Redis pod, and an RDS PostgreSQL database.

One more thing: in production we are also connected to Azure to use OpenAI models, and the reported costs match between LiteLLM and Azure (mainly GPT-4o, used by other applications, not Roo Code).
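
For reference, a LiteLLM proxy config for a local Bedrock test like this might look like the sketch below; the model alias, Bedrock model ID, and region are illustrative assumptions, not values taken from this issue:

model_list:
  - model_name: claude-3.5-sonnet-v2   # alias exposed to clients such as Roo Code (assumed)
    litellm_params:
      model: bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0   # assumed Bedrock model ID
      aws_region_name: us-east-1   # adjust to your region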

krrishdholakia (Contributor) commented

Hey @mboret, what do your spend logs for Bedrock say in prod vs. the value received client-side?

mboret (Author) commented Feb 19, 2025

@krrishdholakia, how do I get the spend log without all the other logs? We have many requests in production, and if I set verbose mode to True, too many logs are generated.

krrishdholakia (Contributor) commented

Hey @mboret, this is an API endpoint: https://litellm-api.up.railway.app/#/Budget%20%26%20Spend%20Tracking/view_spend_logs_spend_logs_get

You can also check this in the UI (in more recent versions): UI -> Logs tab.
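
For instance, a minimal sketch of querying that endpoint for a single virtual key might look like this (host, port, and both keys are placeholders, not values from this issue):

import requests

# Minimal sketch: fetch spend-log entries for one virtual key from the
# LiteLLM proxy's /spend/logs endpoint. Host, port, and both keys below
# are placeholders.
resp = requests.get(
    "http://localhost:4000/spend/logs",
    params={"api_key": "sk-<virtual-key>"},
    headers={"Authorization": "Bearer <LITELLM_MASTER_KEY>"},
)
resp.raise_for_status()
for entry in resp.json():
    # Each log entry should carry the model, token counts, and computed spend,
    # which is what to compare against the client-side (Roo Code) numbers.
    print(entry.get("model"), entry.get("prompt_tokens"),
          entry.get("completion_tokens"), entry.get("spend"))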

mboret (Author) commented Feb 19, 2025

Ah yes, sorry. So the output:

{
  "key": "sk-...MgeQ",
  "info": {
    "key_name": "sk-...MgeQ",
    "key_alias": "test-billing2",
    "soft_budget_cooldown": false,
    "spend": 0.07978800000000001,
    "expires": null,
    "models": [
      "all-team-models"
    ],
    "aliases": {},
    "config": {},
    "user_id": null,
    "team_id": "1106dfa3-5f96-4f52-b8b7-c766e307508f",
    "permissions": {},
    "max_parallel_requests": null,
    "metadata": {},
    "blocked": null,
    "tpm_limit": 10,
    "rpm_limit": 2000,
    "max_budget": 100,
    "budget_duration": "30d",
    "budget_reset_at": "2025-03-21T14:48:58.985000+00:00",
    "allowed_cache_controls": [],
    "model_spend": {},
    "model_max_budget": {},
    "budget_id": null,
    "created_at": "2025-02-19T14:48:58.988000+00:00",
    "updated_at": "2025-02-19T15:23:03.626000+00:00",
    "litellm_budget_table": null
  }
}

(I no longer have the initial virtual key, but this one has the same configuration, and I still see the incorrect cost after requesting Claude Sonnet V2.)

krrishdholakia (Contributor) commented

That's the /key/info output.

I'm asking you for the spend log for that call via openwebui, so we can validate what the LiteLLM values are.
