
stream.get_final_message() does not return the correct usage of output_tokens #424

Closed
sirius422 opened this issue Mar 28, 2024 · 9 comments


@sirius422

As the title says, stream.get_final_message() always returns output_tokens with a value of 1.
Running the example code examples/messages_stream.py, the output looks like:

Hello there!
accumulated message:  {
  "id": "REDACTED",
  "content": [
    {
      "text": "Hello there!",
      "type": "text"
    }
  ],
  "model": "claude-3-opus-20240229",
  "role": "assistant",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "type": "message",
  "usage": {
    "input_tokens": 11,
    "output_tokens": 1
  }
}

However, the actual output_tokens should be 6, according to the raw HTTP stream response:

event: message_start
data: {"type":"message_start","message":{"id":"REDACTED","type":"message","role":"assistant","content":[],"model":"claude-3-opus-20240229","stop_reason":null,"stop_sequence":null,"usage":{"input_tokens":11,"output_tokens":1}}   }

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}     }

event: ping
data: {"type": "ping"}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hello"}              }

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" there"}  }

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"!"}            }

event: content_block_stop
data: {"type":"content_block_stop","index":0         }

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn","stop_sequence":null},"usage":{"output_tokens":6}               }

event: message_stop
data: {"type":"message_stop"           }

So, is this a bug or a feature? I've seen someone in issue #417 using stream.get_final_message() to obtain usage information. If output_tokens is always 1, that won't work properly.
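For reference, the accumulation these events imply can be sketched as follows. This is a minimal, hypothetical sketch (not the SDK's actual code): the key point is that the usage in the final message_delta event is cumulative and should overwrite the placeholder value from message_start.

```python
# Simplified events mirroring the raw stream above (content_block_start,
# ping, and content_block_stop omitted; they don't affect usage).
events = [
    {"type": "message_start",
     "message": {"usage": {"input_tokens": 11, "output_tokens": 1}}},
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "Hello"}},
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": " there"}},
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "!"}},
    {"type": "message_delta",
     "delta": {"stop_reason": "end_turn"},
     "usage": {"output_tokens": 6}},
    {"type": "message_stop"},
]

def accumulate(events: list[dict]) -> dict:
    message: dict = {}
    text_parts: list[str] = []
    for event in events:
        if event["type"] == "message_start":
            message = dict(event["message"])
            # Copy usage so we don't mutate the original event.
            message["usage"] = dict(message["usage"])
        elif event["type"] == "content_block_delta":
            text_parts.append(event["delta"]["text"])
        elif event["type"] == "message_delta":
            # The final usage is cumulative: it replaces, not adds to,
            # the placeholder output_tokens from message_start.
            message["usage"]["output_tokens"] = event["usage"]["output_tokens"]
    message["text"] = "".join(text_parts)
    return message

final = accumulate(events)
print(final["usage"])  # {'input_tokens': 11, 'output_tokens': 6}
```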

@sirius422
Author

Adding a line to accumulate_event in src/anthropic/lib/streaming/_messages.py seems to fix the issue. Should I submit a pull request? With the change applied, the output becomes:

Hello there!
accumulated message:  {
  "id": "REDACTED",
  "content": [
    {
      "text": "Hello there!",
      "type": "text"
    }
  ],
  "model": "claude-3-opus-20240229",
  "role": "assistant",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "type": "message",
  "usage": {
    "input_tokens": 11,
    "output_tokens": 6
  }
}
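A sketch of roughly the kind of one-line change being described (hypothetical; the names and data shapes are illustrative and may not match the actual code in _messages.py):

```python
# Hypothetical sketch of the fix: when a message_delta event arrives,
# copy its cumulative usage into the accumulated message snapshot.
def accumulate_event(snapshot: dict, event: dict) -> dict:
    if event["type"] == "message_delta":
        snapshot["stop_reason"] = event["delta"].get("stop_reason")
        # The added line: carry the updated output_tokens into the snapshot,
        # replacing the placeholder value from message_start.
        snapshot["usage"]["output_tokens"] = event["usage"]["output_tokens"]
    return snapshot

snapshot = {"usage": {"input_tokens": 11, "output_tokens": 1}, "stop_reason": None}
delta_event = {
    "type": "message_delta",
    "delta": {"stop_reason": "end_turn", "stop_sequence": None},
    "usage": {"output_tokens": 6},
}
print(accumulate_event(snapshot, delta_event)["usage"])
# {'input_tokens': 11, 'output_tokens': 6}
```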

@rattrayalex
Collaborator

rattrayalex commented Mar 29, 2024

Thanks for the report & the PR!

@WesleyYue

Is there a timeline on this being fixed? What's the blocker on merging the PR?

@rattrayalex
Collaborator

This was fixed 3 weeks ago: anthropics/anthropic-sdk-typescript#361

@krschacht

I’m wondering if this fix is mistaken. Shouldn’t output_tokens be the sum of the value in message_start (output_tokens = 1) and the value in the final message_delta (output_tokens = 6)?

(there was no discussion on the PR so I’m adding this comment to where it seemed like the real discussion was happening)

@rattrayalex
Collaborator

@krschacht what behavior are you seeing?

@krschacht

@rattrayalex Right now output_tokens is being set to the value contained in the final message_delta. Is that the total output token count for the full stream? I’ve been assuming we need to sum the initial value and the final one to get the total, but the docs don’t actually specify the meaning of those fields.

@RobertCraigie
Collaborator

RobertCraigie commented Jul 15, 2024

@krschacht I believe the current behaviour is correct: if you make a non-streaming request with the same inputs, the usage will be exactly the same. For example, here is an updated version of the messages_stream.py example:

import asyncio
from anthropic import AsyncAnthropic

client = AsyncAnthropic()

async def main() -> None:
    async with client.messages.stream(
        max_tokens=1024,
        messages=[
            {
                "role": "user",
                "content": "Say hello there!",
            }
        ],
        model="claude-3-opus-20240229",
    ) as stream:
        await stream.until_done()

    accumulated = await stream.get_final_message()
    print("accumulated message: ", accumulated.to_json())

    api_message = await client.messages.create(
        max_tokens=1024,
        messages=[
            {
                "role": "user",
                "content": "Say hello there!",
            }
        ],
        model="claude-3-opus-20240229",
    )
    print("api message: ", api_message.to_json())

asyncio.run(main())

Running this script gives this output for me:

accumulated message:  {
  "id": "msg_01JngEuyKqL3QmvpnivBwAv3",
  "content": [
    {
      "text": "Hello there!",
      "type": "text"
    }
  ],
  "model": "claude-3-opus-20240229",
  "role": "assistant",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "type": "message",
  "usage": {
    "input_tokens": 11,
    "output_tokens": 6
  }
}
api message: {
  "id": "msg_01Ng2rJDRPPCHeN2ULLf4BAA",
  "content": [
    {
      "text": "Hello there!",
      "type": "text"
    }
  ],
  "model": "claude-3-opus-20240229",
  "role": "assistant",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "type": "message",
  "usage": {
    "input_tokens": 11,
    "output_tokens": 6
  }
}

I'll let the Anthropic team know that the docs weren't helpful here! Which docs were you looking at?

@krschacht

Ohh, got it, so that final count is indeed the total and the initial one can be ignored. I wonder why the API includes the initial one then? Anyway, thanks for the quick reply!
