
Streaming with tools #454

Closed
dested opened this issue Jun 30, 2024 · 11 comments

Comments

@dested commented Jun 30, 2024

Is it possible to receive a real time stream for tools? It returns the proper values but the actual message stream seems to pause for quite a while before dumping the entire JSON value. Is this a limitation?

@Moe03 commented Jul 4, 2024

+1 really important, on my end if a tool is used it just crashes the whole stream, with gpt-4o streaming with tools it never does that.

@dgellow commented Jul 5, 2024

Hi, the SDK team is looking into this issue.

@rattrayalex (Collaborator) commented

> on my end if a tool is used it just crashes the whole stream

Sorry, @Moe03, can you provide an example that reproduces this?

@Ethan-Arrowood commented

Hi folks, can you please provide a minimal reproduction of the issue? I've tried the Tool Streaming example here: https://github.com/anthropics/anthropic-sdk-typescript/blob/a1c2fbd4a0d6772de55fb86bc54baeadc2252d4a/examples/tools-streaming.ts and it does not exhibit any delays.

@Moe03 commented Jul 9, 2024

> > on my end if a tool is used it just crashes the whole stream
>
> Sorry, @Moe03, can you provide an example that reproduces this?

@Ethan-Arrowood @rattrayalex

The following streams fine: if you tell it "hello" it responds without issues. But let the model use any tool, as in the following example where I ask it for the weather, and the stream crashes instantly after the weather tool is used:

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
    apiKey: process.env.ANTHROPIC_API_KEY
});

async function main() {
    const stream = client.messages.stream({
        messages: [{ role: 'user', content: 'What is the weather in San Francisco?' }],
        model: 'claude-3-5-sonnet-20240620',
        max_tokens: 1024,
        tools: [
            {
                name: 'get_weather',
                description: 'Get the current weather in a given location',
                input_schema: {
                    type: 'object',
                    properties: {
                        location: {
                            type: 'string',
                            description: 'The city and state, e.g. San Francisco, CA'
                        }
                    },
                    required: ['location']
                }
            }
        ]
        // note: no `stream: true` here — the .stream() helper streams by definition
    }).on('text', (text) => {
        console.log(text);
    });
    await stream.done();
}
main();

I've also tested the REST API directly with curl, and the same thing happened:

  curl https://api.anthropic.com/v1/messages \
    -H "content-type: application/json" \
    -H "x-api-key: ANTHROPIC_API_KEY" \
    -H "anthropic-version: 2023-06-01" \
    -d '{
      "model": "claude-3-5-sonnet-20240620",
      "max_tokens": 1024,
      "tools": [
        {
          "name": "get_weather",
          "description": "Get the current weather in a given location",
          "input_schema": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA"
              }
            },
            "required": ["location"]
          }
        }
      ],
      "tool_choice": {"type": "any"},
      "messages": [
        {
          "role": "user",
          "content": "What is the weather like in San Francisco?"
        }
      ],
      "stream": true
    }'

Result logs:

event: message_start
data: {"type":"message_start","message":{"id":"msg_01AzWMEwFPZDVhm4zqL6yK83","type":"message","role":"assistant","model":"claude-3-5-sonnet-20240620","content":[],"stop_reason":null,"stop_sequence":null,"usage":{"input_tokens":366,"output_tokens":8}}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"tool_use","id":"toolu_01CMgbUhe49X48SKjmhTLoWX","name":"get_weather","input":{}}}

event: ping
data: {"type": "ping"}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"input_json_delta","partial_json":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"input_json_delta","partial_json":"{\"locatio"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"input_json_delta","partial_json":"n\""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"input_json_delta","partial_json":": \"San "}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"input_json_delta","partial_json":"Franc"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"input_json_delta","partial_json":"isco, C"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"input_json_delta","partial_json":"A\"}"}}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"tool_use","stop_sequence":null},"usage":{"output_tokens":41}}

event: message_stop
data: {"type":"message_stop"}
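For reference, the input_json_delta fragments in logs like these can be reassembled client-side by concatenating each block's partial_json strings and parsing once content_block_stop arrives. A minimal sketch (the event shapes follow the streaming format shown above; this is not SDK code):

```typescript
// Reassemble tool inputs from input_json_delta events, keyed by block index.
type StreamEvent =
  | { type: 'content_block_start'; index: number; content_block: { type: string; name?: string } }
  | { type: 'content_block_delta'; index: number; delta: { type: string; partial_json?: string } }
  | { type: 'content_block_stop'; index: number };

function collectToolInputs(events: StreamEvent[]): Record<number, unknown> {
  const buffers: Record<number, string> = {};
  const result: Record<number, unknown> = {};
  for (const event of events) {
    if (event.type === 'content_block_delta' && event.delta.type === 'input_json_delta') {
      buffers[event.index] = (buffers[event.index] ?? '') + (event.delta.partial_json ?? '');
    } else if (event.type === 'content_block_stop' && event.index in buffers) {
      // An empty accumulated string means the tool was called with no input.
      result[event.index] = buffers[event.index] ? JSON.parse(buffers[event.index]) : {};
    }
  }
  return result;
}

// Using the exact fragments from the logs above:
const fragments = ['', '{"locatio', 'n"', ': "San ', 'Franc', 'isco, C', 'A"}'];
const events: StreamEvent[] = [
  { type: 'content_block_start', index: 0, content_block: { type: 'tool_use', name: 'get_weather' } },
  ...fragments.map((partial_json) => ({
    type: 'content_block_delta' as const,
    index: 0,
    delta: { type: 'input_json_delta', partial_json },
  })),
  { type: 'content_block_stop', index: 0 },
];
console.log(collectToolInputs(events)[0]); // { location: 'San Francisco, CA' }
```

The SDK's stream helper does this accumulation internally, but doing it by hand makes clear that the fragments above do form valid JSON once joined.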

I believe this might not be related to the SDK but rather to the REST API/model itself. With gpt-4o (or any OpenAI model) the stream continues without issues after using a tool: it streams the inputs, gets the response from the function, then continues the stream as if it had appended more context to itself.

I've found the same logs in the docs here: https://docs.anthropic.com/en/api/messages-streaming
So is this intentional? And if it is, can we have an option for the model to continue the stream even after using a tool, or does the model always end the stream whenever a tool is used? For our use case we have async tools that require the agent to know the result of each tool before continuing.

@Ethan-Arrowood commented

Thank you for the additional information. This seems like an API issue, so we will route this to the Anthropic team.

@aaron-lerner commented

I think there are two different issues in here.

The first issue is pausing/buffering:

> Is it possible to receive a real time stream for tools? It returns the proper values but the actual message stream seems to pause for quite a while before dumping the entire JSON value. Is this a limitation?

Unfortunately, you may notice the effects of some buffering while streaming tools. This is a side effect of how our models use tools and we hope to improve this in the future.

The second talks about the stream crashing:

> +1 really important, on my end if a tool is used it just crashes the whole stream

> The following does stream fine, if you tell it "hello" it will respond without issues, but then let the AI use any tool like in the following example when i ask it for the weather and the stream crashes instantly after using the weather tool:

Looking at that example, I'm not seeing any crashing. The model calls a tool, then you get a message_delta event showing token usage, and then message_stop to indicate it's done. I don't see any errors and it looks like the curl exited cleanly. Can you clarify what "crashing" you're seeing?
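(For reference, the usual way to continue after a tool_use stop is to run the tool yourself, then make a second streaming request that echoes the assistant's tool_use block and adds a user tool_result block. A minimal sketch of the message shapes — the weather value is hypothetical and the actual API call is omitted:)

```typescript
// Follow-up request body after a tool_use stop.
// toolUseId comes from the content_block_start event in the first stream;
// weatherReport is a hypothetical local result from running get_weather.
const toolUseId = 'toolu_01CMgbUhe49X48SKjmhTLoWX';
const weatherReport = '72°F, sunny';

const followUpMessages = [
  { role: 'user', content: 'What is the weather like in San Francisco?' },
  {
    // Echo back the assistant turn that requested the tool.
    role: 'assistant',
    content: [
      {
        type: 'tool_use',
        id: toolUseId,
        name: 'get_weather',
        input: { location: 'San Francisco, CA' },
      },
    ],
  },
  {
    // Supply the tool result; the next stream continues from here.
    role: 'user',
    content: [{ type: 'tool_result', tool_use_id: toolUseId, content: weatherReport }],
  },
];

// A second client.messages.stream({ messages: followUpMessages, ... }) call
// would then stream the model's final answer as ordinary text deltas.
```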

@dested (Author) commented Jul 10, 2024

For the record, I solved this by just passing the JSON Schema in the system prompt and not using the tool functionality directly when streaming.

@Moe03 commented Jul 11, 2024

> I think there's 2 different issues in here?
>
> The first issue is pausing/buffering:
>
> > Is it possible to receive a real time stream for tools? It returns the proper values but the actual message stream seems to pause for quite a while before dumping the entire JSON value. Is this a limitation?
>
> Unfortunately, you may notice the effects of some buffering while streaming tools. This is a side effect of how our models use tools and we hope to improve this in the future.
>
> The second talks about the stream crashing:
>
> > +1 really important, on my end if a tool is used it just crashes the whole stream
>
> > The following does stream fine, if you tell it "hello" it will respond without issues, but then let the AI use any tool like in the following example when i ask it for the weather and the stream crashes instantly after using the weather tool:
>
> Looking at that example, I'm not seeing any crashing. The model calls a tool, then you get a message_delta event showing token usage, and then message_stop to indicate it's done. I don't see any errors and it looks like the curl exited cleanly. Can you clarify what "crashing" you're seeing?

I believe this is the intended behavior then. Thank you for clarifying.

@rekdt commented Jul 16, 2024

> For the record, I solve this by just passing the JSONSchema in the system prompt and not using the tool functionality directly when streaming.

What does the JSON schema look like? Are you including both tool_use and tool_result?

@dested (Author) commented Jul 18, 2024

> > For the record, I solve this by just passing the JSONSchema in the system prompt and not using the tool functionality directly when streaming.
>
> What does the json schema look like? Are you including both tool_use and tool_results?

I use zod-to-json-schema to convert my Zod schema to JSON Schema, then pass it in the system prompt as JSON.stringify(schema), with no tools at all. I give it a short message like "Here is the JSON Schema I want you to respond in." It has worked well for me, but your mileage may vary!
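A sketch of this workaround follows. The schema here is hand-written for self-containment (dested's version generates it from Zod with zod-to-json-schema), and the field names are illustrative:

```typescript
// Build a system prompt that embeds the desired output schema,
// instead of declaring `tools` on the request.
const weatherSchema = {
  type: 'object',
  properties: {
    location: { type: 'string', description: 'The city and state, e.g. San Francisco, CA' },
    temperature_f: { type: 'number' },
  },
  required: ['location', 'temperature_f'],
};

const systemPrompt = [
  'Here is the JSON Schema I want you to respond in.',
  'Respond with a single JSON object and nothing else:',
  JSON.stringify(weatherSchema),
].join('\n');

// Pass systemPrompt as the `system` field of an ordinary streaming request.
// Because the model emits the JSON as plain text, it arrives via regular
// text deltas instead of the buffered input_json_delta path.
```

The trade-off is that nothing enforces the schema server-side, so the response should still be validated (e.g. with the original Zod schema) after the stream completes.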
