
Streaming with tools #454

Closed
dested opened this issue Jun 30, 2024 · 11 comments

Comments

@dested commented Jun 30, 2024

Is it possible to receive a real time stream for tools? It returns the proper values but the actual message stream seems to pause for quite a while before dumping the entire JSON value. Is this a limitation?

@Moe03 commented Jul 4, 2024

+1 really important, on my end if a tool is used it just crashes the whole stream, with gpt-4o streaming with tools it never does that.

@dgellow commented Jul 5, 2024

Hi, the SDK team is looking into this issue.

@rattrayalex (Collaborator) commented

> on my end if a tool is used it just crashes the whole stream

Sorry, @Moe03, can you provide an example that reproduces this?

@Ethan-Arrowood commented

Hi folks, can you please provide a minimal reproduction of the issue? I've tried the Tool Streaming example here: https://github.com/anthropics/anthropic-sdk-typescript/blob/a1c2fbd4a0d6772de55fb86bc54baeadc2252d4a/examples/tools-streaming.ts and it does not exhibit any delays.

@Moe03 commented Jul 9, 2024

> > on my end if a tool is used it just crashes the whole stream
>
> Sorry, @Moe03, can you provide an example that reproduces this?

@Ethan-Arrowood @rattrayalex

The following streams fine: if you tell it "hello" it responds without issues. But let the model use any tool, as in the following example where I ask it for the weather, and the stream crashes instantly after the weather tool is used:

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
    apiKey: process.env.ANTHROPIC_API_KEY
});

async function main() {
    const stream = client.messages.stream({
        messages: [{ role: 'user', content: 'What is the weather in San Francisco?' }],
        model: 'claude-3-5-sonnet-20240620',
        max_tokens: 1024,
        tools: [
            {
                name: 'get_weather',
                description: 'Get the current weather in a given location',
                input_schema: {
                    type: 'object',
                    properties: {
                        location: {
                            type: 'string',
                            description: 'The city and state, e.g. San Francisco, CA'
                        }
                    },
                    required: ['location']
                }
            }
        ]
        // note: no `stream: true` here — the .stream() helper streams by definition
    }).on('text', (text) => {
        console.log(text);
    });
    await stream.done();
}
main();

I've also tested the REST API directly with curl, and the same thing happened:

  curl https://api.anthropic.com/v1/messages \
    -H "content-type: application/json" \
    -H "x-api-key: ANTHROPIC_API_KEY" \
    -H "anthropic-version: 2023-06-01" \
    -d '{
      "model": "claude-3-5-sonnet-20240620",
      "max_tokens": 1024,
      "tools": [
        {
          "name": "get_weather",
          "description": "Get the current weather in a given location",
          "input_schema": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA"
              }
            },
            "required": ["location"]
          }
        }
      ],
      "tool_choice": {"type": "any"},
      "messages": [
        {
          "role": "user",
          "content": "What is the weather like in San Francisco?"
        }
      ],
      "stream": true
    }'

Result logs:

event: message_start
data: {"type":"message_start","message":{"id":"msg_01AzWMEwFPZDVhm4zqL6yK83","type":"message","role":"assistant","model":"claude-3-5-sonnet-20240620","content":[],"stop_reason":null,"stop_sequence":null,"usage":{"input_tokens":366,"output_tokens":8}}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"tool_use","id":"toolu_01CMgbUhe49X48SKjmhTLoWX","name":"get_weather","input":{}}}

event: ping
data: {"type": "ping"}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"input_json_delta","partial_json":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"input_json_delta","partial_json":"{\"locatio"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"input_json_delta","partial_json":"n\""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"input_json_delta","partial_json":": \"San "}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"input_json_delta","partial_json":"Franc"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"input_json_delta","partial_json":"isco, C"}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"input_json_delta","partial_json":"A\"}"}}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"tool_use","stop_sequence":null},"usage":{"output_tokens":41}}

event: message_stop
data: {"type":"message_stop"}
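For reference, the input_json_delta fragments in logs like these can be reassembled client-side by concatenating each block's partial_json strings and parsing once content_block_stop arrives. A minimal sketch (the event shapes follow the streaming format shown above; this is not SDK code):

```typescript
// Reassemble tool inputs from input_json_delta events, keyed by block index.
type StreamEvent =
  | { type: 'content_block_start'; index: number; content_block: { type: string; name?: string } }
  | { type: 'content_block_delta'; index: number; delta: { type: string; partial_json?: string } }
  | { type: 'content_block_stop'; index: number };

function collectToolInputs(events: StreamEvent[]): Record<number, unknown> {
  const buffers: Record<number, string> = {};
  const result: Record<number, unknown> = {};
  for (const event of events) {
    if (event.type === 'content_block_delta' && event.delta.type === 'input_json_delta') {
      buffers[event.index] = (buffers[event.index] ?? '') + (event.delta.partial_json ?? '');
    } else if (event.type === 'content_block_stop' && event.index in buffers) {
      // An empty accumulated string means the tool was called with no input.
      result[event.index] = buffers[event.index] ? JSON.parse(buffers[event.index]) : {};
    }
  }
  return result;
}

// Using the exact fragments from the logs above:
const fragments = ['', '{"locatio', 'n"', ': "San ', 'Franc', 'isco, C', 'A"}'];
const events: StreamEvent[] = [
  { type: 'content_block_start', index: 0, content_block: { type: 'tool_use', name: 'get_weather' } },
  ...fragments.map((partial_json) => ({
    type: 'content_block_delta' as const,
    index: 0,
    delta: { type: 'input_json_delta', partial_json },
  })),
  { type: 'content_block_stop', index: 0 },
];
console.log(collectToolInputs(events)[0]); // { location: 'San Francisco, CA' }
```

The SDK's stream helper does this accumulation internally, but doing it by hand makes clear that the fragments above do form valid JSON once joined.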

I believe this might not be related to the SDK but rather to the REST API/model itself. With gpt-4o (or any OpenAI model) the stream continues without issues after using a tool: it streams the inputs, gets the response from the function, then continues the stream as if it had appended more context to itself.

I've found the same logs in the docs here: https://docs.anthropic.com/en/api/messages-streaming
So is this intentional? And if it is, can we have an option for the model to continue the stream even after using a tool, or does the model always end the stream whenever a tool is used? For our use case we have async tools that require the agent to know the result of each tool before continuing.

@Ethan-Arrowood commented

Thank you for the additional information. This seems like an API issue, so we will route this to the Anthropic team.

@aaron-lerner commented

I think there are two different issues in here.

The first issue is pausing/buffering:

> Is it possible to receive a real time stream for tools? It returns the proper values but the actual message stream seems to pause for quite a while before dumping the entire JSON value. Is this a limitation?

Unfortunately, you may notice the effects of some buffering while streaming tools. This is a side effect of how our models use tools and we hope to improve this in the future.

The second talks about the stream crashing:

> +1 really important, on my end if a tool is used it just crashes the whole stream

> The following does stream fine, if you tell it "hello" it will respond without issues, but then let the AI use any tool like in the following example when i ask it for the weather and the stream crashes instantly after using the weather tool:

Looking at that example, I'm not seeing any crashing. The model calls a tool, then you get a message_delta event showing token usage, and then message_stop to indicate it's done. I don't see any errors and it looks like the curl exited cleanly. Can you clarify what "crashing" you're seeing?
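(For reference, the usual way to continue after a tool_use stop is to run the tool yourself, then make a second streaming request that echoes the assistant's tool_use block and adds a user tool_result block. A minimal sketch of the message shapes — the weather value is hypothetical and the actual API call is omitted:)

```typescript
// Follow-up request body after a tool_use stop.
// toolUseId comes from the content_block_start event in the first stream;
// weatherReport is a hypothetical local result from running get_weather.
const toolUseId = 'toolu_01CMgbUhe49X48SKjmhTLoWX';
const weatherReport = '72°F, sunny';

const followUpMessages = [
  { role: 'user', content: 'What is the weather like in San Francisco?' },
  {
    // Echo back the assistant turn that requested the tool.
    role: 'assistant',
    content: [
      {
        type: 'tool_use',
        id: toolUseId,
        name: 'get_weather',
        input: { location: 'San Francisco, CA' },
      },
    ],
  },
  {
    // Supply the tool result; the next stream continues from here.
    role: 'user',
    content: [{ type: 'tool_result', tool_use_id: toolUseId, content: weatherReport }],
  },
];

// A second client.messages.stream({ messages: followUpMessages, ... }) call
// would then stream the model's final answer as ordinary text deltas.
```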

@dested (Author) commented Jul 10, 2024

For the record, I solved this by just passing the JSON Schema in the system prompt and not using the tool functionality directly when streaming.

@Moe03 commented Jul 11, 2024

> I think there's 2 different issues in here?
>
> The first issue is pausing/buffering:
>
> > Is it possible to receive a real time stream for tools? It returns the proper values but the actual message stream seems to pause for quite a while before dumping the entire JSON value. Is this a limitation?
>
> Unfortunately, you may notice the effects of some buffering while streaming tools. This is a side effect of how our models use tools and we hope to improve this in the future.
>
> The second talks about the stream crashing:
>
> > +1 really important, on my end if a tool is used it just crashes the whole stream
>
> > The following does stream fine, if you tell it "hello" it will respond without issues, but then let the AI use any tool like in the following example when i ask it for the weather and the stream crashes instantly after using the weather tool:
>
> Looking at that example, I'm not seeing any crashing. The model calls a tool, then you get a message_delta event showing token usage, and then message_stop to indicate it's done. I don't see any errors and it looks like the curl exited cleanly. Can you clarify what "crashing" you're seeing?

I believe this is the intended behavior then. Thank you for clarifying.

@rekdt commented Jul 16, 2024

> For the record, I solve this by just passing the JSONSchema in the system prompt and not using the tool functionality directly when streaming.

What does the JSON schema look like? Are you including both tool_use and tool_result?

@dested (Author) commented Jul 18, 2024

> > For the record, I solve this by just passing the JSONSchema in the system prompt and not using the tool functionality directly when streaming.
>
> What does the json schema look like? Are you including both tool_use and tool_results?

I use zod-to-json-schema to convert my Zod schema to JSON Schema, then pass it in the system prompt as JSON.stringify(schema), with no tools at all. I give it a short message like "Here is the JSON Schema I want you to respond in." It has worked well for me, but your mileage may vary!
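A sketch of this workaround follows. The schema here is hand-written for self-containment (dested's version generates it from Zod with zod-to-json-schema), and the field names are illustrative:

```typescript
// Build a system prompt that embeds the desired output schema,
// instead of declaring `tools` on the request.
const weatherSchema = {
  type: 'object',
  properties: {
    location: { type: 'string', description: 'The city and state, e.g. San Francisco, CA' },
    temperature_f: { type: 'number' },
  },
  required: ['location', 'temperature_f'],
};

const systemPrompt = [
  'Here is the JSON Schema I want you to respond in.',
  'Respond with a single JSON object and nothing else:',
  JSON.stringify(weatherSchema),
].join('\n');

// Pass systemPrompt as the `system` field of an ordinary streaming request.
// Because the model emits the JSON as plain text, it arrives via regular
// text deltas instead of the buffered input_json_delta path.
```

The trade-off is that nothing enforces the schema server-side, so the response should still be validated (e.g. with the original Zod schema) after the stream completes.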
