"Unexpected end of JSON input" when streaming on edge environments (Vercel Edge, Cloudflare Workers) #292

Closed
venables opened this issue Feb 14, 2024 · 16 comments


@venables

The SDK seems to operate fine in a Node.js environment, but when running in an Edge runtime (a browser-like environment) such as Vercel Edge or Cloudflare Workers, streaming gets cut off with the following exception:

Could not parse message into JSON: 
From chunk: [ 'event: content_block_delta' ]

SyntaxError: Unexpected end of JSON input
    at (node_modules/@anthropic-ai/sdk/streaming.mjs:58:39)
    at (app/api/test/route.js:15:19)
    at (node_modules/next/dist/esm/server/future/route-modules/app-route/module.js:189:36)
    at (node_modules/next/dist/esm/server/future/route-modules/app-route/module.js:128:25)
    at (node_modules/next/dist/esm/server/future/route-modules/app-route/module.js:251:29)
    at (node_modules/next/dist/esm/server/web/edge-route-module-wrapper.js:81:20)
    at (node_modules/next/dist/esm/server/web/adapter.js:157:15)

The error is coming from this block: https://github.com/anthropics/anthropic-sdk-typescript/blob/main/src/streaming.ts#L69-L84

The line content is:

{
  event: 'content_block_delta',
  data: '',
  raw: [ 'event: content_block_delta' ]
}

Since the data is an empty string, the JSON parsing blows up. I can bypass this error if I modify the code to ignore empty strings, but that does not seem ideal.
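
For reference, this is roughly the workaround I mean (a minimal sketch, not the SDK's actual code; SSEMessage and parseEvents are hypothetical stand-ins for the decoder internals, and the data shape matches the logged object above):

// Sketch: skip SSE messages whose data payload is empty instead of
// letting JSON.parse throw on ''.
interface SSEMessage {
  event: string | null;
  data: string;
  raw: string[];
}

function* parseEvents(messages: Iterable<SSEMessage>): Generator<unknown> {
  for (const sse of messages) {
    if (sse.data.trim() === '') continue; // e.g. { event: 'content_block_delta', data: '' }
    yield JSON.parse(sse.data);
  }
}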

Reproduction repos:

I put the Streaming example from the Anthropic SDK README into both a Vercel Edge function and a Cloudflare Workers function, with the same failing result.

Note: the error occurs whether we use import "@anthropic-ai/sdk/shims/web"; or not.

Vercel Edge:

I've put together a sample repo using create-next-app and the example from your README: https://github.com/venables/anthropic-edge-stream-error

The file in question is app/api/test/route.ts. If you remove export const runtime = "edge", it works as expected.

This error will not occur locally, since the local environment is Node.js, but when you deploy to Vercel (with runtime = "edge" still in the code), you will consistently get the error.
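
For context, the route is essentially the README streaming example wrapped in a Next.js route handler; here is a minimal sketch of its shape (model, prompt, and response plumbing are illustrative, not the exact repo contents):

// app/api/test/route.ts (illustrative sketch)
import Anthropic from "@anthropic-ai/sdk";

export const runtime = "edge"; // removing this line avoids the error

export async function GET(): Promise<Response> {
  const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

  const stream = await client.messages.create({
    model: "claude-3-opus-20240229",
    max_tokens: 1024,
    messages: [{ role: "user", content: "Say hello" }],
    stream: true,
  });

  const encoder = new TextEncoder();
  const body = new ReadableStream<Uint8Array>({
    async start(controller) {
      // Forward only the text deltas to the client.
      for await (const event of stream) {
        if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
          controller.enqueue(encoder.encode(event.delta.text));
        }
      }
      controller.close();
    },
  });

  return new Response(body, { headers: { "Content-Type": "text/plain" } });
}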

Cloudflare Workers

If you want to reproduce this locally, you can do so using Wrangler and Cloudflare Workers, which spin up a real edge-like environment on your machine.

I created a sample repository here, using Hono as the router: https://github.com/venables/anthropic-stream-error-cf

The file in question here is src/index.ts

Running that locally and hitting the endpoint will fail.

@rattrayalex
Collaborator

Thanks for reporting!

cc @RobertCraigie can you take a look at this?

@izuchukwu

izuchukwu commented Mar 5, 2024

Hi y'all! @rattrayalex @RobertCraigie We're experiencing this same issue in Node.js environments as well. We've confirmed it in both Node.js and Bun after migrating to the Messages API today to adopt Claude 3. We don't experience this with the legacy Text Completions streaming API.

Update! It was not exactly the same issue. Instead, the new .stream API consistently fails for us with this same error, but we found we can use the old .create({stream: true, ...}) API with the new Messages API, and this works fine on Node.js & Bun. So if .stream fails for you in Node environments, .create({stream: true, ...}) still works. Hope this helps someone else!
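
To make the distinction concrete, here is a minimal sketch of the two call styles (model and prompt are placeholders, not our actual values):

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

async function demo() {
  // The .stream() helper, which was consistently failing for us:
  const helper = client.messages.stream({
    model: "claude-3-opus-20240229",
    max_tokens: 1024,
    messages: [{ role: "user", content: "Hello" }],
  });
  helper.on("text", (text) => process.stdout.write(text));
  await helper.finalMessage();

  // The raw .create({ stream: true }) form, which worked for us on Node.js & Bun:
  const raw = await client.messages.create({
    model: "claude-3-opus-20240229",
    max_tokens: 1024,
    messages: [{ role: "user", content: "Hello" }],
    stream: true,
  });
  for await (const event of raw) {
    if (event.type === "content_block_delta" && event.delta.type === "text_delta") {
      process.stdout.write(event.delta.text);
    }
  }
}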

@rattrayalex
Collaborator

rattrayalex commented Mar 5, 2024

Thank you for the update @izuchukwu ! We'll take a look at it shortly!

Could you share a script, ideally including prompt, that reproduces the problem you're seeing?

EDIT: we were not able to reproduce this locally.

@izuchukwu

izuchukwu commented Mar 5, 2024

Hi, unfortunately, we're now running into this issue with .create({stream: true, ...}) as well.

It's hard for me to identify the prompt because it happens when we run multiple large prompts in parallel. I can confirm it is the same problem - sse.event is either content_block_delta or content_block_start, but sse.data is an empty string, not a JSON object, so the yield JSON.parse(sse.data) line throws an error.

We consistently see the error, but exactly when we see it is very inconsistent, presumably because it's dependent on the server sending an empty string. We're able to run small prompts just fine. The prompts that trigger it write multiple paragraphs (~5) before tripping the error.

Here's a snippet. It's embedded in a library, so most options are passed as variables.

const stream = await client.messages.create({
    messages,
    model,
    max_tokens: 4096,
    temperature: options?.temp,
    top_p: options?.topP,
    system,
    stream: true
})

for await (const event of stream) {
  if (event.type !== 'content_block_delta') continue
  const chunk = event.delta.text
  
  // Process stream
  const shouldContinue = await onComplete?.(chunk) // onComplete is an async callback function
  if (!_.isNil(shouldContinue) && !shouldContinue) {
    stream.controller.abort()
  }
}

@lawetis

lawetis commented Mar 5, 2024

Sorry to add to this, but I've encountered the same problem here as well.

@stephtr

stephtr commented Mar 5, 2024

As venables said, patching the streaming.js file (see below) keeps the stream from aborting. Interestingly, a few tokens then still go missing in the response. (edit: that also happens without the patch)

EDIT: for a fix, see the response from dzhng

@tvytlx

tvytlx commented Mar 5, 2024

Any update on this issue? @rattrayalex

@RobertCraigie
Collaborator

Hey, would it be possible for anyone to share a request ID for a request that failed in this way? Unfortunately, we can't reproduce it.

You can get a request ID by setting the DEBUG env var to true; in your logs you'll see something like this:

Anthropic:DEBUG:response 200 https://api.anthropic.com/v1/messages Headers {
  [Symbol(map)]: [Object: null prototype] {
    date: [ 'Tue, 05 Mar 2024 17:54:09 GMT' ],
    'content-type': [ 'text/event-stream; charset=utf-8' ],
    'transfer-encoding': [ 'chunked' ],
    connection: [ 'keep-alive' ],
    'cache-control': [ 'no-cache' ],
    'request-id': [ 'req_01Vr3DL4pMCDk2kNHujJkpwf' ],
    via: [ '1.1 google' ],
    'cf-cache-status': [ 'DYNAMIC' ],
    server: [ 'cloudflare' ],
    'cf-ray': [ '85fbf82b8be16aaa-MAN' ]
  }
}

In this case the request ID was req_01Vr3DL4pMCDk2kNHujJkpwf.

@nyacg

nyacg commented Mar 5, 2024

Unfortunately, when I enable debug logging on Vercel the headers aren't printed 😞

Anthropic:DEBUG:response 200 https://api.anthropic.com/v1/messages Headers { } ReadableStream { }

I also applied the suggested patch #292 (comment), which seems to reduce the error frequency but does not bring it down to zero.
Edit: I don't think the patch was being applied properly; it now is, and I'm getting dropped tokens.

@nyacg

nyacg commented Mar 5, 2024

Doing some debugging myself, it looks like a potential bug in the LineDecoder.

Here's an extract of the logs where an error occurs. I'm printing the output of this.decodeText(chunk); in the LineDecoder.

Note: logs go from bottom to top

ata: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"ky"}}
event: content_block_delta d
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" with"}} event: content_block_delta data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" its"}} event: content_block_delta data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":" smo"}}
ERROR in iterMessages, sse:  '{"event":"content_block_delta","data":"","raw":["event: content_block_delta"]}' 
event: content_block_delta
ata: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":","}}
event: content_block_delta d
ata: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"uda"}}
event: content_block_delta d
ata: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"\n\nGo"}}

We get the output "Gouda, its smoky", i.e. the " with" delta is dropped.

Generally, the chunks that LineDecoder.decode gets fed are either:

  1. event: content_block_delta d
    then
  2. ata: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"<the delta>"}}

However, an error occurs when the first chunk is just event: content_block_delta (possibly with a trailing space). This leads to an SSE with sse.data set to an empty string, which is what throws the error. If we just continue instead of throwing, we then get the next SSE, which carries the delta.

@dzhng

dzhng commented Mar 6, 2024

OK, I fixed it. @nyacg was on the right track; the issue is in the LineDecoder class in streaming.ts.

Specifically, it's because the decode() method is directly ported from this Python implementation, but it missed an important behavioral difference between JS's split() method and Python's splitlines() method.

The error happens when decode() receives any input that ends in a newline, e.g.:
event: content_block_delta\r\n OR \r\n

In both of these cases, JS's split() method will add an extra empty string to the end of the lines array, whereas Python's splitlines() method will not. This causes empty lines to be passed through to the SSE decoding layer, which is what triggers this issue.

This is also why the previous patch doesn't work: it dropped tokens because the extra empty line caused whole data packets to be ignored.

It's got nothing to do with the edge environment; I suspect some network configuration on edge causes SSE packets to be smaller, making this issue more noticeable.
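
A quick way to see the difference in plain Node.js (the regexp below approximates the SDK's newline splitting; the Python results are shown as comments for comparison):

// JS: a trailing newline yields an extra empty element.
console.log("event: content_block_delta\r\n".split(/\r\n|[\n\r]/g));
// -> [ 'event: content_block_delta', '' ]
console.log("\r\n".split(/\r\n|[\n\r]/g));
// -> [ '', '' ]

// Python, by contrast:
//   "event: content_block_delta\r\n".splitlines()  ->  ['event: content_block_delta']
//   "\r\n".splitlines()                            ->  ['']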

For now, you can patch the package really easily:

--- a/streaming.js
+++ b/streaming.js
@@ -266,6 +266,9 @@ class LineDecoder {
         }
         const trailingNewline = LineDecoder.NEWLINE_CHARS.has(text[text.length - 1] || '');
         let lines = text.split(LineDecoder.NEWLINE_REGEXP);
+        if (trailingNewline) {
+            lines.pop();
+        }
         if (lines.length === 1 && !trailingNewline) {
             this.buffer.push(lines[0]);
             return [];

This accounts for the different split() behavior and aligns it with Python's splitlines() behavior.
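
If you manage patches with patch-package (an assumption about your setup, not something this thread requires), the flow is roughly: edit node_modules/@anthropic-ai/sdk/streaming.js as shown above, then run npx patch-package @anthropic-ai/sdk and commit the generated patch file so it is re-applied on install.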

I already created a patch for my llm-api lib if anyone wants the patch files: Commit for patch

@RobertCraigie
Collaborator

Ahh thank you so much for the detailed investigation and proposed patch @dzhng! We'll test and port this over to our side ASAP.

@rattrayalex
Collaborator

rattrayalex commented Mar 6, 2024

Fixed in #312, which should be released shortly.

@RobertCraigie
Collaborator

This fix was released in v0.17.0!

@izuchukwu

Amazing turnaround time on this, thank you both @rattrayalex & @RobertCraigie

@rattrayalex
Collaborator

Thank you for the details, help, and patience @izuchukwu , @dzhng , @nyacg , @venables , and others!
