
Session breaks upon parsing context #103

Closed
binarynoise opened this issue Jul 20, 2024 · 12 comments · Fixed by #125
Labels: bug (Something isn't working), released

Comments

binarynoise (Contributor) commented Jul 20, 2024


SyntaxError: JSON.parse: end of data when ',' or ']' was expected at line 1 column x of the JSON data

This is independent of the "knowledge as system prompt" experiment.

Here is the network traffic for that session as a HAR file: localhost_Archive [24-07-20 17-45-41].har.json. You can load it into Firefox or Chrome to see the details.

As far as I can tell, Ollama does send a valid response (in contrast to #97), but Hollama for some reason fails to parse the last line, the one containing the context. The column x varies wildly, around 30k.
It happens around the 5th or 6th response for me, with different models and prompts.
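For illustration, a line cut off mid-array (a made-up example, not taken from the actual response) reproduces the same error in Firefox:

```ts
// Made-up example: a response line truncated inside the `context` array.
const truncatedLine = '{"model":"llama3","done":true,"context":[128006,882,128007';
try {
  JSON.parse(truncatedLine);
} catch (e) {
  // In Firefox: SyntaxError: JSON.parse: end of data when ',' or ']' was expected ...
  console.error(e);
}
```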

As a consequence, the conversation's context is lost and a new conversation effectively begins when the last message is retried. Sometimes not everything gets lost, but the conversation is broken anyway.

As a workaround, would it be possible to hold onto the last valid context and use that when retrying the message?

Switching to another conversation and coming back to retry does actually trigger #97.

binarynoise (Contributor, Author):

Or just save the returned context for each message, to be able to start over at a given point when the model starts generating nonsense.
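A rough sketch of what I mean (the field and function names here are made up, not Hollama's actual code):

```ts
// Hypothetical sketch; names are illustrative only.
interface SessionMessage {
  role: 'user' | 'assistant';
  content: string;
  // `context` array returned by /api/generate after this message completed successfully.
  context?: number[];
}

// When retrying from message `i`, fall back to the most recent context that parsed correctly.
function lastValidContext(messages: SessionMessage[], i: number): number[] | undefined {
  for (let j = i - 1; j >= 0; j--) {
    const ctx = messages[j].context;
    if (ctx && ctx.length > 0) return ctx;
  }
  return undefined;
}
```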

fmaclen added the bug (Something isn't working) label on Jul 20, 2024
fmaclen (Owner) commented Jul 20, 2024

Thanks, I was able to replicate the issue.

I'm not entirely sure yet, but I suspect it's because we are trying to format the completion in "small chunks" as Ollama streams them, and one of the chunks is causing the parser to break.
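As a sketch of the kind of fix I have in mind (illustrative only, not the current code): buffer the raw stream and only parse complete newline-terminated lines, so a chunk boundary in the middle of one of Ollama's NDJSON objects can't break `JSON.parse`.

```ts
// Sketch: a fetch() chunk can end mid-object, so only parse complete lines.
async function readNdjsonStream(response: Response, onObject: (obj: unknown) => void) {
  const reader = response.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = '';

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });

    const lines = buffer.split('\n');
    buffer = lines.pop() ?? ''; // keep the possibly incomplete last line for the next chunk
    for (const line of lines) {
      if (line.trim()) onObject(JSON.parse(line));
    }
  }
  if (buffer.trim()) onObject(JSON.parse(buffer)); // whatever remains should be a complete object
}
```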

> As a workaround, would it be possible to hold onto the last valid context and use that when retrying the message?

Yeah, I wanted to implement something like this so messages can be retried or even edited. #9

fmaclen (Owner) commented Jul 20, 2024

Here's what I think is happening: when we catch an error during a completion (any kind of error), the prompt is reset so the user can try again, but when the prompt is re-submitted the context array with all of the tokens is lost.

fmaclen (Owner) commented Jul 20, 2024

I don't think we can predict when the JSON parsing will fail, but whenever there is any kind of error the UI should hint at a clear path to fixing it (if possible).

In a large number of cases I expect simply "retrying" should fix most issues: #9 (comment)

binarynoise (Contributor, Author):

#9 doesn't help at all for this problem. Regenerating the failing message, or even an earlier one, causes the same error again.

fmaclen (Owner) commented Jul 21, 2024

@binarynoise true, #103 won't be fixed by #106.

What I meant is that in order to close this issue we want to make sure you can indeed click "Retry" to continue the session as if the error never happened.

fmaclen (Owner) commented Jul 21, 2024

I can now reliably get this error:

SyntaxError: JSON.parse: end of data when ',' or ']' was expected at line 1 column x of the JSON data

I think this is caused by exceeding the model's context window limit, which causes the completion to get truncated (and that is what breaks the JSON parser). This is what I see in the Ollama logs when I retry the failed message:

```
[GIN] 2024/07/21 - 17:47:48 | 200 |  1.964507833s |             ::1 | POST     "/api/generate"
INFO [update_slots] input truncated | n_ctx=2048 n_erase=2767 n_keep=4 n_left=2044 n_shift=1022 tid="0x1f6960c00" timestamp=1721598478
```
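For reference, a request can ask for a larger window via options.num_ctx. A sketch (this only raises the default 2048 shown in the log, it doesn't remove the limit, and larger windows use more memory):

```ts
// Sketch: raising the context window per request with Ollama's options.num_ctx.
const previousContext: number[] = []; // `context` array returned by the previous completion
const res = await fetch('http://localhost:11434/api/generate', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    model: 'llama3',
    prompt: 'Continue the conversation',
    context: previousContext,
    options: { num_ctx: 4096 }, // larger than the n_ctx=2048 shown in the log above
    stream: true
  })
});
```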

fmaclen added a commit that referenced this issue Jul 21, 2024
- All errors can now be retried
- Adds a hint to explain the behavior in #103
fmaclen (Owner) commented Jul 21, 2024

I just added a hint in the UI about the potential cause of the error:

[Screenshot: localhost_5174_sessions_6fpqqn]

A more permanent solution to this issue could be to switch to the /api/chat endpoint (instead of /api/generate), which probably fails more gracefully. But if this error really is only caused by exceeding the tokens in the context window, we can probably do other things that would be more useful, such as #54 and #7.
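Roughly, the two request shapes differ like this (simplified sketch, not the exact payloads we send):

```ts
// /api/generate: a single prompt plus the opaque `context` token array from the previous turn.
const generateBody = {
  model: 'llama3',
  prompt: 'What did I ask you three messages ago?',
  context: [128006, 882], // tokens returned by the previous response (example values)
  stream: true
};

// /api/chat: the full message history; the server decides what still fits in the window.
const chatBody = {
  model: 'llama3',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'What did I ask you three messages ago?' }
  ],
  stream: true
};
```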

fmaclen (Owner) commented Jul 22, 2024

I pushed a quick-and-dirty implementation using the /api/chat endpoint:
https://api-chat-endpoint.hollama.pages.dev/

Here are some early findings:

  • Appears to avoid the parsing error in long sessions because it has a rolling context window that ignores earlier messages.
  • The system prompt does not appear to be ignored in long sessions 👍.
  • This endpoint doesn't return the tokens used as a number[]; instead it returns totals such as prompt_eval_count, eval_count and total_duration. We should still be able to implement Show session token count #7 and Show response tokens per second rate #8 from these values (see the sketch below).
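For example, #7 and #8 could be derived from the final streamed chunk roughly like this (a sketch; eval_duration is a nanosecond duration per the Ollama docs):

```ts
// Sketch: deriving #7 and #8 from the fields of the final streamed chunk.
interface FinalChunkStats {
  prompt_eval_count: number; // tokens in the prompt
  eval_count: number; // tokens in the generated response
  eval_duration: number; // time spent generating, in nanoseconds
}

function tokenStats(chunk: FinalChunkStats) {
  return {
    sessionTokens: chunk.prompt_eval_count + chunk.eval_count, // "Show session token count" (#7)
    tokensPerSecond: chunk.eval_count / (chunk.eval_duration / 1e9) // "tokens per second rate" (#8)
  };
}
```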

I'm sort of on the fence on this one: it's nice to be able to trust Ollama to always give us valid JSON responses, but not knowing when earlier messages stop being part of the current context feels like a downgrade.

I feel like I'd rather see a "nicer" version of the SyntaxError that prompts the user to "Summarize the session" (#54) and use that as the new context for the current session (or a new one).

binarynoise (Contributor, Author) commented Jul 26, 2024

The chat version works great, until you reference something from three pages ago.
I'm wondering if there isn't any way for either variant to progressively condense the most important information into the context instead of keeping everything. I have the feeling that summarizing and starting over could be destructive to the flow of the conversation.
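Just to make the idea concrete, a progressive condensation could look roughly like this (entirely hypothetical; the message shape and the summarize helper are made up):

```ts
// Hypothetical sketch: fold the oldest messages into a running summary instead of dropping them.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

async function condenseIfNeeded(
  messages: ChatMessage[],
  summarize: (text: string) => Promise<string>, // e.g. a separate /api/chat call
  keepRecent = 8
): Promise<ChatMessage[]> {
  if (messages.length <= keepRecent) return messages;

  const older = messages.slice(0, messages.length - keepRecent);
  const recent = messages.slice(messages.length - keepRecent);
  const summary = await summarize(older.map((m) => `${m.role}: ${m.content}`).join('\n'));

  return [{ role: 'system', content: `Summary of the earlier conversation:\n${summary}` }, ...recent];
}
```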

fmaclen (Owner) commented Jul 27, 2024

I'm not entirely sure what would be the best way to conserve the overall context as much as possible.

But the recent updates to Ollama (tool calling and web scraping) are only available on the /api/chat endpoint, so it's probably worth upgrading to it, closing this issue, and figuring out the context window issue separately.

@binarynoise Do you feel like you often exceed the context window limit in normal use, or only occasionally?

fmaclen (Owner) commented Jul 29, 2024

🎉 This issue has been resolved in version 0.7.8 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀
