Session breaks upon parsing context #103
Or just save the returned context for each message, to be able to start over at a given point when the model starts generating nonsense.
Thanks, I was able to replicate the issue. I'm not entirely sure yet but I suspect is because we are trying to format the completion in "small chunks" as Ollama streams them and one of the chunks is causing the parser to break.
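For illustration, here's a minimal sketch of the kind of line buffering that avoids the problem (the names and shape are my own, not Hollama's actual code): Ollama streams newline-delimited JSON, so `JSON.parse` should only ever see complete lines, never a raw network chunk.

```typescript
// Sketch: only JSON.parse complete NDJSON lines. A fetch() chunk can end
// mid-object, so partial data stays in `buffer` until its newline arrives.
async function* parseNdjsonStream(body: ReadableStream<Uint8Array>) {
	const reader = body.getReader();
	const decoder = new TextDecoder();
	let buffer = '';

	while (true) {
		const { done, value } = await reader.read();
		if (done) break;
		buffer += decoder.decode(value, { stream: true });

		// Everything before the last newline is a complete line; keep the rest.
		const lines = buffer.split('\n');
		buffer = lines.pop() ?? '';
		for (const line of lines) {
			if (line.trim()) yield JSON.parse(line);
		}
	}
	if (buffer.trim()) yield JSON.parse(buffer); // flush the final line
}
```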
Yeah, I wanted to implement something like this so messages can be retried or even edited. #9
Here's what I think is happening: when we catch an error during a completion (any kind of error), the prompt is reset so the user can try again, but when the user re-submits the prompt, the context array with all of the tokens is lost.
I don't think we can predict when the JSON parsing will fail, but whenever there is any kind of error, the UI should hint at a clear path to fix it (if possible). In a large number of cases I expect simply "retrying" should fix most issues: #9 (comment)
#9 doesn't help at all for this problem. Regenerating the failing message, or even an earlier one, will cause the same error again.
@binarynoise true, #103 won't be fixed by #106. What I meant is that in order to close this issue we want to make sure you can indeed click "Retry" to continue the session as if the error never happened. |
I can now reliably get this error:
I think this is caused by exceeding the limit of the context window on the model, which causes the completion to get truncated (which breaks the JSON parser). This is what I see in the Ollama logs when I retry the failed message:

```
[GIN] 2024/07/21 - 17:47:48 | 200 | 1.964507833s | ::1 | POST "/api/generate"
INFO [update_slots] input truncated | n_ctx=2048 n_erase=2767 n_keep=4 n_left=2044 n_shift=1022 tid="0x1f6960c00" timestamp=1721598478
```
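(Side note: the `n_ctx=2048` in that log is the default context window; Ollama lets a request ask for a larger one through the `options` field. A minimal sketch, with the model name as a placeholder:)

```typescript
// Sketch: request a larger context window for a single completion.
// A bigger num_ctx postpones truncation at the cost of more memory.
await fetch('http://localhost:11434/api/generate', {
	method: 'POST',
	body: JSON.stringify({
		model: 'llama3', // placeholder model name
		prompt: '...',
		options: { num_ctx: 8192 } // default is 2048, as seen in the log above
	})
});
```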
- All errors can now be retried
- Adds a hint to explain the behavior in #103
I just added a hint in the UI about the potential cause of the error. A more permanent solution to this issue could be to switch to the chat endpoint.
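For context, here is roughly how the two request shapes differ (a sketch against a local Ollama; `previousResponse` and the message contents are hypothetical): `/api/generate` threads conversation state through an opaque `context` token array that the client must round-trip, while `/api/chat` re-sends the whole message history instead.

```typescript
// Sketch of the two Ollama request shapes (illustrative values throughout).
declare const previousResponse: { context: number[] }; // from the last reply

// generate: the client must send back the context array it received last time
await fetch('http://localhost:11434/api/generate', {
	method: 'POST',
	body: JSON.stringify({
		model: 'llama3',
		prompt: 'And how does that work in Svelte?',
		context: previousResponse.context
	})
});

// chat: no context array; the server rebuilds state from `messages`
await fetch('http://localhost:11434/api/chat', {
	method: 'POST',
	body: JSON.stringify({
		model: 'llama3',
		messages: [
			{ role: 'user', content: 'How do I bind an input value in React?' },
			{ role: 'assistant', content: 'Use a controlled component: ...' },
			{ role: 'user', content: 'And how does that work in Svelte?' }
		]
	})
});
```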
I pushed a quick-and-dirty implementation using the chat endpoint. Here are some early findings:
I'm sort of on the fence on this one: it's nice to be able to trust Ollama to always give us valid JSON responses, but not knowing when earlier messages stop being part of the current context feels like a downgrade. I feel like I'd rather see a "nicer" version of the
The chat version works great – until you reference something from three pages ago. |
I'm not entirely sure what would be the best way to conserve the overall context as much as possible. But the recent updates to Ollama (tool and web scraping) are only available on the chat endpoint.

@binarynoise Do you feel like you often exceed the context window limit in normal use? Or only occasionally?
🎉 This issue has been resolved in version 0.7.8 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀
This is independent from the knowledge-as-system-prompt experiment.
Here is the network traffic for that session as a HAR file: localhost_Archive [24-07-20 17-45-41].har.json. You can load it into Firefox or Chrome to see the details.
As far as I can tell, ollama does send a valid response (in contrast to #97), but hollama for some reason fails to parse the last line with the context. The x varies wildly around 30k. It happens around the 5th or 6th response for me, with different models and prompts.

As a consequence, the conversation's context is lost and a new conversation begins on retrying the last message. Sometimes not everything gets lost, but the conversation is broken anyway.
As a workaround, would it be possible to hold onto the last valid context and use that when retrying the message?
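Something like this, perhaps (a sketch with made-up names, assuming each successfully parsed message keeps the `context` array it came back with):

```typescript
// Sketch of the workaround: remember the context returned with each completed
// message so a retry can resume from the last good state instead of starting
// a fresh conversation. Names are illustrative, not Hollama's actual code.
interface ChatMessage {
	role: 'user' | 'assistant';
	content: string;
	context?: number[]; // token ids from /api/generate, if parsing succeeded
}

function contextForRetry(messages: ChatMessage[]): number[] | undefined {
	// Walk backwards to the most recent message whose context survived parsing.
	for (let i = messages.length - 1; i >= 0; i--) {
		const ctx = messages[i].context;
		if (ctx && ctx.length > 0) return ctx;
	}
	return undefined; // no valid context yet: the retry starts a new session
}
```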
Switching to another conversation and coming back to retry does actually trigger #97.