
Fetch times out on low end devices serving OLLAMA #612

Open
ManuXD32 opened this issue Jul 31, 2024 · 10 comments
Labels
feature-cool (Distinctive features), type: bug (Something isn't working)

Comments

@ManuXD32

ManuXD32 commented Jul 31, 2024

Description

big-AGI uses fetch to retrieve API responses. The timeout is around 5 minutes, so when a low-end device like a phone serves Ollama and has to process a large context, the request simply times out.

Device and browser

Android Redmagic 9 pro serving ollama and big-AGI, browser: Brave

Screenshots and more

No response

Willingness to Contribute

  • 🙋‍♂️ Yes, I would like to contribute a fix.
ManuXD32 added the type: bug label Jul 31, 2024
@enricoros
Owner

Thanks @ManuXD32 - we have a fully new networking stack for Big-AGI 2 which in principle would allow sending ping packets to keep the connection alive. This may help; we'd need to test it.
Could you explain more: is this for a streaming text generation request, and is it the initial fetch operation? Where is Ollama running, and where are the big-AGI server and client running?
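
For context, a minimal sketch of what such keep-alive pings could look like on an SSE-style response stream (illustrative only - this is not the Big-AGI 2 networking stack, and the function name is hypothetical):

  // While waiting for the upstream model to produce output, periodically write a
  // heartbeat frame so idle connections are not dropped by browsers or proxies.
  function startKeepAlive(controller: ReadableStreamDefaultController<Uint8Array>, intervalMs = 15_000): () => void {
    const encoder = new TextEncoder();
    const timer = setInterval(() => {
      // SSE comment lines (leading ':') are ignored by standard event-stream parsers
      controller.enqueue(encoder.encode(': keep-alive\n\n'));
    }, intervalMs);
    return () => clearInterval(timer);
  }

Pings like this only keep the client-to-server stream alive; the timeout on the server's own fetch to Ollama is a separate limit.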

@ManuXD32
Author

> Thanks @ManuXD32 - we have a fully new networking stack for Big-AGI 2 which in principle would allow sending ping packets to keep the connection alive. This may help; we'd need to test it. Could you explain more: is this for a streaming text generation request, and is it the initial fetch operation? Where is Ollama running, and where are the big-AGI server and client running?

Ollama and big-AGI (server and client) are running on the phone.
It is for streaming text generation, and it happens with the first fetch operation if the context is very large, and later in the conversation if it becomes too long (it takes longer to answer because the model needs to process more tokens).

Thank you so much for your answer and your work; I'm really loving it and truly enjoy running it on my server :)

@enricoros
Owner

@ManuXD32 Oh wow thanks for the answer. How did you manage to run big-AGI fully on the phone? I want to try that out.
For the timeout, I believe I could solve it by sending pings.

@ManuXD32
Author

ManuXD32 commented Aug 1, 2024

> @ManuXD32 Oh wow thanks for the answer. How did you manage to run big-AGI fully on the phone? I want to try that out. For the timeout, I believe I could solve it by sending pings.

Cool, I think ping is a nice approach.

Here is a little guide on how to install it on Android. Honestly, it is so straightforward that I thought it was already supported and I just didn't know it hahahah.

I used proot-distro and Termux:

  1. Install proot-distro:
     apt update && apt upgrade -y && apt install proot-distro -y
  2. Install the Ubuntu distro:
     pd install --override-alias agi ubuntu
  3. Enter the distro:
     pd login agi
  4. Install packages and clone the repo:
     apt update && apt upgrade -y && apt install nodejs -y
     - use git clone to clone the repo and navigate there with cd
  5. Follow the setup guide:
     npm install
     npm run build
     npx next start --port 3000

enricoros added the feature-cool label and removed the requested-info label Aug 1, 2024
@michieal

@enricoros Hi there! This seems to be an issue when running on local hardware - i.e., not just phones. Maybe add a setting to change the timeouts?
I've experienced this issue with both LocalAI and Ollama: when the conversation history becomes larger (about 4k tokens), my Ryzen 7 (5900) takes too long to process the context and input. Neither ever gets close to the context window of 16k tokens.
[Service Issue] Ollama: fetch failed · {"name":"HeadersTimeoutError","code":"UND_ERR_HEADERS_TIMEOUT","message":"Headers Timeout Error"} [DEV_URL: http://127.0.0.1:11434/api/chat] is the error message that I receive, but looking at System Monitor (Ubuntu/Plasma) I can tell that both Ollama and LocalAI are still processing the request.
Now, if I delete messages out of the conversation history in the chat, it works just fine.

This tells me, as a fellow developer (appreciate your work, btw!), that the interface (big-AGI) is timing out and reporting an error before the AI programs have finished processing the input stream.

Additional info: I have had this issue with v1 Dev, v1 Stable, and v2 Dev.
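
On the "setting to change the timeouts" idea above: one lightweight way to expose it server-side would be an environment variable with a sane default. A minimal sketch; the variable name is hypothetical and not an existing big-AGI option:

  // Hypothetical setting: upstream fetch timeout in milliseconds.
  // Falls back to undici's default of 5 minutes (300,000 ms) when unset.
  const upstreamTimeoutMs = Number(process.env.UPSTREAM_FETCH_TIMEOUT_MS || 5 * 60 * 1000);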

@enricoros
Owner

Hi @michieal was this with the latest v2-dev branch?

It's the backend part of big-AGI (Node.js) timing out after 5 minutes of not receiving the headers of the HTTP request. This happens deep inside Node.js's network library (undici): nodejs/node#46375

Some people say it can be fixed, others say it can't. I'd welcome a patch that raises the network timeouts of the upstream fetch operation (src/modules/aix library, search for "fetch(").
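
For reference, undici's default headersTimeout is 300,000 ms, which matches the 5-minute failure described here. A minimal sketch of raising the limits process-wide with the standalone undici package (untested against big-AGI; whether Node's built-in fetch picks up this dispatcher can depend on the Node and undici versions in play):

  import { Agent, setGlobalDispatcher } from 'undici';

  // Raise undici's header/body timeouts well above the 5-minute default,
  // so slow local backends (Ollama, LocalAI) have time to start responding.
  setGlobalDispatcher(new Agent({
    headersTimeout: 15 * 60 * 1000,
    bodyTimeout: 15 * 60 * 1000,
  }));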

@michieal

> Hi @michieal was this with the latest v2-dev branch?

The News page says "Big-AGI has been updated to version 1.16.8", though when I used git to grab it, I swear I was grabbing the v2-dev branch.

> It's the backend part of big-AGI (Node.js) timing out after 5 minutes of not receiving the headers of the HTTP request. This happens deep inside Node.js's network library (undici): nodejs/node#46375
>
> Some people say it can be fixed, others say it can't. I'd welcome a patch that raises the network timeouts of the upstream fetch operation (src/modules/aix library, search for "fetch(").

I would love to be able to create a patch for this... but it would be me blindly following what the AI said. I'm more of a desktop application / game development guy. I'll look at the code, but... I am probably gonna look like a monkey scratching his head. 🤣

@michieal

It looks like that fetch could be changed... Since it has a hard-coded timeout, maybe change it to use something that allows specifying a timeout. I saw AbortSignal (instead of AbortController) in the referenced issues, and when I asked the DeepSeek Coder AI how to fix it, it kept telling me to use AbortController to manage the timeout. Still, it's pretty Greek to me, so I don't know how to fix it... hopefully you will know what to do with this information.
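
For what it's worth, the two mechanisms mentioned above do different things: an AbortController/AbortSignal can only cancel a request sooner than it would otherwise fail, while the 5-minute headers timeout that fires here is an option on undici's Agent (the dispatcher). An illustrative sketch, not big-AGI code; the URL and request body are placeholders:

  import { Agent, fetch } from 'undici';

  const response = await fetch('http://127.0.0.1:11434/api/chat', {
    method: 'POST',
    body: JSON.stringify({ model: 'llama3', messages: [], stream: true }),
    // an AbortSignal only makes the request give up earlier (here, after 20 minutes)...
    signal: AbortSignal.timeout(20 * 60 * 1000),
    // ...while UND_ERR_HEADERS_TIMEOUT is governed by the Agent's own limits
    dispatcher: new Agent({ headersTimeout: 15 * 60 * 1000, bodyTimeout: 15 * 60 * 1000 }),
  });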

@enricoros
Owner

@michieal I took a quick look (trpc.router.fetchers.ts line 74 "response = await fetch(url, request);")

This is a quick patch I put together:

Index: src/server/trpc/trpc.router.fetchers.ts
===================================================================
diff --git a/src/server/trpc/trpc.router.fetchers.ts b/src/server/trpc/trpc.router.fetchers.ts
--- a/src/server/trpc/trpc.router.fetchers.ts	(revision 79c71a174088449776fef140d6841a444bca80dc)
+++ b/src/server/trpc/trpc.router.fetchers.ts	(date 1734598201566)
@@ -2,6 +2,7 @@
 
 import { debugGenerateCurlCommand, safeErrorString, SERVER_DEBUG_WIRE } from '~/server/wire';
 
+import { Agent as UndiciAgent, fetch, RequestInit, Response } from 'undici';
 
 //
 // NOTE: This file is used in the server-side code, and not in the client-side code.
@@ -13,7 +14,7 @@
 
 // JSON fetcher
 export async function fetchJsonOrTRPCThrow<TOut extends object = object, TBody extends object | undefined = undefined>(config: RequestConfig<TBody>): Promise<TOut> {
-  return _fetchFromTRPC<TBody, TOut>(config, async (response) => await response.json(), 'json');
+  return _fetchFromTRPC<TBody, TOut>(config, async (response) => await response.json() as Promise<TOut>, 'json');
 }
 
 // Text fetcher
@@ -68,6 +69,11 @@
       headers: headers !== undefined ? headers : undefined,
       body: body !== undefined ? JSON.stringify(body) : undefined,
       signal: signal !== undefined ? signal : undefined,
+      dispatcher: new UndiciAgent({
+        connectTimeout: 15 * 60 * 1000,   // 15 min
+        headersTimeout: 15 * 60 * 1000,   // 15 min
+        bodyTimeout: 15 * 60 * 1000,      // 15 min
+      }),
     };
 
     // upstream fetch

Unfortunately I don't have time to test this thoroughly for now (as it impacts 15 model providers and more functions; this is core logic). The patch could also break in some other types of conversations.

Leaving this here for reference, but for now I think we'll need to tolerate the 5 min timeout unless someone else wants to take a stab at this.
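
A possible refinement to the patch above, should someone pick it up: hoist the Agent out of the per-request init so a single connection pool (with the raised timeouts) is reused across calls, instead of a new Agent being constructed on every fetch. Untested sketch:

  // Module-scope agent shared by all upstream fetches in this file.
  const longTimeoutAgent = new UndiciAgent({
    connectTimeout: 15 * 60 * 1000,   // 15 min
    headersTimeout: 15 * 60 * 1000,   // 15 min
    bodyTimeout: 15 * 60 * 1000,      // 15 min
  });

  // ...and in the request init, reference it rather than constructing a new one:
  //   dispatcher: longTimeoutAgent,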

@michieal

michieal commented Dec 20, 2024

@enricoros How do I install the support for Undici? When I made the code changes, it gave me an error that it wasn't found... I figured that the least I could do was to try this out and test it for Ollama and LocalAI for you.

@ everyone - if you make a PR for this, tag me in the comments so that I can help test it. TIA!
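
For reference on the "wasn't found" error above: undici is published as a regular npm package, so the import added by the patch needs it installed as a dependency first (run in the repo root before rebuilding):

  npm install undici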
