Thread pool for Linux file I/O a bad idea? #28

bminer · 2014-12-03T19:54:15Z

I was just thinking about this... maybe the thread pool in libuv isn't the answer for file IO and DNS lookup operations in Linux.

@indutny - As you mentioned here (joyent/libuv#1181), the thread pool can fill with requests quite easily, especially when getaddrinfo is being called during periods of poor network connectivity. Suddenly, applications using libuv (e.g. Node) can become completely unresponsive. If the application needs to have somewhat real-time access to other I/O (e.g. reading/writing a file before a certain amount of time expires), the wheels start to fall off.

Consider an HTTP web server that's about to experience some network difficulty, for example. Suppose you enforce a time limit for requests (even as little as 100 ms) to serve mostly static HTML files. If you have other HTTP requests that perform DNS lookups (e.g. your web server also serves as a proxy), you can potentially fill the thread pool's queue with getaddrinfo requests and cause a lot of HTTP requests to fail. That is, since the queue is filled with slow DNS requests, the server can't even read the static HTML files from the disk and serve its other requests. Timeouts occur, and the server becomes unresponsive.
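To make the scenario above concrete, here is a rough simulation, not libuv code. It assumes a shared pool of 4 workers (the size mentioned later in this thread) and uses `time.sleep` as a stand-in for a stalled getaddrinfo call. The 200 ms lookup latency, the backlog of 8 lookups, and all function names are made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 4          # shared pool size, as discussed in this thread
SLOW_LOOKUP_S = 0.2    # assumed latency of a stalled DNS lookup
PENDING_LOOKUPS = 8    # assumed backlog of lookups queued ahead of the read


def slow_dns_lookup():
    """Stand-in for a getaddrinfo call that stalls on a bad network."""
    time.sleep(SLOW_LOOKUP_S)


def fast_file_read():
    """Stand-in for a quick static-file read; reports when it actually ran."""
    return time.monotonic()


def file_read_delay_shared_pool():
    """Queue one file read behind a backlog of lookups on a single pool."""
    with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
        for _ in range(PENDING_LOOKUPS):
            pool.submit(slow_dns_lookup)
        submitted = time.monotonic()
        started = pool.submit(fast_file_read).result()
    return started - submitted


if __name__ == "__main__":
    delay = file_read_delay_shared_pool()
    print(f"file read waited {delay * 1000:.0f} ms behind DNS lookups")
```

With these numbers the read cannot start until two rounds of lookups have drained, so a 100 ms request deadline would already be missed even though the disk itself is idle.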

Also consider an application that reads data from a serial port and sends the data to a remote server. If DNS requests to the remote server clog up the thread pool, the serial port read operation might not happen in time (the OS internal buffers could start to fill up and overflow).

That said, I understand that non-blocking file I/O in Linux is a pain in the rear. I'd be happy to take this conversation offline, but are there any thoughts on moving forward?

bminer · 2014-12-03T20:12:12Z

Alternatively, maybe file I/O and DNS lookups should be on separate thread pools? Or, DNS lookups should only use max. 3 of the 4 threads, leaving at least 1 for someone else?
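As a sketch of the first idea (separate pools), again simulated in Python rather than written against libuv: the pool names, sizes, and the 200 ms sleep standing in for a slow lookup are all assumptions for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor

SLOW_LOOKUP_S = 0.2    # assumed latency of a stalled DNS lookup
PENDING_LOOKUPS = 8    # assumed backlog of lookups


def file_read_delay_separate_pools():
    """Saturate a DNS-only pool, then time a file read on its own pool."""
    with ThreadPoolExecutor(max_workers=4) as dns_pool, \
         ThreadPoolExecutor(max_workers=4) as fs_pool:
        for _ in range(PENDING_LOOKUPS):
            dns_pool.submit(time.sleep, SLOW_LOOKUP_S)
        submitted = time.monotonic()
        # The file read never queues behind DNS work: it has its own workers.
        started = fs_pool.submit(time.monotonic).result()
    return started - submitted


if __name__ == "__main__":
    delay = file_read_delay_separate_pools()
    print(f"file read waited {delay * 1000:.0f} ms")
```

The DNS backlog still takes just as long to drain; the only change is that file work no longer waits behind it.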

As this guy aptly points out, using non-blocking file I/O over blocking file I/O doesn't really affect responsiveness. If the disk is running slow, your app is going to feel unresponsive. The problem here is mixing other I/O on the same thread pool.

Does that make sense? Thoughts?

txdv · 2014-12-03T20:19:48Z

DNS lookups can be implemented on top of libuv's UDP implementation, and that will be completely scalable. If you rely on a lot of DNS lookups, you should consider just using a C lib which implements a reasonable API for libuv to consume. An example is here http://25thandClement.com/~william/projects/libdns.git
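For a sense of why this works without a thread pool: a basic DNS query is a single small datagram, so it can be sent over a non-blocking UDP socket and matched to its reply by ID. Below is a minimal sketch of encoding such a query in the RFC 1035 wire format. The function name and the default query ID are made up, and sending the packet, retries, and parsing the response are all left out.

```python
import struct


def build_dns_query(hostname, query_id=0x1234):
    """Encode a single-question DNS query for an A record."""
    # Header: ID, flags (only "recursion desired" set), QDCOUNT=1,
    # then ANCOUNT, NSCOUNT and ARCOUNT all zero.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is length-prefixed; a zero byte ends the name.
    labels = hostname.rstrip(".").encode("ascii").split(b".")
    qname = b"".join(bytes([len(label)]) + label for label in labels) + b"\x00"
    # QTYPE=1 (A record), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


if __name__ == "__main__":
    packet = build_dns_query("example.com")
    print(len(packet), packet.hex())
```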

However, when it comes to file I/O, we either need an OS mechanism to deal with it or we are left with a thread pool making blocking calls. We currently use a thread pool on Linux and Unix because what the OS provides by default is not sufficient. With Linux AIO, there is still some concern that the operations block deep down; that is, the current implementation is not yet truly non-blocking, because nobody has implemented it the right way.

bminer · 2014-12-03T21:31:46Z

@txdv - Node.js, for example, utilizes both c-ares and getaddrinfo for DNS lookups. I (and the Node team) prefer the getaddrinfo route because it utilizes the OS resolver (and considers things like /etc/resolv.conf in the Linux world). Deciding how to implement DNS is not the point here.

My point is that getaddrinfo is a blocking call. So are file I/O operations. Therefore, we have the thread pool with blocking calls.
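The same pattern, shown with Python's asyncio instead of libuv purely as an illustration: the blocking `socket.getaddrinfo` call is handed to a worker thread so the event loop is free to do other things. The wrapper name `resolve` is made up, and a numeric address is used so the example does not depend on a working resolver.

```python
import asyncio
import socket


async def resolve(host, port):
    """Run the blocking getaddrinfo call on the loop's default thread pool."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, socket.getaddrinfo, host, port)


if __name__ == "__main__":
    infos = asyncio.run(resolve("127.0.0.1", 80))
    print(infos[0][4])
```

The event loop stays responsive, but every in-flight lookup still occupies one pool thread for as long as it blocks, which is exactly where the contention comes from.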

This brings me to my last comment... why not have separate thread pools? DNS/network and filesystem resources are two separate resources. Why not have their blocking I/O calls in separate thread pools? Or, perhaps, some hacky-hybrid solution of limiting the number of threads that a given resource-specific I/O request can use (e.g. getaddrinfo calls can only use up 2 of the 4 threads in the pool)?
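A rough sketch of the "hacky-hybrid" cap, simulated in Python rather than in libuv: DNS work goes through a small gatekeeper that keeps at most 2 lookups on the shared pool at once and holds the rest in its own queue. `CappedSubmitter` and everything else here is hypothetical, and `time.sleep` stands in for a slow lookup.

```python
import threading
import time
from collections import deque
from concurrent.futures import Future, ThreadPoolExecutor


class CappedSubmitter:
    """Lets one class of work occupy at most `limit` workers of a shared pool."""

    def __init__(self, pool, limit):
        self._pool = pool
        self._limit = limit
        self._in_flight = 0
        self._waiting = deque()
        self._lock = threading.Lock()

    def submit(self, fn, *args):
        outer = Future()
        with self._lock:
            self._waiting.append((outer, fn, args))
        self._dispatch()
        return outer

    def _dispatch(self):
        # Hand queued work to the pool only while we are under the cap.
        while True:
            with self._lock:
                if self._in_flight >= self._limit or not self._waiting:
                    return
                outer, fn, args = self._waiting.popleft()
                self._in_flight += 1
            self._pool.submit(self._run, outer, fn, args)

    def _run(self, outer, fn, args):
        try:
            outer.set_result(fn(*args))
        except Exception as exc:
            outer.set_exception(exc)
        finally:
            with self._lock:
                self._in_flight -= 1
            self._dispatch()  # a slot freed up; start the next waiter, if any


def file_read_delay_with_cap():
    """Time a file read on a 4-thread pool while capped lookups are backed up."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        dns = CappedSubmitter(pool, limit=2)
        lookups = [dns.submit(time.sleep, 0.2) for _ in range(8)]
        submitted = time.monotonic()
        started = pool.submit(time.monotonic).result()
        for lookup in lookups:
            lookup.result()  # drain the lookups before the pool shuts down
    return started - submitted


if __name__ == "__main__":
    print(f"file read waited {file_read_delay_with_cap() * 1000:.0f} ms")
```

The trade-off is that DNS throughput drops when the network is slow, since lookups wait in the gatekeeper's queue even if pool threads are idle.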

I also agree that the alternatives (e.g. Linux AIO) are still concerning... and would require some significant work. What I propose about using separate thread pools should be relatively simple, right?

bminer · 2014-12-03T21:32:42Z

Maybe I should rename the title of this issue? :)

txdv · 2014-12-03T22:54:59Z

Having separate thread pools is already proposed:

#7

https://github.com/saghul/leps/blob/768747269f445f519ff75fd72f49f5661f29d5fb/XXX-threadpool-handle.md

It even lets you select which thread pool you want your specific request to run in.

Furthermore, reading resolv.conf is not a hard part - glibc implements it simply ... by just reading it. There is no kernel level magic behind it.
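To illustrate how little is involved in the basic case, here is a minimal sketch that pulls the `nameserver` entries out of resolv.conf-style text. It is not what glibc does internally; it ignores `search`, `options`, and every other directive, and the function name is made up.

```python
def parse_nameservers(resolv_conf_text):
    """Return the addresses on `nameserver` lines, in file order."""
    servers = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comment lines (which start with '#' or ';').
        if not stripped or stripped[0] in "#;":
            continue
        fields = stripped.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


if __name__ == "__main__":
    sample = "# example\nsearch example.org\nnameserver 192.0.2.1\nnameserver 192.0.2.2\n"
    print(parse_nameservers(sample))
```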

But yeah, what you propose has already been thought through by @saghul

bminer · 2014-12-03T23:52:03Z

Glad to hear that it's already in the works... or at least that a proposal to fix the problem is in the works.

When will it land downstream in Node.js? (just kidding... sort of...)

saghul · 2014-12-04T08:29:46Z

Hi @bminer, the threadpool has always been a bit of a problem. Your reasoning is sound: it's small and used for too many things. Some solutions were attempted in the past, but I have now taken a different approach in libuv/leps#4.

It's not final yet, but I hope we can land something following the model proposed there: have a handle which can do the work and push the responsibility to the user. I believe it's just not possible for us to cover all possible cases by trying to be smart in libuv, so we pass the burden to the user :-) You want a huge threadpool for file ops? Fine! You want a different one for getaddrinfo / getnameinfo? Cool too! So, I think everyone wins if we get it right.

I'm going to close this; please join the discussion in libuv/leps#4 if you have feedback on the proposal.

PS: Hopefully we will also have a good story for serial devices sooner rather than later: #19

saghul closed this as completed Dec 4, 2014

brizzbane mentioned this issue Mar 11, 2016

async.c:149: uv__async_io: Assertion `n == sizeof(val)' failed #401

Closed

davisjam mentioned this issue Aug 27, 2018

Threadpool meta-issue #1959

Closed

vtjnash referenced this issue in JuliaLang/libuv Apr 2, 2019

Merge pull request #28 from nwh/julia-uv0.11.26

abcbb0c

remove -xnolibs dtrace flag

tkelman mentioned this issue Apr 2, 2019

Fix test failures from #28, turn on Travis here JuliaLang/libuv#1

Closed

leiless mentioned this issue Jan 20, 2020

stack smashing detected / segmentation fault libuv/help#125

Closed
