Thread pool for Linux file I/O a bad idea? #28

Closed
bminer opened this issue Dec 3, 2014 · 7 comments

Comments

@bminer

bminer commented Dec 3, 2014

I was just thinking about this... maybe the thread pool in libuv isn't the answer for file IO and DNS lookup operations in Linux.

@indutny - As you mentioned here (joyent/libuv#1181), the thread pool can fill with requests quite easily, especially when getaddrinfo is being called during periods of poor network connectivity. Suddenly, applications using libuv (e.g. Node) can become completely unresponsive. If the application needs somewhat real-time access to other I/O (e.g. reading/writing a file before a certain amount of time expires), the wheels start to fall off.

Consider an HTTP web server that's about to experience some network difficulty, for example. Suppose you enforce a time limit for requests (even as little as 100 ms) to serve mostly static HTML files. If you have other HTTP requests that perform DNS lookups (e.g. your web server also serves as a proxy), you can potentially fill the thread pool's queue with getaddrinfo requests and cause a lot of HTTP requests to fail. That is, since every worker thread is busy with a slow DNS request, the server can't even read the static HTML files from the disk to serve its other requests. Timeouts occur, and the server becomes unresponsive.
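
To make the failure mode above concrete, here is a minimal sketch in Python rather than libuv's C API. Everything in it is an assumption made for illustration: `slow_dns_lookup` and `read_static_file` are hypothetical stand-ins for a stalled getaddrinfo call and a fast disk read, and the 4-worker `ThreadPoolExecutor` stands in for the shared pool discussed in this thread.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, TimeoutError

POOL_SIZE = 4  # the pool size mentioned in this thread


def slow_dns_lookup(release):
    # Hypothetical stand-in for getaddrinfo on a bad network:
    # it occupies a worker thread until `release` is set.
    release.wait()
    return "resolved"


def read_static_file():
    # Hypothetical stand-in for a fast read of a static HTML file.
    return "<html>ok</html>"


def demo_shared_pool():
    release = threading.Event()
    with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
        # Four stalled lookups occupy all four workers.
        lookups = [pool.submit(slow_dns_lookup, release) for _ in range(POOL_SIZE)]
        # The file read is queued behind them and cannot start.
        file_future = pool.submit(read_static_file)
        try:
            file_future.result(timeout=0.1)  # the 100 ms request budget
            timed_out = False
        except TimeoutError:
            timed_out = True
        release.set()  # the network recovers
        body = file_future.result()
        for lookup in lookups:
            lookup.result()
    return timed_out, body
```

Calling `demo_shared_pool()` reports that the file read missed its 100 ms budget even though the read itself is instant; it only completes once the lookups release their workers.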

Also consider an application that reads data from a serial port and sends the data to a remote server. If DNS requests to the remote server clog up the thread pool, the serial port read operation might not happen in time (the OS internal buffers could start to fill up and overflow).

That said, I understand that non-blocking file I/O in Linux is a pain in the rear. I'd be happy to take this conversation offline, but are there any thoughts on moving forward?

@bminer
Author

bminer commented Dec 3, 2014

Alternatively, maybe file I/O and DNS lookups should be on separate thread pools? Or, DNS lookups should only use max. 3 of the 4 threads, leaving at least 1 for someone else?
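
One way the "at most 3 of the 4 threads" idea could look is sketched below in Python. This is not how libuv works; `CappedSubmitter` is a hypothetical helper invented for this example. The key assumption is that the cap is enforced before work reaches the pool: excess lookups wait in a side queue instead of blocking inside a worker, so the remaining thread stays free for other work.

```python
import threading
from collections import deque
from concurrent.futures import Future, ThreadPoolExecutor


class CappedSubmitter:
    """Hypothetical helper: one category of work may occupy at most
    `limit` workers of a shared pool; the rest waits in a side queue."""

    def __init__(self, pool, limit):
        self._pool = pool
        self._limit = limit
        self._running = 0
        self._waiting = deque()
        self._lock = threading.Lock()

    def submit(self, fn, *args):
        outer = Future()
        with self._lock:
            self._waiting.append((outer, fn, args))
        self._pump()
        return outer

    def _pump(self):
        # Hand queued tasks to the pool while we are under the cap.
        while True:
            with self._lock:
                if self._running >= self._limit or not self._waiting:
                    return
                outer, fn, args = self._waiting.popleft()
                self._running += 1
            self._pool.submit(self._run, outer, fn, args)

    def _run(self, outer, fn, args):
        try:
            outer.set_result(fn(*args))
        except Exception as exc:
            outer.set_exception(exc)
        finally:
            with self._lock:
                self._running -= 1
            self._pump()  # a slot is free, start the next queued task


def demo_cap():
    release = threading.Event()
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def slow_dns_lookup():
        # Stand-in for a stalled getaddrinfo; records peak concurrency.
        with lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        release.wait()
        with lock:
            state["active"] -= 1
        return "resolved"

    with ThreadPoolExecutor(max_workers=4) as pool:
        dns = CappedSubmitter(pool, limit=3)
        lookups = [dns.submit(slow_dns_lookup) for _ in range(8)]
        # The fourth worker is still free, so this completes promptly.
        body = pool.submit(lambda: "<html>ok</html>").result(timeout=5)
        release.set()
        results = [f.result(timeout=5) for f in lookups]
    return body, state["peak"], results
```

The design choice here is to gate at submission time. Acquiring a semaphore inside the worker would not help, because the waiting task would already be holding a pool thread.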

As this guy aptly points out, using non-blocking file I/O over blocking file I/O doesn't really affect responsiveness. If the disk is running slow, your app is going to feel unresponsive. The problem here is mixing other I/O on the same thread pool.

Does that make sense? Thoughts?

@txdv
Contributor

txdv commented Dec 3, 2014

DNS lookups can be implemented on top of libuv's UDP implementation, and that will be completely scalable. If you rely on a lot of DNS lookups, you should consider just using a C lib which implements a reasonable API for libuv to consume. An example is here: http://25thandClement.com/~william/projects/libdns.git
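
As a rough illustration of why DNS fits on top of plain UDP, the sketch below encodes a single A-record query in the RFC 1035 wire format and sends it on a non-blocking socket. It is written in Python for brevity and is not libdns or libuv code; the function names are hypothetical, and a real resolver would also need response parsing, retries, timeouts, and TCP fallback.

```python
import socket
import struct


def build_dns_query(hostname, query_id):
    """Encode a one-question DNS query for an A record (RFC 1035)."""
    # Header: ID, flags (0x0100 = recursion desired), QDCOUNT=1,
    # then ANCOUNT, NSCOUNT and ARCOUNT all zero.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    question = b""
    for label in hostname.rstrip(".").split("."):
        encoded = label.encode("ascii")
        if not 0 < len(encoded) <= 63:
            raise ValueError("each DNS label must be 1 to 63 bytes")
        # Each label is a length byte followed by the label itself.
        question += bytes([len(encoded)]) + encoded
    question += b"\x00"                   # root label ends the name
    question += struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def make_udp_socket():
    # Non-blocking, so an event loop can poll it for readability
    # instead of parking a thread on each lookup.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setblocking(False)
    return sock


def send_query(sock, server_ip, packet):
    # DNS servers listen on UDP port 53.
    sock.sendto(packet, (server_ip, 53))
```

Because each lookup is just a datagram out and a datagram in, no thread is held while the answer is pending, which is the scalability argument made above.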

However, when it comes to file I/O, we need an OS mechanism to deal with it, or we are left using a thread pool with blocking calls. We currently use a thread pool on Linux and Unix because the default OS facilities are not sufficient. As for Linux AIO, there is still some concern that deep down the operations are blocking; that is, the current implementation is not yet truly non-blocking, because nobody has implemented it the right way.

@bminer
Author

bminer commented Dec 3, 2014

@txdv - Node.js, for example, utilizes both C-Ares and getaddrinfo for DNS lookups. I (and the Node team) prefer the getaddrinfo route because it utilizes the OS resolver (and considers things like /etc/resolv.conf in Linux world). Deciding how to implement DNS is not the point here.

My point is that getaddrinfo is a blocking call. So are file I/O operations. Therefore, we have the thread pool with blocking calls.

This brings me to my last comment... why not have separate thread pools? DNS/network and the filesystem are two separate resources. Why not have their blocking I/O calls in separate thread pools? Or, perhaps, some hacky-hybrid solution of limiting the number of threads that a given resource-specific I/O request can use (e.g. getaddrinfo calls can only use up 2 of the 4 threads in the pool)?
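
A minimal sketch of the separate-pools idea, again in Python and not libuv code: the pool names and the stand-in task functions are assumptions for illustration. With one pool per resource, a backlog of stalled lookups saturates only the DNS pool.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def slow_dns_lookup(release):
    # Hypothetical stand-in for a getaddrinfo call stuck on a bad network.
    release.wait()
    return "resolved"


def read_static_file():
    # Hypothetical stand-in for a fast disk read.
    return "<html>ok</html>"


def demo_separate_pools():
    release = threading.Event()
    with ThreadPoolExecutor(max_workers=4) as dns_pool, \
            ThreadPoolExecutor(max_workers=4) as fs_pool:
        # Eight stalled lookups: four running, four queued, all in dns_pool.
        lookups = [dns_pool.submit(slow_dns_lookup, release) for _ in range(8)]
        # The filesystem pool is untouched, so the read finishes right away.
        body = fs_pool.submit(read_static_file).result(timeout=5)
        still_pending = sum(1 for f in lookups if not f.done())
        release.set()  # let the lookups finish so both pools can shut down
        for lookup in lookups:
            lookup.result()
    return body, still_pending
```

Here the file read returns while all eight lookups are still pending. The trade-off is more threads overall, and each pool still needs a sensible size for its own workload.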

I also agree that the alternatives (i.e. Linux AIO) are still concerning... and would require some significant work. What I propose about using separate thread pools should be relatively simple, right?

@bminer
Author

bminer commented Dec 3, 2014

Maybe I should rename the title of this issue? :)

@txdv
Contributor

txdv commented Dec 3, 2014

Having separate thread pools is already proposed:

#7

https://github.com/saghul/leps/blob/768747269f445f519ff75fd72f49f5661f29d5fb/XXX-threadpool-handle.md

It even lets you select which thread pool your specific request runs in.

Furthermore, reading resolv.conf is not the hard part - glibc implements it simply ... by just reading it. There is no kernel-level magic behind it.
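
To show how little is involved in the "just reading it" part, here is a small Python sketch that pulls the `nameserver` entries out of resolv.conf-style text. The function name is hypothetical, and this is not glibc's code: it ignores `search`, `options` and the other directives, and it assumes comments start with `#` or `;` in the first column.

```python
def parse_nameservers(text):
    """Return the addresses from `nameserver` lines of resolv.conf text."""
    servers = []
    for line in text.splitlines():
        # Lines with '#' or ';' in the first column are comments.
        if line.startswith(("#", ";")):
            continue
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def read_system_nameservers(path="/etc/resolv.conf"):
    # Plain file read; no special kernel interface is involved.
    with open(path, "r", encoding="ascii", errors="replace") as handle:
        return parse_nameservers(handle.read())
```

For example, `parse_nameservers("# local\nnameserver 192.0.2.1\nsearch example.com\n")` yields only the one server address and skips the comment and `search` lines.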

But yeah, what you propose has already been thought through by @saghul

@bminer
Author

bminer commented Dec 3, 2014

Glad to hear that it's already in the works... or at least that a proposal to fix the problem is in the works.

When will it land downstream in Node.js? (just kidding... sort of...)

@saghul
Member

saghul commented Dec 4, 2014

Hi @bminer, the threadpool has always been a bit of a problem. Your reasoning is sound: it's small and used for too many things. Some solutions were attempted in the past, but now I took a different approach in libuv/leps#4.

It's not yet final, but I hope we can land something following the model proposed there: have a handle which can do the work and push the responsibility to the user. I believe it's just not possible for us to cover all possible cases by trying to be smart in libuv, so we pass the burden to the user :-) You want a huge threadpool for file ops, fine! You want a different one for getaddrinfo / getnameinfo, cool too! So, I think everyone wins if we do it right.

I'm going to close this; please join the discussion in libuv/leps#4 if you have feedback on the proposal.

PS: Hopefully we also have a good story for serial devices sooner than later: #19
