Increased MaxIdleConnsPerHost to prevent excessive re-connections and TIME_WAIT when more than 100 clients are using minio #5860
Conversation
Codecov Report
```
@@            Coverage Diff             @@
##           master    #5860      +/-   ##
==========================================
- Coverage   62.22%   60.06%     -2.17%
==========================================
  Files         129      212       +83
  Lines       24168    30363     +6195
==========================================
+ Hits        15039    18238     +3199
- Misses       7725    10592     +2867
- Partials     1404     1533      +129
```
Continue to review full report at Codecov.
Mint Automation
5860-6f69d19/mint-gateway-s3.sh.log:
5860-6f69d19/mint-dist-xl.sh.log:
5860-6f69d19/mint-fs.sh.log:
5860-6f69d19/mint-gateway-azure.sh.log:
5860-6f69d19/mint-xl.sh.log:
You can simply set this to the ulimit value.
The soft limit on open file descriptors is indeed the first problem you run into. Ubuntu's default is 1024, but I've already increased it. The next limit is hit when Linux runs out of ephemeral ports, because too many short-lived TCP connections are created and they linger in the TIME_WAIT state for about a minute. Once you run out of ports, no new connections can be opened. This is caused by a strange behavior of the HTTP connection pool in Go: if you have more open connections than MaxIdleConnsPerHost, the excess connections are closed immediately and are not available for further HTTP requests. Go then goes on to create new connections for the pending HTTP requests. It's a bit crazy that the HTTP client doesn't realize there are more pending requests that could reuse those connections. See these links:
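For illustration only, here is a minimal sketch of the tuning knob being discussed, raising the per-host idle limit through a custom http.Transport. The numeric limits, timeout, and URL are placeholder assumptions, not the values this PR actually uses:

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newTunedClient returns an http.Client whose Transport keeps more idle
// connections per host, so bursts of concurrent requests reuse existing TCP
// connections instead of churning through ephemeral ports in TIME_WAIT.
// The limits below are illustrative placeholders, not the values in this PR.
func newTunedClient() *http.Client {
	transport := &http.Transport{
		MaxIdleConnsPerHost: 256,  // Go's default is only 2 (http.DefaultMaxIdleConnsPerHost)
		MaxIdleConns:        1024, // cap on the total idle pool across all hosts
		IdleConnTimeout:     90 * time.Second,
	}
	return &http.Client{Transport: transport, Timeout: 2 * time.Minute}
}

func main() {
	client := newTunedClient()
	resp, err := client.Get("https://example.com") // placeholder URL
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```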
We automatically set it to the maximum possible, @cbenien, so you don't have to increase it manually. By default up to 4096 is allowed for each user, which is a big enough number for most networks and disks.
@cbenien I was able to sustain 1000 concurrent GETs to the Azure gateway without any changes to the minio source. Can you check again by just increasing the ulimit for open files?
I've set ulimit to the maximum. Let me add a few more details about my setup. I'm running a single instance of minio on a fairly powerful Azure VM (DS15v2) located in the same region as the storage account, which gives low latency and plenty of bandwidth between minio and Azure Blob Storage. I'm using the following code as a benchmark: https://github.com/wasabi-tech/s3-benchmark The object size is 100K, so I'm not only issuing 1000 concurrent requests but also several thousand requests per second. If the Go HTTP client connection pool is leaking TIME_WAIT connections, this quickly exhausts the available ephemeral ports. Here's an example output (with the referenced pull request applied)
You can monitor the outgoing connections with netstat while the benchmark runs; an inverted grep filters out the incoming connections.
You can also grep for TIME_WAIT to see how many connections are ESTABLISHED versus in TIME_WAIT. Once the ports are exhausted, minio prints countless errors in its output:
I believe the root cause of this problem is in the Go HTTP client stack. It allows unbounded outgoing connections and doesn't do a good job of reusing them. It's also not very efficient to have 1000 outgoing HTTP/1.1 connections to the same server; it would be much better to create 20-100 stable connections and pipeline all requests over them. The issue in the Go repository for this is golang/go#6785
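One commonly suggested client-side workaround (a hypothetical sketch on my part, not something this PR implements) is to bound concurrency so the number of in-flight requests never exceeds MaxIdleConnsPerHost; connections then stay reusable instead of being closed and redialled. The maxInFlight value and URL below are assumptions:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

// Hypothetical workaround sketch (not from this PR): bound client-side
// concurrency with a semaphore so in-flight requests never exceed
// Transport.MaxIdleConnsPerHost; connections then go back to the idle pool
// instead of being closed, which is what piles up TIME_WAIT sockets.
const maxInFlight = 100 // illustrative; keep <= MaxIdleConnsPerHost

var sem = make(chan struct{}, maxInFlight)

func limitedGet(client *http.Client, url string) error {
	sem <- struct{}{}        // acquire a slot
	defer func() { <-sem }() // release it when done

	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	// Drain the body so the underlying connection can be reused.
	_, err = io.Copy(io.Discard, resp.Body)
	return err
}

func main() {
	if err := limitedGet(http.DefaultClient, "https://example.com"); err != nil {
		fmt.Println("request failed:", err)
	}
}
```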
@cbenien thanks for the detailed info.
- due to what appears to be a frequent issue with the Go HTTP client, some tweaks were needed to the HTTP client used for reverse proxying to prevent CoreDNS from rejecting connections. The following PRs / commits implement similar changes in Prometheus and Minio: prometheus/prometheus#3592 minio/minio#5860

  Under a 3-node (1-master) kubeadm cluster running on bare metal with Ubuntu 18.04 I was able to send 100k requests with 1000 being concurrent with no errors being returned by hey.

  ```
  hey -n 100000 -c 1000 -m=POST -d="hi" \
    http://192.168.0.26:31112/function/go-echo
  ```

  The go-echo function is based upon the golang-http template in the function store using the of-watchdog.

  Signed-off-by: Alex Ellis (VMware) <[email protected]>
Description
See #5859
Motivation and Context
See #5859
This is just a proof of concept, and a dynamic solution is probably better than statically increasing these limits. Or maybe a different HTTP transport for each purpose?
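To make the "different HTTP transport for each purpose" idea concrete, here is a hypothetical sketch, not code from this PR: each traffic class gets its own transport so its pool can be sized for its pattern. The names (interNodeClient, gatewayClient) and all numeric values are assumptions:

```go
package main

import (
	"net/http"
	"time"
)

// Hypothetical sketch of per-purpose transports: size each connection pool
// for its traffic pattern instead of one static global limit.
// All values here are illustrative assumptions.
func newTransport(maxIdlePerHost int) *http.Transport {
	return &http.Transport{
		MaxIdleConnsPerHost: maxIdlePerHost,
		IdleConnTimeout:     60 * time.Second,
	}
}

var (
	// Cluster-internal RPC: modest, steady connection count.
	interNodeClient = &http.Client{Transport: newTransport(64)}
	// Gateway-to-backend traffic (e.g. Azure): large bursts of concurrent requests.
	gatewayClient = &http.Client{Transport: newTransport(1024)}
)

func main() {
	_ = interNodeClient
	_ = gatewayClient
}
```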
How Has This Been Tested?
Load test can now survive 1000 threads in parallel
Types of changes
Checklist: