Increased MaxIdleConnsPerHost to prevent excessive re-connections and TIME_WAIT when more than 100 clients are using minio #5860
Conversation
Codecov Report
```
@@            Coverage Diff             @@
##           master    #5860      +/-   ##
==========================================
- Coverage   62.22%   60.06%     -2.17%
==========================================
  Files         129      212       +83
  Lines       24168    30363     +6195
==========================================
+ Hits        15039    18238     +3199
- Misses       7725    10592     +2867
- Partials     1404     1533      +129
```
Continue to review full report at Codecov.
Mint Automation
5860-6f69d19/mint-gateway-s3.sh.log:
5860-6f69d19/mint-dist-xl.sh.log:
5860-6f69d19/mint-fs.sh.log:
5860-6f69d19/mint-gateway-azure.sh.log:
5860-6f69d19/mint-xl.sh.log:
You can simply set this to the ulimit value.
The soft limit on open file descriptors is indeed the first problem you run into. Ubuntu's default is 1024, but I've already increased it. The next limit is hit when Linux runs out of ephemeral ports, because too many short-lived TCP connections are created and they linger in the TIME_WAIT state for about a minute. Once you run out of ports, no new connections can be opened. This is caused by a strange behavior of the HTTP connection pool in Go: if you have more open connections than MaxIdleConnsPerHost, the excess connections are closed immediately and are not available for further HTTP requests. Go then goes on to create new connections for the pending HTTP requests. It's a bit crazy that the HTTP client doesn't realize there are more pending requests that could reuse those connections. See these links:
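For illustration only, here is a minimal sketch of the tuning knob being discussed, raising the per-host idle limit through a custom http.Transport. The numeric limits, timeout, and URL are placeholder assumptions, not the values this PR actually uses:

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newTunedClient returns an http.Client whose Transport keeps more idle
// connections per host, so bursts of concurrent requests reuse existing TCP
// connections instead of churning through ephemeral ports in TIME_WAIT.
// The limits below are illustrative placeholders, not the values in this PR.
func newTunedClient() *http.Client {
	transport := &http.Transport{
		MaxIdleConnsPerHost: 256,  // Go's default is only 2 (http.DefaultMaxIdleConnsPerHost)
		MaxIdleConns:        1024, // cap on the total idle pool across all hosts
		IdleConnTimeout:     90 * time.Second,
	}
	return &http.Client{Transport: transport, Timeout: 2 * time.Minute}
}

func main() {
	client := newTunedClient()
	resp, err := client.Get("https://example.com") // placeholder URL
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```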
We automatically set it to the maximum possible, @cbenien, so you don't have to increase it manually. By default up to 4096 is allowed for each user, which is a big enough number for most networks and disks.
@cbenien I was able to sustain 1000 concurrent GETs to the Azure gateway without any changes to the minio source. Can you check again by just increasing the ulimit for open files?
I've set ulimit to the maximum. Let me add a few more details about my setup. I'm running a single instance of minio on a fairly powerful Azure VM (DS15v2) located in the same region as the storage account, which gives low latency and plenty of bandwidth between minio and Azure Blob Storage. I'm using the following code as a benchmark: https://github.com/wasabi-tech/s3-benchmark The object size is 100K, so I'm not only issuing 1000 concurrent requests but also several thousand requests per second. If the Go HTTP client connection pool is leaking TIME_WAIT connections, this quickly exhausts the available ephemeral ports. Here's an example output (with the referenced pull request applied)
You can monitor the outgoing connections with netstat while the benchmark runs; an inverted grep filters out the incoming connections.
You can also grep for TIME_WAIT to see how many connections are ESTABLISHED versus in TIME_WAIT. Once the ports are exhausted, minio prints countless errors in its output:
I believe the root cause of this problem is in the Go HTTP client stack. It allows unbounded outgoing connections and doesn't do a good job of reusing them. It's also not very efficient to have 1000 outgoing HTTP/1.1 connections to the same server; it would be much better to create 20-100 stable connections and pipeline all requests over them. The issue in the Go repository for this is golang/go#6785
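One commonly suggested client-side workaround (a hypothetical sketch on my part, not something this PR implements) is to bound concurrency so the number of in-flight requests never exceeds MaxIdleConnsPerHost; connections then stay reusable instead of being closed and redialled. The maxInFlight value and URL below are assumptions:

```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

// Hypothetical workaround sketch (not from this PR): bound client-side
// concurrency with a semaphore so in-flight requests never exceed
// Transport.MaxIdleConnsPerHost; connections then go back to the idle pool
// instead of being closed, which is what piles up TIME_WAIT sockets.
const maxInFlight = 100 // illustrative; keep <= MaxIdleConnsPerHost

var sem = make(chan struct{}, maxInFlight)

func limitedGet(client *http.Client, url string) error {
	sem <- struct{}{}        // acquire a slot
	defer func() { <-sem }() // release it when done

	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	// Drain the body so the underlying connection can be reused.
	_, err = io.Copy(io.Discard, resp.Body)
	return err
}

func main() {
	if err := limitedGet(http.DefaultClient, "https://example.com"); err != nil {
		fmt.Println("request failed:", err)
	}
}
```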
@cbenien thanks for the detailed info.
- due to what appears to be a frequent issue with the Go HTTP client, some tweaks were needed to the HTTP client used for reverse proxying to prevent CoreDNS from rejecting connections. The following PRs / commits implement similar changes in Prometheus and Minio: prometheus/prometheus#3592 minio/minio#5860

  Under a 3-node (1-master) kubeadm cluster running on bare metal with Ubuntu 18.04 I was able to send 100k requests with 1000 being concurrent with no errors being returned by hey.

  ```
  hey -n 100000 -c 1000 -m=POST -d="hi" \
    http://192.168.0.26:31112/function/go-echo
  ```

  The go-echo function is based upon the golang-http template in the function store using the of-watchdog.

  Signed-off-by: Alex Ellis (VMware) <[email protected]>
Description
See #5859
Motivation and Context
See #5859
This is just a proof of concept, and a dynamic solution is probably better than statically increasing these limits. Or maybe a different HTTP transport for each purpose?
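To make the "different HTTP transport for each purpose" idea concrete, here is a hypothetical sketch, not code from this PR: each traffic class gets its own transport so its pool can be sized for its pattern. The names (interNodeClient, gatewayClient) and all numeric values are assumptions:

```go
package main

import (
	"net/http"
	"time"
)

// Hypothetical sketch of per-purpose transports: size each connection pool
// for its traffic pattern instead of one static global limit.
// All values here are illustrative assumptions.
func newTransport(maxIdlePerHost int) *http.Transport {
	return &http.Transport{
		MaxIdleConnsPerHost: maxIdlePerHost,
		IdleConnTimeout:     60 * time.Second,
	}
}

var (
	// Cluster-internal RPC: modest, steady connection count.
	interNodeClient = &http.Client{Transport: newTransport(64)}
	// Gateway-to-backend traffic (e.g. Azure): large bursts of concurrent requests.
	gatewayClient = &http.Client{Transport: newTransport(1024)}
)

func main() {
	_ = interNodeClient
	_ = gatewayClient
}
```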
How Has This Been Tested?
Load test can now survive 1000 threads in parallel
Types of changes
Checklist: