Add limit for number of concurrent connections to registry #15569
Conversation
	return true
}

func TestCoutner(t *testing.T) {
typo
"time" | ||
) | ||
|
||
func defaultOverloadHandler(w http.ResponseWriter, r *http.Request) { |
We should be consistent with Kubernetes unless we have a reason not to. If we have a reason not to, it should be documented in comments here.
}

// DefaultOverloadHandler is a default OverloadHandler that used by New.
var DefaultOverloadHandler http.Handler = http.HandlerFunc(defaultOverloadHandler)
Does this need to be public?
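The full handler isn't quoted in this thread; a rough sketch of a default overload handler for this package (not the PR's actual code, the status and body are assumptions) might simply answer 429 Too Many Requests:

```go
package maxconnections

import "net/http"

// defaultOverloadHandler rejects requests that cannot be admitted.
// Responding with 429 Too Many Requests tells well-behaved clients to retry
// later; the exact status and message here are illustrative.
func defaultOverloadHandler(w http.ResponseWriter, r *http.Request) {
	http.Error(w, "too many requests, please try again later", http.StatusTooManyRequests)
}
```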
	var timer *time.Timer
	var timeout <-chan time.Time
	if h.MaxWaitInQueue > 0 {
If MaxWaitInQueue is zero, why wouldn't we exit earlier in the function (i.e., before the h.queue enqueue above)?
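For context on the zero-value semantics under discussion: when MaxWaitInQueue is zero the timeout channel stays nil, and a receive from a nil channel blocks forever, so the later select waits indefinitely instead of rejecting. A standalone sketch of that idiom (names are illustrative, not the PR's code):

```go
package main

import (
	"fmt"
	"time"
)

// waitWithOptionalTimeout illustrates the nil-channel idiom: a receive from a
// nil channel blocks forever, so when maxWait is zero the timeout case can
// never fire and we wait indefinitely for a slot.
func waitWithOptionalTimeout(ready <-chan struct{}, maxWait time.Duration) bool {
	var timeout <-chan time.Time
	if maxWait > 0 {
		timer := time.NewTimer(maxWait)
		defer timer.Stop()
		timeout = timer.C
	}
	select {
	case <-ready:
		return true
	case <-timeout: // never selected while timeout is nil
		return false
	}
}

func main() {
	ready := make(chan struct{}, 1)
	ready <- struct{}{}
	fmt.Println(waitWithOptionalTimeout(ready, 0)) // true: a slot is already available
}
```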
func (h *Handler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	if h.enqueueRunning(r.Context()) {
		defer func() {
this might be better as a non-closure. defer h.done()
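A minimal illustration of the suggestion (the handler type and done method below are stand-ins, not the PR's actual identifiers):

```go
package main

import "fmt"

type handler struct{}

func (h *handler) done() { fmt.Println("released") }

func main() {
	h := &handler{}
	// Equivalent to: defer func() { h.done() }()
	// but the method value avoids the extra closure.
	defer h.done()
	fmt.Println("serving")
}
```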
@@ -137,7 +138,15 @@ func Execute(configFile io.Reader) {
	app.Config.HTTP.Headers.Set("X-Registry-Supports-Signatures", "1")

	app.RegisterHealthChecks()
	handler := alive("/", app)
	handler := http.Handler(app)
The limiter is fine, but this needs to be much more discriminating. Blob HEAD and GET shouldn't be under the same rate limit. This is too broad to solve the existing problem without adding a new one.
This needs to apply only to blob uploads.
I'd be ok with a single quota for a set of methods / paths that only includes uploads, as long as we can identify up front (before we merge) that it addresses the issue at hand. Future changes can expand it.
@smarterclayton afaik HEAD triggers mirroring of blobs, which is equivalent to uploading. Why should that not be rate-limited?
It should, but our registry read traffic is an order of magnitude larger than our write load. A read rate limit is going to dwarf the write limit, which means we'll have to set the read limit too low to preserve our current scale.
If the pull-through proxy needs to be limited, then it's possible we need another limiter wired in (or to connect this limiter to that). However, if this only addresses non-pullthrough traffic, it gets us closer to not broken. We aren't breaking today because of pullthrough, though.
Also, we already have a crude pullthrough limiter that limits max simultaneous writes.
@smarterclayton what I was saying is that HEAD will cause a write to the blob store (we start mirroring and uploading to S3 in the background, bypassing the API and the rate limiter).
@dmage or @legionus can explain it better; it seems like I'm just a proxy here :) maybe you guys should talk.
> we already have a crude pullthrough limiter that limits max simultaneous writes.

It's better to think that we haven't: it has serious leaking problems (and it's not a pull-through limiter, but a storage writer limiter).

> If the pull-through proxy needs to be limited

Pull-through can trigger mirroring, which uses blob writers, which consume memory. So... I don't know, maybe we needn't. Do we want to have the same limiter for mirroring and for PUT/PATCH requests?
Any write (upload) should be part of the same rate limit pool. Otherwise we have to come up with heuristics and split the pool, which then creates more operational overhead.
But right now mirroring is not the primary perf problem we have. Any fix we do should be able to be adapted to cover mirroring, but it's not required that we do that right now.
I've updated this PR. It still has one TODO, but the main idea, I think, will stay the same: use two limiters, one for GET/HEAD requests and another for PATCH/PUT and mirroring.
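A rough sketch of that idea, with a much-simplified limiter and an assumed 429 response for rejected requests (the actual PR wires this differently, supports queueing, and also covers mirroring):

```go
package main

import "net/http"

// limiter is a trivial counting semaphore; the real handler in this PR also
// supports queueing and wait timeouts.
type limiter chan struct{}

func newLimiter(max int) limiter { return make(limiter, max) }

func (l limiter) tryStart() bool {
	select {
	case l <- struct{}{}:
		return true
	default:
		return false
	}
}

func (l limiter) done() { <-l }

// withLimits routes requests to one of two limiters based on the method, as
// described above: one budget for reads (GET/HEAD), another for writes
// (PATCH/PUT and everything else). The split and the 429 are illustrative.
func withLimits(reads, writes limiter, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		l := writes
		if r.Method == http.MethodGet || r.Method == http.MethodHead {
			l = reads
		}
		if !l.tryStart() {
			http.Error(w, "too many requests", http.StatusTooManyRequests)
			return
		}
		defer l.done()
		next.ServeHTTP(w, r)
	})
}

func main() {
	backend := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("ok"))
	})
	http.ListenAndServe(":8080", withLimits(newLimiter(100), newLimiter(50), backend))
}
```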
/test end_to_end
/retest
A few minor comments, but looks ok.
	}

	select {
	case l.running <- struct{}{}:
Can this be FIFO instead of random selection from queued requests? Otherwise I wouldn't call them queued requests but waiting requests.
In the current implementation of Go, it is FIFO (golang/go#11506). While it's not mentioned in the language specification, I don't want to increase complexity because of it. The authors of Go try to avoid problems with "tail latency during bursty load", and I'm fine with that even if it's not genuine FIFO.
@dmage very interesting. In the Go Playground it behaves like FIFO. Locally, with go version go1.8.3 linux/amd64, I get totally different behavior:
go run bufchan.go
reader #0 got message from writer #1
reader #2 got message from writer #3
reader #4 got message from writer #2
reader #5 got message from writer #7
reader #6 got message from writer #4
reader #7 got message from writer #5
reader #8 got message from writer #8
reader #3 got message from writer #6
reader #1 got message from writer #0
reader #9 got message from writer #9
There are conspicuous rising sequences, but it's far from FIFO IMHO.
But I agree, let's put that burden on Go.
You need to use n on line 17, and increase the sleep time; 10 µs might be too low.
My bad, now it looks like FIFO indeed 😉.
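The bufchan.go program itself isn't included in this thread; a hypothetical probe in the same spirit (the counts, sleeps, and output format are assumptions) shows blocked senders being released in arrival order:

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	ch := make(chan int) // unbuffered: writers block until a reader arrives
	const n = 10

	// Start writers in order, pausing so each is parked on its send before
	// the next one starts. 10ms is generous; the thread notes that ~10µs
	// was too short to establish the ordering reliably.
	for i := 0; i < n; i++ {
		go func(w int) { ch <- w }(i)
		time.Sleep(10 * time.Millisecond)
	}

	// Drain the channel. With the current runtime, blocked senders are
	// released first-in-first-out, so reader #i should see writer #i.
	for i := 0; i < n; i++ {
		fmt.Printf("reader #%d got message from writer #%d\n", i, <-ch)
	}
}
```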
Signed-off-by: Oleg Bulatov <[email protected]>
/approve
Yeah, channels are FIFO by design. They can't be changed without breaking existing apps.
/approve
/assign kargakis
/approve
@miminar can you please add an OWNERS file inside pkg/cmd/dockerregistry with your names as approvers? Thanks.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: dmage, kargakis, legionus, miminar. The full list of commands accepted by this bot can be found here.
/test all [submit-queue is verifying that this PR is safe to merge]
Automatic merge from submit-queue
How do we prepare to test this safely in our largest environments? We need to have a plan for verifying this addresses the issue and can safely enable it. Perhaps free int or free stg. We also need a backport for 3.6.1 OSE.
@smarterclayton We need monitoring data to pick the correct rate limits. The question is whether we have statistics on the number of requests over time from these clusters.
Yes, you got 8k builds in 20 minutes, with a rough average of 3 blobs per image pushed. 2-5 minutes in was the highest peak, so assume double or triple the average in that phase.
Let's assume that we have 8000*3 equally distributed uploads in 5 minutes (a very rough estimate; in practice I guess there will be some kind of Poisson distribution). 5 minutes for 8000*3 blobs = a new upload starts every 0.0125 seconds. Let's assume that each upload takes 10 seconds. 10/0.0125 = 800 concurrent uploads. Ok, let's assume that we have 1 GB and each upload uses 20 MB of RAM. 800*20 MB = 16 GB. Oops. So we can allow only 1 GB/20 MB = 50 concurrent uploads and enqueue almost all requests. So our config is:
requests.write.maxrunning: 50
maxinqueue: 24000
maxwaitinqueue: 1200s
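The arithmetic above can be reproduced mechanically; a small sketch using the same rough estimates (these are the discussion's assumptions, not measurements):

```go
package main

import "fmt"

func main() {
	// Rough estimates from the discussion above, not measurements.
	blobs := 8000 * 3      // uploads over the observed window
	windowSec := 5 * 60.0  // 5 minutes
	uploadSec := 10.0      // assumed duration of a single upload
	memPerUploadMB := 20.0 // assumed memory per in-flight upload
	budgetMB := 1000.0     // memory we are willing to spend (~1 GB)

	interval := windowSec / float64(blobs) // time between upload starts
	concurrent := uploadSec / interval     // uploads in flight without a limit
	fmt.Printf("a new upload every %.4fs, ~%.0f concurrent, ~%.0f MB of RAM\n",
		interval, concurrent, concurrent*memPerUploadMB)
	fmt.Printf("maxrunning = %.0f to stay within %.0f MB\n", budgetMB/memPerUploadMB, budgetMB)
}
```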
In the big clusters we have between 20 and 40 GB of slack RAM, but I'd probably prefer to aim for 4-6 GB max on the largest clusters. I'm actually ok with even lower, since we know that EC2 -> S3 saturates (right now) at around 80m/s. The recommendation is 64-128 simultaneous uploaders before hitting saturation in most tests I've seen.
You should think of the channel between EC2 and S3 if you care about tail latency and can scale up. Otherwise your data will be uploaded in a fixed amount of time no matter how it's ordered (the amount of data / egress bandwidth).

If we care about tail latency, then we should set the queue size to zero (clients will be served quicker, but some of them will have to retry). If we don't care, we can decrease the number of 429s by increasing latency.

The value of maxrunning controls the average bandwidth of an upload and the point at which we have to scale up if we don't want to get 429 errors. This value depends on the server's capabilities. But in the end, no matter how the limiter is configured, the cluster will upload all the images in an almost fixed time*. maxrunning reduces peak memory usage; maxinqueue reduces the overhead of retries.

* if clients always try to re-push on failure.
The registry might have excessive resource usage under heavy load. To avoid this, we limit the number of concurrent requests. Requests over the MaxRunning limit are enqueued. Requests are rejected if there are already MaxInQueue requests in the queue. A request may stay in the queue for no more than MaxWaitInQueue.
See also #15448.
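For reference, a condensed sketch of the limiting logic the description outlines, pieced together from the excerpts quoted in the review (field and method names are approximations and may differ from the merged code, which also takes the request context into account):

```go
package maxconnections

import "time"

// Limiter admits at most maxRunning concurrent requests. Up to maxInQueue
// further requests wait for a slot, but no longer than maxWaitInQueue
// (zero means wait indefinitely). Everything else is rejected.
type Limiter struct {
	running        chan struct{}
	queue          chan struct{}
	maxWaitInQueue time.Duration
}

func NewLimiter(maxRunning, maxInQueue int, maxWaitInQueue time.Duration) *Limiter {
	return &Limiter{
		running:        make(chan struct{}, maxRunning),
		queue:          make(chan struct{}, maxInQueue),
		maxWaitInQueue: maxWaitInQueue,
	}
}

// Start reports whether the caller may proceed; callers that get true must
// call Done when finished.
func (l *Limiter) Start() bool {
	select {
	case l.running <- struct{}{}:
		return true
	default:
	}

	// No free slot: try to enter the queue.
	select {
	case l.queue <- struct{}{}:
		defer func() { <-l.queue }()
	default:
		return false // queue is full
	}

	var timeout <-chan time.Time
	if l.maxWaitInQueue > 0 {
		timer := time.NewTimer(l.maxWaitInQueue)
		defer timer.Stop()
		timeout = timer.C
	}

	select {
	case l.running <- struct{}{}:
		return true
	case <-timeout: // never fires while timeout is nil
		return false
	}
}

// Done releases a slot acquired by a successful Start.
func (l *Limiter) Done() { <-l.running }
```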