-
Notifications
You must be signed in to change notification settings - Fork 353
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Circuit breaker #362
Circuit breaker #362
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Few notes on docs.
The Host field indicates to which backend host should the current set of settings be applied. Leaving it empty | ||
indicates global settings. | ||
|
||
Command line name: host. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is glob supported (*
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* The answer is no, but mb it's a featue worth considering..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how would it work? currently if the host is not set, that means global settings (still with individual breaker instances for the hosts)
|
||
The window value sets the size of the sliding counter window of the failure rate breaker. | ||
|
||
Command line name: window. Possible command line values: any positive integer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any defaults worth mentioning here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there are.. but currently they are used from sony/gobreaker and may be replaced in a later wave with https://github.com/ideahitme/circuit-breaker, so considered not declaring the defaults in the docs atm. Basically, 5 consecutive failures, 1 minute timeout, 1 half-open request. The default IdleTTL is documented.
|
||
The failures value sets the max failure count for both the "consecutive" and "rate" breakers. | ||
|
||
Command line name: failures. Possible command line values: any positive integer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any defaults worth mentioning here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see above
Related issues in gobreaker: |
sorry, didn't mean to delete those comments 🤕 |
👍 |
circuit/list.go
Outdated
// It can return the first consecutive items that match a | ||
// condition (in our case, the ones that were inactive for a | ||
// while). | ||
type list struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the first time ever since university, that I see a linked list implementation :-)
Anyway. Why not just use an array? This seems like premature optimisation and just adds fuckup-surface. Please consider just using an array. Yes, it might be slower, no it won't matter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i also been waiting long for the use case ;)
i think it is simply optimization, but not premature, since it is already in the PR. Using an array would be like not using anything, because then we could also just iterate over the timestamps to see which breaker is idle. The reason for the optimization here is that this is in a synchronized scope, happening on every request, and blocking every concurrent request. Why do you think that it won't matter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now that thinking about it.. maybe there is a way to bind the breakers to each route. Will play with that as a separate effort.
} | ||
|
||
settings, _ := c.stateBag[circuitfilters.RouteSettingsKey].(circuit.BreakerSettings) | ||
settings.Host = c.outgoingHost |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can I set breakers for <shunt>
or <loopback>
? Is that desirable behaviour? Or should we check and early escape in those cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we cannot set breakers for <shunt>
and <loopback>
. I think the "shunt", by its name, already breaks the circuit by itself. For the loopback, not sure, but i believe it is enough to see where it loops to.
👍 |
A general question, maybe more like a feature request: If a circuit breaker flips open, is it possible to purge the DNS cache for that host? Seems like a nice-to-have feature... |
@@ -515,13 +547,26 @@ func (p *Proxy) do(ctx *context) error { | |||
ctx.outgoingDebugRequest = debugReq | |||
ctx.setResponse(&http.Response{Header: make(http.Header)}, p.flags.PreserveOriginal()) | |||
} else { | |||
done, allow := p.checkBreaker(ctx) | |||
if !allow { | |||
return errCircuitBreakerOpen |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do I read this correct, that we will only log that a circuit breaker is open, but not which one? If a breaker flips open (or more general flips state), I would like to see a log line stating clearly, which breaker for which host goes from which old state into which new state.
Is this not there or am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
valid point. It should be there.
Besides the logs, the plan is to add an endpoint to the "metrics" -> "support" listener that reports all the breaker states. After #364 gets finished.
purging the DNS cache. Will look into it. |
👍 |
👍 |
👍 |
OP:
In the simplest case, the proxy gets a breaker for the current backend host from the registry, if configured. The breakers are always bound to a backend host, even if they share their settings. There are two breaker types, one opens after N consecutive failures, the other opens when the failure rate crosses a threshold within a sliding window (numeric window, not time based). The idle breakers are automatically dropped from the registry when new breakers need to be created, e.g. when some backend hosts were removed from the configuration but new ones were added.
CONFIG:
DOCS:
https://github.com/zalando/skipper/blob/circuit-breaker/circuit/doc.go