"Your IP is issuing too many concurrent connections" with server UI behind proxy #15471

Closed
shoeffner opened this issue Dec 5, 2022 · 5 comments

Comments

@shoeffner

For a few months (since at least October, but probably earlier) we have routinely been getting "429 Too Many Requests: Your IP is issuing too many concurrent connections, please rate limit your calls", especially when navigating the UI. The error seems to be thrown by:

nomad/command/agent/http.go

Lines 269 to 296 in ee2f3e4

// connLimiter returns a connection-limiter function with a rate-limited 429-response error handler.
// The rate-limit prevents the TLS handshake necessary to write the HTTP response
// from consuming too many server resources.
func connLimiter(connLimit int, logger log.Logger) func(conn net.Conn, state http.ConnState) {
	// Global rate-limit of 10 responses per second with a 100-response burst.
	limiter := rate.NewLimiter(10, 100)
	tooManyConnsMsg := "Your IP is issuing too many concurrent connections, please rate limit your calls\n"
	tooManyRequestsResponse := []byte(fmt.Sprintf("HTTP/1.1 429 Too Many Requests\r\n"+
		"Content-Type: text/plain\r\n"+
		"Content-Length: %d\r\n"+
		"Connection: close\r\n\r\n%s", len(tooManyConnsMsg), tooManyConnsMsg))
	return connlimit.NewLimiter(connlimit.Config{
		MaxConnsPerClientIP: connLimit,
	}).HTTPConnStateFuncWithErrorHandler(func(err error, conn net.Conn) {
		if err == connlimit.ErrPerClientIPLimitReached {
			metrics.IncrCounter([]string{"nomad", "agent", "http", "exceeded"}, 1)
			if n := limiter.Reserve(); n.Delay() == 0 {
				logger.Warn("Too many concurrent connections", "address", conn.RemoteAddr().String(), "limit", connLimit)
				conn.SetDeadline(time.Now().Add(10 * time.Millisecond))
				conn.Write(tooManyRequestsResponse)
			} else {
				n.Cancel()
			}
		}
		conn.Close()
	})
}
From the related commit message ("Return 429 response on HTTP max connection limit. Instead of silently closing the connection [...]"), I guess that's a good sign, as we can now see that something is wrong.

However, this behavior (not the error, but the rate limiting) causes trouble with our setup. As you can see from the logs, all connections to our Nomad come from 127.0.0.1, because we proxy the connections through Fabio. I assume that Nomad could handle far more connections: the limit is counted per IP ("Your IP"), and in our case every call has the same IP:

Dec 05 11:20:17 cluster-server nomad[1491011]:     2022-12-05T11:20:17.108+0100 [WARN]  http: Too many concurrent connections: address=127.0.0.1:58620 limit=100

I found

HTTPMaxConnsPerClient *int `hcl:"http_max_conns_per_client"`
in the source, but not in the docs. Are those "public" settings I should use?
I am not even sure the settings are used by the code in question, although they seem to be set at

nomad/command/agent/http.go

Lines 281 to 283 in ee2f3e4

	return connlimit.NewLimiter(connlimit.Config{
		MaxConnsPerClientIP: connLimit,
	}).HTTPConnStateFuncWithErrorHandler(func(err error, conn net.Conn) {

But the rate limiter has a hard-coded 100 a few lines above that.

How do you handle deployments behind a proxy? Or should we simply not deploy Nomad behind proxies? Or can Nomad use headers such as X-Forwarded-For, Forwarded, etc. to check the connections?

Nomad version

Output from nomad version

Nomad v1.4.2 (f0c64605666324e886377ab897085a015a10a58c+CHANGES)

(We have a custom patch for some mount options, hence the commit might not be accurate -- but this issue is very likely unrelated)

Operating system and Environment details

Ubuntu 20.04.5 LTS (GNU/Linux 5.4.0-131-generic x86_64)
We proxy the Nomad UI through fabio.

Issue

We get rate limited due to our proxy making too many requests to the Nomad server.

Reproduction steps

Deploy Nomad behind a proxy and open multiple connections to it, ideally from different IP addresses to see the impact.

Expected Result

Normal use of the UI should not come to a halt because many users are seen as the same user.

Actual Result

Rate limiting is shared among all users.

Job file (if appropriate)

n/a

Nomad Server logs (if appropriate)

Dec 05 11:20:17 cluster-server nomad[1491011]:     2022-12-05T11:20:17.108+0100 [WARN]  http: Too many concurrent connections: address=127.0.0.1:58620 limit=100

Nomad Client logs (if appropriate)

n/a

@ngcmac

ngcmac commented Feb 2, 2023

Hi,

I'm also facing this issue using Apache as a reverse proxy to the Nomad UI and API.
Is there any chance that MaxConnsPerClient can be set through Nomad config?
Nomad v1.4.3

2023-02-02T17:03:35.072Z [WARN]  http: Too many concurrent connections: address=172.29.201.23:36932 limit=100

Thanks

@tgross tgross self-assigned this Feb 13, 2023
@tgross
Member

tgross commented Feb 13, 2023

Hi @shoeffner! The configuration documentation for those limits can be found under limits, which doesn't have its own sidebar section for some reason. We should probably split that out. But for your environment that's exactly what you'll want to set. The hard-coded value you see in the code of 10 is the requests-per-second-per-IP (with a burst limit of 100), which is used to slow down requests but shouldn't fire errors.

So for your setup you'll want to set http_max_conns_per_client but I would also recommend setting rpc_max_conns_per_client. This is also something that the Task API socket we're shipping in Nomad 1.5.0 (#15864) is designed to help out with by ensuring the proxy task can reuse one connection.
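
For illustration, the two settings named above might look like this in an agent configuration file. This is a sketch: the limits block and both key names come from this thread, but the value 500 is an arbitrary example and not a recommendation (the logs in this issue show the limit currently at 100).

```hcl
# Illustrative values only; pick numbers that fit the proxy's expected
# number of concurrent upstream connections.
limits {
  http_max_conns_per_client = 500
  rpc_max_conns_per_client  = 500
}
```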

That being said, you might also want to take a look at your Fabio configuration. I'm not super familiar with Fabio but I wouldn't expect a load balancer to open new connections to the upstream for every single incoming request.

@shoeffner
Author

Hi @tgross, thanks for pointing me to the docs; I couldn't find them back then. That should close this issue.

We discussed all of this back and forth and decided that a better long-term solution is to remove fabio from the loop and use Consul DNS to directly point at the Nomad servers, which will then be able to properly rate limit the clients etc.
It will also be a much more resilient setup, as Nomad itself will no longer rely on fabio as a single point of failure.

But in the meantime, I will configure the http and rpc max connections, thank you very much!

We will update to 1.5.0 "soonish", we still need to evaluate the new SSO vs our own custom login solution (problems we faced with Vault as a token issuer are detailed in hashicorp/vault#16183). But we will certainly keep an eye out for the Task API.

@thefallentree
Contributor

This is related to #19212


github-actions bot commented Jan 3, 2025

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 3, 2025