Make Web Rate Limits Configurable #42132

Open
smallinsky opened this issue May 29, 2024 · 6 comments
Labels
feature-request Used for new features in Teleport, improvements to current should be #enhancements

Comments

@smallinsky
Contributor

smallinsky commented May 29, 2024

What would you like Teleport to do?

#24623 introduced rate limiting with hardcoded limits for all unauthenticated endpoints. Some endpoints, such as ping/connection/update in the case of a TLS Routing - ALB setup, are called very often.

Certain workflows, such as custom scripts or Ansible playbooks, make many API calls. They quickly exhaust this limit and receive HTTP 429 errors, and they have no retry mechanism.

Currently, Web API limits are not configurable and hardcoded values are used:

Period: defaults.LimiterHighPeriod,

Rate limiting values should be configurable via:

Limits ConnectionLimits `yaml:"connection_limits,omitempty"`

What problem does this solve?

It allows handling custom intensive workloads without being hindered by rate limiting.

smallinsky added the feature-request label May 29, 2024
@webvictim
Contributor

Related: #34611

@milos-teleport
Contributor

User impact

Users, especially those with Teleport assets behind a NAT or behind proxies not configured to pass the X-Forwarded-For HTTP header, experience this as the following error (version 14.3.20):

Jan 01 00:00:00 teleport.example.com teleport[5555555]: 2024-01-01T00:00:00Z ERRO [SSH:PROXY] too many connections from 111.111.111.111: 15000, max is 15000 pid:5555555.1 sshutils/server.go:420

Source of line 420 in version 14.3.20 / 16.1.3 equivalent

Workarounds

In the case of misconfigured reverse proxies or ingresses in Kubernetes deployments, the fix is simply to ensure that the X-Forwarded-For HTTP header is passed.

For deployments where many assets sit behind a NAT relative to the Teleport Proxy, a potential workaround is to put a TLS-terminating reverse proxy in front of Teleport that caches /webapi/_ping and /webapi/_find. I am not sure whether this workaround could introduce other issues, though.

Another NAT workaround is to reconfigure the network so that Teleport assets do not go through the NAT. However, this is usually not a straightforward fix, and in many cases it is not possible for business or technical reasons.

@zmb3
Collaborator

zmb3 commented Aug 7, 2024

We removed these rate limits in #42799, so this should no longer be an issue.

@rafalr-ntropy

rafalr-ntropy commented Aug 19, 2024

@zmb3 one of our team members noticed that #42799 was released as part of https://github.com/gravitational/teleport/releases/tag/v15.4.3. As I mentioned, we still get HTTP 429 in some scenarios on 15.4.6 when traffic increases. Example logs from the proxy component are below:

2024-08-19T11:01:48Z INFO [APP:WEB]   Round trip: GET /api/admin-session, code: 429, duration: 7.144981ms tls:version: 304, tls:resume:false, tls:csuite:1301, tls:server:some-panel.my.domain reverseproxy/reverse_proxy.go:223
2024-08-19T11:01:49Z INFO [APP:WEB]   Round trip: GET /api/admin-session, code: 429, duration: 24.127282ms tls:version: 304, tls:resume:false, tls:csuite:1301, tls:server:some-panel.my.domain reverseproxy/reverse_proxy.go:223
2024-08-19T11:04:13Z INFO [APP:WEB]   Round trip: GET /api/admin-session, code: 429, duration: 7.264448ms tls:version: 304, tls:resume:false, tls:csuite:1301, tls:server:some-panel.my.domain reverseproxy/reverse_proxy.go:223
2024-08-19T11:06:11Z INFO [APP:WEB]   Round trip: GET /api/admin-session, code: 429, duration: 15.177025ms tls:version: 304, tls:resume:false, tls:csuite:1301, tls:server:some-panel.my.domain reverseproxy/reverse_proxy.go:223
2024-08-19T11:06:13Z INFO [APP:WEB]   Round trip: GET /api/admin-session, code: 429, duration: 10.228534ms tls:version: 304, tls:resume:false, tls:csuite:1301, tls:server:some-panel.my.domain reverseproxy/reverse_proxy.go:223
2024-08-19T11:06:15Z INFO [APP:WEB]   Round trip: GET /api/admin-session, code: 429, duration: 14.90199ms tls:version: 304, tls:resume:false, tls:csuite:1301, tls:server:some-panel.my.domain reverseproxy/reverse_proxy.go:223
2024-08-19T11:21:09Z INFO [APP:WEB]   Round trip: GET /api/admin-session, code: 429, duration: 16.648579ms tls:version: 304, tls:resume:false, tls:csuite:1301, tls:server:some-panel.my.domain reverseproxy/reverse_proxy.go:223
2024-08-19T11:22:06Z INFO [APP:WEB]   Round trip: GET /api/admin-session, code: 429, duration: 29.755457ms tls:version: 304, tls:resume:false, tls:csuite:1301, tls:server:some-panel.my.domain reverseproxy/reverse_proxy.go:223
2024-08-19T11:22:07Z INFO [APP:WEB]   Round trip: GET /api/admin-session, code: 429, duration: 8.964458ms tls:version: 304, tls:resume:false, tls:csuite:1301, tls:server:some-panel.my.domain reverseproxy/reverse_proxy.go:223
2024-08-19T11:22:38Z INFO [APP:WEB]   Round trip: GET /api/admin-session, code: 429, duration: 13.810573ms tls:version: 304, tls:resume:false, tls:csuite:1301, tls:server:some-panel.my.domain reverseproxy/reverse_proxy.go:223
2024-08-19T11:22:45Z INFO [APP:WEB]   Round trip: GET /api/admin-session, code: 429, duration: 6.648519ms tls:version: 304, tls:resume:false, tls:csuite:1301, tls:server:some-panel.my.domain reverseproxy/reverse_proxy.go:223
2024-08-19T11:22:46Z INFO [APP:WEB]   Round trip: GET /api/admin-session, code: 429, duration: 23.211894ms tls:version: 304, tls:resume:false, tls:csuite:1301, tls:server:some-panel.my.domain reverseproxy/reverse_proxy.go:223
2024-08-19T11:23:07Z INFO [APP:WEB]   Round trip: GET /api/admin-session, code: 429, duration: 12.627802ms tls:version: 304, tls:resume:false, tls:csuite:1301, tls:server:some-panel.my.domain reverseproxy/reverse_proxy.go:223
2024-08-19T11:23:08Z INFO [APP:WEB]   Round trip: GET /api/admin-session, code: 429, duration: 10.193507ms tls:version: 304, tls:resume:false, tls:csuite:1301, tls:server:some-panel.my.domain reverseproxy/reverse_proxy.go:223
2024-08-19T11:23:09Z INFO [APP:WEB]   Round trip: GET /api/admin-session, code: 429, duration: 9.184715ms tls:version: 304, tls:resume:false, tls:csuite:1301, tls:server:some-panel.my.domain reverseproxy/reverse_proxy.go:223
2024-08-19T11:23:41Z INFO [APP:WEB]   Round trip: GET /api/admin-session, code: 429, duration: 14.876981ms tls:version: 304, tls:resume:false, tls:csuite:1301, tls:server:some-panel.my.domain reverseproxy/reverse_proxy.go:223
2024-08-19T11:24:42Z INFO [APP:WEB]   Round trip: GET /api/admin-session, code: 429, duration: 7.945361ms tls:version: 304, tls:resume:false, tls:csuite:1301, tls:server:some-panel.my.domain reverseproxy/reverse_proxy.go:223
2024-08-19T11:39:32Z INFO [APP:WEB]   Round trip: GET /api/admin-session, code: 429, duration: 10.106674ms tls:version: 304, tls:resume:false, tls:csuite:1301, tls:server:some-panel.my.domain reverseproxy/reverse_proxy.go:223
2024-08-19T11:43:12Z INFO [APP:WEB]   Round trip: GET /api/admin-session, code: 429, duration: 9.444735ms tls:version: 304, tls:resume:false, tls:csuite:1301, tls:server:some-panel.my.domain reverseproxy/reverse_proxy.go:223
2024-08-19T11:51:05Z INFO [APP:WEB]   Round trip: GET /api/admin-session, code: 429, duration: 16.41478ms tls:version: 304, tls:resume:false, tls:csuite:1301, tls:server:some-panel.my.domain reverseproxy/reverse_proxy.go:223
2024-08-19T11:52:51Z INFO [APP:WEB]   Round trip: GET /api/admin-session, code: 429, duration: 13.262224ms tls:version: 304, tls:resume:false, tls:csuite:1301, tls:server:some-panel.my.domain reverseproxy/reverse_proxy.go:223

@fheinecke
Contributor

@zmb3 I've been hitting this at home when some automation I have curls the webapi auth cert export endpoint. To reproduce, run this short script:

$ while ! curl -s -o /dev/null -w '%{http_code}' 'https://platform.teleport.sh/webapi/auth/export' | grep 429; do
    printf '.'
done
.............................429

Would you be opposed to me removing this limit here? This should be a pretty easy change. If you're okay with it, I can file a PR next week.

@zmb3
Collaborator

zmb3 commented Mar 2, 2025

@fheinecke why do you need to hit the export endpoint so many times in such a short timespan?

I think all unauthenticated endpoints that do any non-trivial amount of work should continue to be rate limited to prevent DoS, but we can discuss changing the limits if the current ones are preventing legitimate use cases from functioning properly.

6 participants