Introduce per service+partition concurrency limits #2760

Open
igalshilman opened this issue Feb 19, 2025 · 2 comments

Comments

@igalshilman
Contributor

We currently have an invoker concurrency limit that is used to defend the restate-server. We also need a concurrency limit to defend a target service (or maybe an endpoint).
While supporting a global concurrency limit is a bit more challenging, I suggest introducing a per-partition+target limit. Users can do their capacity planning accordingly, or re-route requests that need a strict limit to a specific key (hence pinning them to a partition).
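
To make the proposal concrete, here is a minimal sketch of what a per-partition+target limit could look like. Everything in it is an assumption for illustration: `PartitionTargetLimiter`, `PartitionId`, and the method names are hypothetical and not taken from the restate codebase, and a real invoker would likely queue invocations rather than reject them.

```rust
use std::collections::HashMap;

/// Hypothetical partition identifier; the issue does not define this type.
type PartitionId = u32;

/// Counts in-flight invocations per (partition, target service) pair
/// and refuses new ones once a pair reaches the configured limit.
#[derive(Debug)]
pub struct PartitionTargetLimiter {
    /// Maximum concurrent invocations allowed for each (partition, target) pair.
    limit_per_partition: usize,
    /// Current in-flight count; pairs with no in-flight invocations are absent.
    in_flight: HashMap<(PartitionId, String), usize>,
}

impl PartitionTargetLimiter {
    pub fn new(limit_per_partition: usize) -> Self {
        Self {
            limit_per_partition,
            in_flight: HashMap::new(),
        }
    }

    /// Reserves a slot and returns true if the pair is below its limit;
    /// returns false without changing state otherwise.
    pub fn try_acquire(&mut self, partition: PartitionId, target: &str) -> bool {
        let key = (partition, target.to_string());
        let count = self.in_flight.get(&key).copied().unwrap_or(0);
        if count >= self.limit_per_partition {
            return false;
        }
        self.in_flight.insert(key, count + 1);
        true
    }

    /// Releases one slot for the pair; a no-op if none is held.
    pub fn release(&mut self, partition: PartitionId, target: &str) {
        let key = (partition, target.to_string());
        let count = self.in_flight.get(&key).copied().unwrap_or(0);
        if count <= 1 {
            self.in_flight.remove(&key);
        } else {
            self.in_flight.insert(key, count - 1);
        }
    }

    /// Number of invocations currently in flight for the pair.
    pub fn in_flight(&self, partition: PartitionId, target: &str) -> usize {
        let key = (partition, target.to_string());
        self.in_flight.get(&key).copied().unwrap_or(0)
    }
}
```

With a limit of 2, a third `try_acquire(0, "greeter")` returns false while `try_acquire(1, "greeter")` still succeeds, because each partition tracks the target independently. That independence is what makes the limit cheap to enforce without cross-partition coordination.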

@slinkydeveloper
Contributor

slinkydeveloper commented Feb 20, 2025

This might be easily served if we do #2432

@slinkydeveloper
Contributor

In an offline conversation, we discussed the following 3 situations:

  • Protecting the runtime from overload/OOM. For this purpose we can use the current invoker concurrency limit; since it is per partition, this should be enough. We can also employ additional strategies, such as the one in Rethink inactivity timeout under high load #2761.
  • Protecting the service deployments/endpoints from the flood of invocations generated by the runtime. For this purpose we can add a per-service-deployment tunable that is implemented by the invoker and behaves exactly like the invoker concurrency limit, but on a service deployment basis. This limit would again be per partition, so the effective limit is the user-configured value * the number of partitions (we can experiment with how best to let the user configure this value).
  • Defining fine-grained concurrency limits for Service handlers, or for Virtual Object/Workflow shared handlers. This is a semantic feature that belongs in the partition processor and connects to the broader discussion of concurrency limits.
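
For the second situation, the arithmetic behind the effective limit can be sketched as follows. The function names are hypothetical, and deriving the per-partition value from a desired total is only one possible way of exposing the setting to users, not something decided in this thread.

```rust
/// Upper bound on concurrent invocations a deployment can see when every
/// partition enforces `per_partition_limit` independently.
pub fn effective_limit(per_partition_limit: u64, num_partitions: u64) -> u64 {
    per_partition_limit.saturating_mul(num_partitions)
}

/// Assumed helper: derives a per-partition value from the total concurrency
/// the user wants a deployment to see. Rounds down so the effective limit
/// never exceeds `target_total`. Returns None when there are no partitions
/// or when the total is too small to give every partition at least one slot.
pub fn per_partition_limit_for(target_total: u64, num_partitions: u64) -> Option<u64> {
    if num_partitions == 0 {
        return None;
    }
    let per_partition = target_total / num_partitions;
    if per_partition == 0 {
        None
    } else {
        Some(per_partition)
    }
}
```

For example, a configured value of 10 with 24 partitions gives an effective limit of 240. Going the other way, a desired total of 100 across 24 partitions rounds down to 4 per partition, for an effective limit of 96. A total smaller than the partition count cannot be expressed with a per-partition limit at all, which is one reason the configuration surface needs some thought.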


2 participants