We currently have an invoker concurrency limit that is used to defend the restate-server. We also need a concurrency limit to defend a target service (or perhaps an endpoint).
While supporting a global concurrency limit is more challenging, I suggest introducing a per-partition+target limit. Users can do their capacity planning accordingly, or re-route strict requests to a specific key (hence pinning them to a partition).
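To make the idea concrete, here is a minimal sketch of what a per-partition, per-target limiter could look like. All names (`TargetLimiter`, `try_acquire`, `release`) are hypothetical, not Restate's actual invoker API; a real implementation would keep rejected invocations queued and retry them.

```rust
use std::collections::HashMap;

/// Sketch of a per-target concurrency limiter, one instance per partition.
/// Hypothetical names; Restate's real invoker internals differ.
struct TargetLimiter {
    limit: usize,
    in_flight: HashMap<String, usize>,
}

impl TargetLimiter {
    fn new(limit: usize) -> Self {
        Self { limit, in_flight: HashMap::new() }
    }

    /// Try to claim an invocation slot for `target`; returns false when the
    /// target is already at its limit (the invoker would keep the invocation
    /// queued and retry it later instead of dropping it).
    fn try_acquire(&mut self, target: &str) -> bool {
        let n = self.in_flight.entry(target.to_string()).or_insert(0);
        if *n >= self.limit {
            return false;
        }
        *n += 1;
        true
    }

    /// Release the slot when the invocation completes or is suspended.
    fn release(&mut self, target: &str) {
        if let Some(n) = self.in_flight.get_mut(target) {
            *n = n.saturating_sub(1);
        }
    }
}
```

Since each partition holds its own limiter, a single overloaded target cannot starve invocations to other targets in the same partition.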
After an offline conversation, we discussed the following three situations:

1. Protecting the runtime from overload/OOM. For this purpose we can use the current invoker concurrency limit, given that it is per partition, and this should be enough. We can also employ additional strategies such as this one: Rethink inactivity timeout under high load #2761
2. Protecting the service deployments/endpoints from the flood of invocations generated by the runtime. For this purpose, we can add a tunable per service deployment, implemented by the invoker and behaving exactly like the invoker concurrency limit, but on a per-deployment basis. This limit would again be per partition, so the effective limit is the configured user value * the number of partitions (we can play with how to let the user best configure this value).
3. Granularly defining a concurrency limit for service handlers, or for virtual object/workflow shared handlers. This is a semantic feature that lives in the partition processor and connects to the broader thread on concurrency limits.
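The effective-limit arithmetic in the second situation is worth spelling out: because the deployment limit is enforced independently in each partition, the cluster-wide ceiling is the configured value multiplied by the number of partitions. If instead we want users to express a global ceiling, the invoker would have to divide it down, roughly like this (hypothetical helper, not an existing config option):

```rust
/// Derive a per-partition limit from a desired cluster-wide ceiling.
/// Rounds up so the per-partition value never drops to zero, which means
/// the effective global limit may slightly exceed the requested one.
fn per_partition_limit(desired_global_limit: usize, num_partitions: usize) -> usize {
    desired_global_limit.div_ceil(num_partitions).max(1)
}
```

For example, a requested global ceiling of 100 across 64 partitions yields a per-partition limit of 2, i.e. an effective ceiling of 128; this over-admission is the price of avoiding cross-partition coordination.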