CA: implement prioritization/rate limiting #334

Closed
howardjohn opened this issue Jan 12, 2023 · 4 comments

@howardjohn
Member

Depends on #298

Currently we have 3 sources of CA calls:

  1. On demand: a request is incoming and we need a cert right now.
  2. Background refresh: we have a cert and it is near expiration, so we need to refresh it. By default certs have a 24hr lifetime and refresh starts at 12hr.
  3. Prewarming: a new workload appears (or ztunnel just started) and we want to preload the cert to reduce latency on the first call (avoid "cold starts").

In order of importance, this likely looks like: On Demand >>>> Background refresh when really close to expiration > Prewarming == Background refresh. This could be simplified to just "on demand is top priority".

Additionally, CA requests are expensive. 255 concurrent requests to 1 istiod would almost certainly overwhelm it (in CPU cost); other CAs may have different constraints. We are also not the only client. We need a sensible strategy that balances not overwhelming the CA against getting certs when we need them.
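One way to picture the "don't overwhelm the CA" half of this is a cap on in-flight requests. The following is only a minimal std-only sketch, not ztunnel's actual code: `CaLimiter`, `Permit`, and `max_in_flight` are hypothetical names, and the right limit would depend on the CA.

```rust
use std::sync::atomic::{AtomicUsize, Ordering as AtomicOrdering};
use std::sync::Arc;

/// Hypothetical limiter that caps the number of concurrent CA requests.
#[derive(Clone)]
pub struct CaLimiter {
    in_flight: Arc<AtomicUsize>,
    max_in_flight: usize,
}

/// Held for the duration of one CA call; dropping it frees the slot.
pub struct Permit {
    in_flight: Arc<AtomicUsize>,
}

impl Drop for Permit {
    fn drop(&mut self) {
        self.in_flight.fetch_sub(1, AtomicOrdering::AcqRel);
    }
}

impl CaLimiter {
    pub fn new(max_in_flight: usize) -> Self {
        Self {
            in_flight: Arc::new(AtomicUsize::new(0)),
            max_in_flight,
        }
    }

    /// Returns a permit if a slot is free, or `None` if the cap is reached.
    pub fn try_acquire(&self) -> Option<Permit> {
        let mut current = self.in_flight.load(AtomicOrdering::Acquire);
        loop {
            if current >= self.max_in_flight {
                return None;
            }
            // Claim a slot only if nobody changed the counter in between.
            match self.in_flight.compare_exchange_weak(
                current,
                current + 1,
                AtomicOrdering::AcqRel,
                AtomicOrdering::Acquire,
            ) {
                Ok(_) => {
                    return Some(Permit {
                        in_flight: Arc::clone(&self.in_flight),
                    })
                }
                Err(observed) => current = observed,
            }
        }
    }

    /// Number of CA requests currently holding a permit.
    pub fn in_flight(&self) -> usize {
        self.in_flight.load(AtomicOrdering::Acquire)
    }
}
```

A caller that gets `None` would have to wait or retry; which requests get to wait at the front of the line is the prioritization question.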

At the very least, we should have a way to prioritize on-demand requests. This could be something simple like adding some delay to prewarming, or using a priority queue.
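As a rough illustration of the priority queue option, here is a minimal std-only sketch. It is an assumption about how this could look, not the implementation: `Priority`, `CertRequest`, and `CertQueue` are hypothetical names, and the three levels simply mirror the ordering above (on demand first, then urgent refresh, then prewarming and routine refresh together).

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// Why a cert is being requested. Variants are declared from lowest to
/// highest priority so the derived `Ord` matches that ordering.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Priority {
    /// Prewarming or a routine background refresh.
    Background,
    /// Background refresh for a cert that is really close to expiration.
    UrgentRefresh,
    /// A request is waiting on this cert right now.
    OnDemand,
}

#[derive(Debug, PartialEq, Eq)]
pub struct CertRequest {
    pub priority: Priority,
    /// Insertion counter, used to keep FIFO order within one priority.
    pub seq: u64,
    /// Workload identity the cert is for (hypothetical field).
    pub identity: String,
}

impl Ord for CertRequest {
    fn cmp(&self, other: &Self) -> Ordering {
        self.priority
            .cmp(&other.priority)
            // Reversed: an older (smaller) seq must compare as greater,
            // because `BinaryHeap` is a max-heap.
            .then_with(|| other.seq.cmp(&self.seq))
            .then_with(|| self.identity.cmp(&other.identity))
    }
}

impl PartialOrd for CertRequest {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

/// Pending CA requests, popped highest priority first.
#[derive(Default)]
pub struct CertQueue {
    heap: BinaryHeap<CertRequest>,
    next_seq: u64,
}

impl CertQueue {
    pub fn push(&mut self, priority: Priority, identity: &str) {
        let seq = self.next_seq;
        self.next_seq += 1;
        self.heap.push(CertRequest {
            priority,
            seq,
            identity: identity.to_string(),
        });
    }

    pub fn pop(&mut self) -> Option<CertRequest> {
        self.heap.pop()
    }
}
```

With this shape, a worker that pops from `CertQueue` before each CA call always serves on-demand requests first, and prewarming only runs when nothing more urgent is queued. The simpler "add some delay to prewarming" option would avoid the queue entirely, at the cost of less precise ordering.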

@bleggett
Contributor

bleggett commented Feb 2, 2023

Dumb q - do we (or do we intend to) support the SDS protocol so that ztunnel could play nice with a compatible replacement workload CA/SDS, like the rest of Istio does today?

@howardjohn
Member Author

I started #251 to define these integration interfaces. I hadn't put SDS in there but it's plausible.

I will say in general the idea has been to have a few common integration points. For example, instead of 10 telemetry providers just use OTEL, which itself supports the kitchen sink of providers.

There is less of a standard around CAs compared to telemetry though.

@keithmattix
Contributor

I've seen some semblance of this in the code; has this been implemented?

@howardjohn
Member Author

yeah I think this is done. thanks!

yuval-k pushed a commit to yuval-k/ztunnel that referenced this issue Aug 22, 2023
* Allow mismatched ns/hostnames and pick randomly based on services for the dst workload

Signed-off-by: Kevin Dorosh <[email protected]>

* Fix lint

Signed-off-by: Kevin Dorosh <[email protected]>

* Fix lint, again

Signed-off-by: Kevin Dorosh <[email protected]>

* Update test to cover multi namespace multi network

Signed-off-by: Kevin Dorosh <[email protected]>

* Remove dead code

Signed-off-by: Kevin Dorosh <[email protected]>

---------

Signed-off-by: Kevin Dorosh <[email protected]>