feat(service): horizontal scaling #3178
You can access the deployment of this PR at https://renku-ci-rp-3178.dev.renku.ch
We have to set resource requests for all containers in the core service so that the pod autoscaler works. I propose the following resource requests:
I looked at the historical memory consumption in Gi over the last 90 days from renkulab.io.
What do these changes mean:
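To illustrate why the autoscaler depends on resource requests, here is a minimal sketch of the documented Kubernetes Horizontal Pod Autoscaler scaling rule, `desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)`. Utilization is measured relative to the request, so it is undefined for a pod in which any container lacks one. The function names and all numbers below are hypothetical and are not the values proposed in this PR; the real controller also averages across pods and applies a tolerance, which this sketch leaves out.

```python
import math


def pod_utilization_percent(containers):
    """Return pod utilization in percent: total usage relative to total requests.

    `containers` is a list of (usage, request) pairs in the same unit.
    A missing request (None) makes utilization undefined, which is why every
    container in the pod needs a request for utilization-based autoscaling.
    """
    if any(request is None or request <= 0 for _, request in containers):
        raise ValueError("utilization is undefined without a request on every container")
    total_usage = sum(usage for usage, _ in containers)
    total_request = sum(request for _, request in containers)
    return 100.0 * total_usage / total_request


def desired_replicas(current_replicas, current_utilization, target_utilization):
    """Simplified HPA rule: scale replicas in proportion to utilization / target."""
    return math.ceil(current_replicas * current_utilization / target_utilization)


# Hypothetical pod with two containers, values in millicores: (usage, request).
pod = [(300, 400), (150, 100)]
utilization = pod_utilization_percent(pod)
print(utilization, desired_replicas(2, utilization, 60))
```

With a hypothetical target of 60% and two replicas running at 90% utilization, the rule asks for three replicas. If either container had no request, the calculation could not be made at all.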
This is how the routing changes:
Current:
flowchart LR
Browser
subgraph Ingress [Ingress]
IngressRenku[http://renkulab.io/ui-server/api/renku]
end
subgraph k8s[k8s cluster]
UI[UI-server]
subgraph Gateway
GatewayTraefik[Gateway traefik]
GatewayAuth[Gateway-auth]
end
subgraph CoreSvc[Core Service Pod]
Core
end
end
Browser -- 1 --> IngressRenku
IngressRenku -- 2 --> UI
UI -- 3 --> GatewayTraefik
GatewayTraefik -- 4 --> GatewayAuth
GatewayAuth -- 5 --> GatewayTraefik
GatewayTraefik -- 6 --> Core
New:
flowchart LR
Browser
subgraph Ingress [Ingress]
IngressRenku[http://renkulab.io/ui-server/api/renku]
IngressCore[http://renkulab.io/api/renku]
end
subgraph k8s[k8s cluster]
UI[UI-server]
subgraph Gateway
GatewayTraefik[Gateway traefik]
GatewayAuth[Gateway-auth]
end
subgraph CoreSvc[Core Service Pod]
Core
Traefik
end
end
Browser -- 1 --> IngressRenku
IngressRenku -- 2 --> UI
UI -- 3 --> GatewayTraefik
GatewayTraefik -- 4 --> IngressCore
IngressCore -- 5 --> Traefik
Traefik -- 6 --> GatewayAuth
GatewayAuth -- 7 --> Traefik
Traefik -- 8 --> Core
The gateway uses traefik to do the routing, and traefik cannot assign sticky session cookies here: it only sees the address of the k8s service, and the round-robin load balancing that the k8s service performs is not visible to traefik. The k8s ingress, however, does know which actual replica the k8s service will use, so it can assign the sticky session cookie. That is why requests now have to go through the ingress for sticky sessions to work. In the new version of the routing, the core service's traefik container has to go to the gateway to get authenticated and to exchange the JWT for any other token it needs.
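The difference between the two routing layers can be sketched with a small simulation. It is a toy model under stated assumptions, not the actual traefik, k8s service, or ingress implementation: the class names, the replica names, and the cookie name `route` are all hypothetical.

```python
import itertools


class RoundRobinService:
    """Stands in for a k8s service: callers see one address, not the replicas."""

    def __init__(self, replicas):
        self._cycle = itertools.cycle(replicas)

    def pick(self):
        # The caller has no say in which replica handles the request.
        return next(self._cycle)


class StickyIngress:
    """Stands in for an ingress that knows each replica and pins clients by cookie."""

    COOKIE = "route"  # hypothetical cookie name

    def __init__(self, replicas):
        self._replicas = list(replicas)
        self._cycle = itertools.cycle(self._replicas)

    def route(self, cookies):
        replica = cookies.get(self.COOKIE)
        if replica not in self._replicas:
            # First request (or stale cookie): choose a replica and remember it.
            replica = next(self._cycle)
            cookies[self.COOKIE] = replica
        return replica


replicas = ["core-0", "core-1", "core-2"]  # hypothetical replica names

# Behind the plain service, consecutive requests land on different replicas.
service = RoundRobinService(replicas)
via_service = [service.pick() for _ in range(3)]

# Through the sticky ingress, each client stays on the replica it was first given.
ingress = StickyIngress(replicas)
client_a, client_b = {}, {}
via_ingress_a = [ingress.route(client_a) for _ in range(3)]
via_ingress_b = [ingress.route(client_b) for _ in range(3)]

print(via_service)
print(via_ingress_a, via_ingress_b)
```

In this model a proxy that only holds the service address has nothing replica-specific to put in a cookie, whereas the component that picks the concrete replica can record that choice and honor it on later requests.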
Results from load testing:
Migrations
File uploads
@Panaetius this is good to go. I cannot approve because I opened the PR in the first place.
Does this require refreshing SwissDataScienceCenter/renku-ui#2134?
No. Although the versions list is no longer served by nginx but by the individual core service, the content/URL shouldn't have changed (
/deploy renku=core-svc-horizontal-scaling renku-gateway=core-sticky-sessions #persist