-
Notifications
You must be signed in to change notification settings - Fork 813
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GCE ingress improvements and documentation extension on github and profile #1136
Conversation
Thank you @hnykda ❤️! It seems great to me so far, I will try find time to review this properly but I'm swamped right now. One question that lingers in my mind: is it possible to reference a servicePort by name rather than by its actual port number from the ingress? I'll read up on this. |
Good question. I have never seen it in any examples, reference API isn't really exhaustive on this (there is no type specified). So don't know. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great!
Unfortunately, I haven't been able to get it to work. When I deploy it, I get an error attempting to access with http:
Error: Server Error
The server encountered a temporary error and could not complete your request.
Please try again in 30 seconds.
and connection closed
when attempting to access with https.
Any idea what might be wrong?
I used your config exactly as-is with my hostname and added my static ip for the ingress. Unfortunately, debugging GCE ingress is a lot harder than nginx, since there don't appear to be any logs to look at.
ingress: | ||
enabled: true | ||
annotations: | ||
# kubernetes.io/ingress.global-static-ip-name: YOUR-STATIC-IP-NAME |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this static ip annotation required? If so, it might make sense to mention the steps required to get it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it is required, you should get some automatically assigned, don't you? (I tested that only with my reserved address though).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, the YOUR-STATIC-IP-NAME
refer to the name of a reserved ip address rather than the actual IP?
Btw, to reserve IPs on GCP, see: https://cloud.google.com/compute/docs/ip-addresses/reserve-static-external-ip-address
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly.
This could be omitted (hence the comment) but I guess that it is quite common usecase that people want to expose it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah - let's find a place to link to this for more details:
https://cloud.google.com/kubernetes-engine/docs/tutorials/http-balancer#step_5_optional_configuring_a_static_ip_address
Thanks for testing! Did you wait at least ~5 minutes? It takes some time, I was also surprised at how long it takes. The GCP load balancers are lazy. |
hey all - friendly ping on this one...@minrk or @consideRatio did waiting a bit improve the original problems getting this to work? |
Hey here 👋 . Is there anything I can do to get this merged? This with #1112 is all we need to get merged to use this deployment on GCP without issues. Right now, we must vendorize the charts :-( . |
- example.com | ||
secretName: jupyterhub-tls | ||
``` | ||
*Note*: Ingress must have a default backend, it can be equivalent to the default jupyterhub path. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The GCE Ingress Controller may require this, but I don't require this on my setup with the ingress-nginx-controller.
The key questions about this PR lies here I think.
- In what way does the GCE ingress controller fail if we lack this default backend?
- How did you learn about 1? Is there documentation about this?
- Will adding a default backend influence the function of the ingress-nginx controller?
I'll investigate further, right now, about what I can learn about these things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the text I've read, I don't think we need a default backend, and if we do it seems to me to be a technical hack only that I'd like to understand properly.
Adding a default backend should probably be a 404 backend so myhub.com/lkasjdflajksdf goes to 404 rather than the hub etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do I need a default backend?
All GCE URL maps require at least one default backend, which handles all requests that don't match a host/path. In Ingress, the default backend is optional, since the resource is cross-platform and not all platforms require a default backend. If you don't specify one in your yaml, the GCE ingress controller will inject the default-http-backend Service that runs in the kube-system namespace as the default backend for the GCE HTTP lb allocated for that Ingress resource.
Some caveats concerning the default backend:
- It is the only Backend Service that doesn't directly map to a user specified NodePort Service
- It's created when the first Ingress is created, and deleted when the last Ingress is deleted, since we don't want to waste quota if the user is not going to need L7 loadbalancing through Ingress
- It has a HTTP health check pointing at /healthz, not the default /, because / serves a 404 by design
NOTE: This text came from here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be useful:
gcloud compute http-health-checks list
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can I configure GCE health checks through the Ingress?
Currently health checks are not exposed through the Ingress resource, they're handled at the node level by Kubernetes daemons (kube-proxy and the kubelet). However the GCE L7 lb still requires a HTTP(S) health check to measure node health. By default, this health check points at / on the nodePort associated with a given backend. Note that the purpose of this health check is NOT to determine when endpoint pods are overloaded, but rather, to detect when a given node is incapable of proxying requests for the Service:nodePort altogether. Overloaded endpoints are removed from the working set of a Service via readiness probes conducted by the kubelet.
If / doesn't work for your application, you can have the Ingress controller program the GCE health check to point at a readiness probe as shows in this example.
We plan to surface health checks through the API soon.
Is this section relevant to us?
However the GCE L7 lb still requires a HTTP(S) health check to measure node health. By default, this health check points at / on the nodePort associated with a given backend.
NOTE: This text came from here, and had a broken link that pointed to deleted documentation because it was supposedly covered someplace in GCP now, based on the commit message deleting it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I read correctly, the ingress controller hardcoding of /healthz
is for the ingress's default backend, which isn't relevant to the Hub's health check endpoint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm but the default backend will not respond if you have an ingress rule, right?
Hmmm.... This warps my brain.
kubernetes.io/ingress.allow-http: "false" | ||
hosts: | ||
- example.com # change it to you domain | ||
pathSuffix: "*" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I learned:
zero-to-jupyterhub-k8s/jupyterhub/templates/ingress.yaml
Lines 14 to 24 in 03127a9
spec: | |
rules: | |
{{- range $host := .Values.ingress.hosts }} | |
- host: {{ $host | quote }} | |
http: | |
paths: | |
- path: {{ $.Values.hub.baseUrl }}{{ $.Values.ingress.pathSuffix }} | |
backend: | |
serviceName: proxy-public | |
servicePort: 80 | |
{{- end }} |
So, this will influence the rule of what to match, by saying: match everything. Hmmm, this is the default behavior for ingress-nginx I believe. I mean, I arrive at the destination even though i request https://myhubdomain.com/hub/home
for example. Is this another GCE Ingress Controller vs nginx-ingress-controller deviation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, they behaves differently. See my following pr: #988
Review summary@hnykda thank you so much for your work on this! I see the value of having clear instructions and ability to use the GCE ingress controller. I tried reading up to understand more details about using the GCE Ingress Controller to act on the Kubernetes Ingress resources, which are pointless without a controller to consider them. I've only used the z2jh helm chart's ingress resource along with the nginx-ingress controller and want this PR to make everything compatible with that as well. The key questions I ended up with are:
|
#### GCE Ingress and manual HTTPS | ||
To set up HTTPS with GCE Ingress, you must: | ||
1. offload the proxy, enable `https` (and ideally, change it from | ||
external `LoadBalancer` to close it from outside world - we have ingress for that) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are suggesting a change from LoadBalancer
to NodePort
. Do we need NodePort
over ClusterIP
?
Yikes it becomes quite complicated to figure things out with so many parts in play.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it is: kubernetes/kubernetes#26508 (comment), I didn't try that myself though.
Thanks for the great review, I totally understand it's a bit rough and may need more love. The differences between nginx-controller and gce-controller have been driving us crazy in different projects, not just here. GCE controller has sometimes weird behaviour and often lacks some basic features such as configurable healthchecks or http -> https redirect 🙄 ... I am feeling dumb, but I can't reliably answer your questions. Not that I didn't try, but I couldn't find out why that's happening. @dn0 any chance you may help us out here? I am happy to experiment to test some hypothesis, how could we scope it down? P.S. To make our deployment websocket friendly, we added
but that's probably for another PR |
@hnykda right now we both fail to answer but I'll try choose not to feel dumb about it ;D This requires plenty of knowledge about software that we have not developed (the gce ingress controller for example), lots of networking knowledge, kubernetes knowledge, helm knowledge, even nginx-ingress-controller as we want things to work for that as well, perhaps TLS termination stuff, this specific helm chart, etc... This is like biology if physics would be something like understanding of how a routing rule works! If we could manage to not need to change the default backend that would be awesome, then we could merge this right away, if we do need to change the default backend I feel I'd like more understanding about what goes on and what that could entail for all the deployments out there that would be affected by the change. I think this resource was quite useful for me to learn more about the usage of the GCE ingress controller, and from there as you said they are suggesting the use of NodePort over ClusterIP services btw! |
Nice description 😄 . Ok! Let me try not to change the default backend and see what happens! 😄 |
So I tried. I removed that default backend from the ingress template, deleted the ingress resource from the cluster and redeployed.
These are the healthchecks and LBs: If I change the default backend healthcheck which is the failing one (because it tries the default If I explicitly specify default backend as in the template, it creates only one backend (and therefore healthcheck) corresponding to FYI, it really takes ages until ingress picks up even after the backend is showing 3/3 instances healthy. It's necessary to wait at least 5-10 minutes 🙄 . That's explaining the @minrk post here imo. All you get till then is 502 timeout. Anything else I could do? |
Ah! Excellent investigation! Then I think what we should do is:
|
Cool. Happy to continue, as you know, #1004 is something I am interested in as well... By "make this PR to document" you mean extending the documentation somehow? Happy to do that, but could you be a little bit more specific? |
@hnykda oh I meant kinda what this PR already is, but add some notes about what we have picked up. Attempted summary of my current understanding
|
OK. Can I just wait until we have the |
@hnykda sounds great! |
Any chance someone could ping me when it's done or is there a way how I can find myself? |
@hnykda I'll ping you! Oh, and you could also subscribe to the issues and PRs somehow perhaps. |
@hnykda You can use a check similar to this: https://github.com/helm/charts/blob/master/stable/drone/templates/ingress.yaml#L24 to make the ingress more universal. |
kubernetes.io/ingress.class: gce | ||
kubernetes.io/ingress.allow-http: "false" | ||
hosts: | ||
- example.com # change it to you domain |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
super nits: typo you
→ your
To set up HTTPS with GCE Ingress, you must: | ||
1. offload the proxy, enable `https` (and ideally, change it from | ||
external `LoadBalancer` to close it from outside world - we have ingress for that) | ||
2. create TLS secrets same as for other Ingresses (e.g. [nginx](https://kubernetes.github.io/ingress-nginx/user-guide/tls/) ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, it's better to note that you can use Google manged certificates or pre-shared certs (not k8s resource) instead of k8s secrets.
I updated my config.yaml following the instructions https://github.com/jupyterhub/zero-to-jupyterhub-k8s/commit/331f1cbf3045b837481fe50ccd588e4e919c3e74?diff=unified. I didn't put the my static ip name by the way. However, once I updated the config.yaml, I couldn't open the jupyterhub page (I waited for more than 5 minutes). When I opened the page on Safari, it showed "Safari can't open the page because the server where this page is located isn't responding". Can anyone help me out on that? |
|
||
#### GCE Ingress and manual HTTPS | ||
To set up HTTPS with GCE Ingress, you must: | ||
1. offload the proxy, enable `https` (and ideally, change it from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to enable https, it's handled at Ingress rather than proxy
.
By setting enabled: true, type: offload
, the proxy
service
- opens port 443, but ingress never send request to it
- is expected to terminate SSL, but GKE service has no such feature
so it's useless.
Click here to see how `proxy` service works for each https type
Legend: [*]
is where SSL termination happens.
enabled: false
svc: proxy pod: proxy
┌────────────┐ ┌────────────┐
─ 80 ──────────┼───────────────────────────────────── 8000 │
│ │ │ │
└────────────┘ └────────────┘
enabled: true, type: manual/secret
svc: proxy pod: proxy
┌────────────┐ ┌────────────┐
─ 80 ──────────┼───────────────────────────────────── 8000 │
─ 443 ─────────┼───────────────────────────────────── 8443─[*]→ redirect
└────────────┘ └────────────┘
enabled: true, type: offload
svc: proxy pod: proxy
┌────────────┐ ┌────────────┐
─ 80 ───────┬──┼───────────────────────────────────── 8000 │
─ 443 ──[*]─┘ │ │ │
└────────────┘ └────────────┘
enabled: true, type: auto
svc: proxy pod: proxy
┌────────────┐ ┌────────────┐
─ 80 ──────────┼─┐ ┌ 8000 │
─ 443 ─────────┼┐│ pod: autohttps svc: autohttps │ │ │
└────────────┘││ ┌────────────┐ ┌────────────┐ │ └────────────┘
│└ 80 ───────┬──┼─── 8000 ────────┼─┘
└─ 443 ──[*]─┘ │ │ │
└────────────┘ └────────────┘
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you saying that I should remove enabled: true, type: offload in the proxy section and set up https following the directions below?
ref: https://github.com/jupyterhub/zero-to-jupyterhub-k8s/blob/master/doc/source/security.md#off-loading-ssl-to-a-load-balancer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want to use GCP's HTTPS load balancer,
remove enabled: true, type: offload in the proxy section
Yes. (actually no problem to set that, but it's pointless)
set up https following the directions below?
No, that direction is not for you. It's for who want to set up https using service of type: LoadBalancer
, such as AWS ELB.
Note that k8s has 2 ways of representations of load balancers:
- Ingress
- e.g.: GCP's HTTP(S) LB
- Service of
type: LoadBalancer
- e.g.: AWS's ELB/NLB, GCP's TCP LB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any specific instructions or any documentation that I can follow in order to build load balancer for k8s? Thanks for the help!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
build loadbalancer
? What do you mean by that? There is a service of type: LoadBalancer
or an Ingress
resource, which can also be a loadbalancer (usually used for SSL purposes or to have single loadbalancer for many services).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean that I want to let my JupyterHub site to have a loadbalancer. What should I do? I read the section off-loading SSL to a load balancer from this website, but I am still confused on what should I do with my config.yaml in order to set up loadbalancer in the Google cloud platform.
Also is it necessary to offload SSL to a load balancer for JupyterHub? What is the purpose of doing that?
Any help would be appreciated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not necessary. E.g. if you are in a trusted network (e.g. at home) you don't need SSL (at least I wouldn't bother).
The reason for it for me was that we had SSL handled by Ingress, therefore there is no reason why to let SSL on for anything beyond Ingress. That's why it's off-loading. The proxy
is by default a proper k8s Service
resource with type LoadBalancer
, if that's what you want. Then there is a k8s Ingress resource, which as different type of a LoadBalancer. It's a bit confusing.
@zxcGrace 5 minutes may not be enough, google recommends 10-20 (don't ask me why 🙄 ). |
@hnykda I waited for more than 30 min and it is still not working. My config.yaml file looks like this:
Is there anything I did wrong? |
Well, did you change that |
I changed that to my domain and it was still not working. I wonder if it is the correct step to do in order to enable load balancer for the Google cloud platform. |
And any reports from the ingress? What was it's status? What did you get as an error message? Any more info than just "does not work" appreciated. |
No error reported when I updated my config file with the command The only thing went wrong was the site couldn't be opened, and I am not sure where to find reports from the ingress. |
"The site couldn't be opened" mean HTTP error 50X or 40X? |
Thank you for your work everyone. I've spent considerable amount of time reviewing and considering this PR already, and upon inspecting it again at this point in time, I consider it to be problematic to merge. Some key points for my stance:
My understanding summarized:
Suggested action point: Thank you everyone for your work deliberating on this! ❤️ Feel free to discuss changes we could make, but for now I'll go ahead and close this PR. I encourage opening an issue with a proposed change or a PR to configure the ingress resources |
Attempt to summarize by @consideRatio
About using the GCE Ingress Controller
This PR adds documentation (331f1cb) and two lines of code (87380cc) to use the GCE Ingress Controller to manage the optional ingress k8s resources that the z2jh helm chart can provide.
Additional parts (LGTM)
This PR also clarifies setup of GitHub authentication (60268f5) and adds an example to use the kubespawner's profile list feature (63fd107).
#1125