Affinity cookie not updated if invalid cookie is sent #3317


Closed
joushx opened this issue Oct 29, 2018 · 18 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@joushx

joushx commented Oct 29, 2018

Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see https://kubernetes.io/docs/tasks/debug-application-cluster/troubleshooting/.):

No

What keywords did you search in NGINX Ingress controller issues before filing this one? (If you have found any duplicates, you should instead reply there.):

affinity, session, sticky, update, cookie


Is this a BUG REPORT or FEATURE REQUEST? (choose one):

BUG REPORT

NGINX Ingress controller version:

since 0.18.0

Kubernetes version (use kubectl version):

Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.1", GitCommit:"4ed3216f3ec431b140b1d899130a69fc671678f4", GitTreeState:"clean", BuildDate:"2018-10-05T16:46:06Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.2", GitCommit:"bb9ffb1654d4a729bb4cec18ff088eacc153c239", GitTreeState:"clean", BuildDate:"2018-08-07T23:08:19Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Cloud provider or hardware configuration: Bare metal
  • OS (e.g. from /etc/os-release): Ubuntu 16.04.4 LTS (Xenial Xerus)
  • Kernel (e.g. uname -a): Linux Ubuntu-1604-xenial-64-minimal 4.15.0-33-generic #36~16.04.1-Ubuntu SMP Wed Aug 15 17:21:05 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
  • Install tools:
  • Others:

What happened:

No Set-Cookie header is returned in the response

What you expected to happen:

A Set-Cookie header containing a new affinity hash should be returned

How to reproduce it (as minimally and precisely as possible):

Send a request with Cookie: INGRESSCOOKIE=foobar

Anything else we need to know:

Worked until version 0.18.

0.18:

*   Trying 148.251.XXX.XXX...
* TCP_NODELAY set
* Connected to **********.com (148.251.XXX.XXX) port 444 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=*.********.com
*  start date: Oct 17 06:25:15 2018 GMT
*  expire date: Jan 15 06:25:15 2019 GMT
*  subjectAltName: host "********" matched cert's "*.********.com"
*  issuer: C=US; O=Let's Encrypt; CN=Let's Encrypt Authority X3
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x55d70debd8e0)
> GET /management/health HTTP/2
> Host: ***********.com:444
> User-Agent: curl/7.58.0
> Accept: */*
> Cookie: cv-sid=aa; Domain=**********.com; Path=/; HttpOnly
> 
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
< HTTP/2 200 
< server: nginx/1.15.2
< date: Mon, 29 Oct 2018 12:18:34 GMT
< content-type: application/vnd.spring-boot.actuator.v1+json;charset=UTF-8
< x-application-context: ********:kubernetes:8443
< x-content-type-options: nosniff
< x-xss-protection: 1; mode=block
< cache-control: no-cache, no-store, max-age=0, must-revalidate
< pragma: no-cache
< expires: 0
< strict-transport-security: max-age=15724800; includeSubDomains

0.17.1:

*   Trying 148.251.XX.XX...
* TCP_NODELAY set
* Connected to *******.com (148.251.XX.XX) port 444 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: /etc/ssl/certs
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=********.com
*  start date: Oct 17 06:25:15 2018 GMT
*  expire date: Jan 15 06:25:15 2019 GMT
*  subjectAltName: host "***********.com" matched cert's "*.********.com"
*  issuer: C=US; O=Let's Encrypt; CN=Let's Encrypt Authority X3
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x558083bdb8e0)
> GET /management/health HTTP/2
> Host: ***********.com:444
> User-Agent: curl/7.58.0
> Accept: */*
> Cookie: cv-sid=aa; Domain=**********.com; Path=/; HttpOnly
> 
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
< HTTP/2 200 
< server: nginx/1.13.12
< date: Mon, 29 Oct 2018 12:16:15 GMT
< content-type: application/vnd.spring-boot.actuator.v1+json;charset=UTF-8
< set-cookie: cv-sid=45cfdc84c88a75d0370efa9d2813aafca983ccb7; Path=/; HttpOnly
< x-application-context: ********:kubernetes:8443
< x-content-type-options: nosniff
< x-xss-protection: 1; mode=block
< cache-control: no-cache, no-store, max-age=0, must-revalidate
< pragma: no-cache
< expires: 0
< strict-transport-security: max-age=15724800; includeSubDomains
@joushx joushx changed the title Affinity cookie not set if invalid cookie is sent or pod does not exist anymore Affinity cookie not updated if invalid cookie is sent Oct 29, 2018
@joushx
Author

joushx commented Oct 29, 2018

I found this PDF. According to its flowchart, after the step "Is there an upstream
server corresponding to the 'route' cookie?", a new cookie should be set.

@ElvinEfendi
Member

@joushx that PDF is no longer relevant because we don't use nginx-sticky-module anymore. We have a Lua implementation (https://github.com/kubernetes/ingress-nginx/blob/master/rootfs/etc/nginx/lua/balancer/sticky.lua), and it's not as complete as the nginx module. Currently there's no way to tell whether the cookie is invalid; the Lua code needs some changes.
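To illustrate the behaviour being discussed, here is a hypothetical Python sketch (not the actual sticky.lua code; the names and the hashing scheme are placeholders): the balancer only distinguishes "cookie present" from "cookie absent", so an arbitrary value like foobar still hashes to some upstream and is never recognised as invalid, which is why no new Set-Cookie is issued.

```python
import hashlib

def pick_upstream(key, upstreams):
    # Any string hashes to *some* upstream, so every cookie value "works".
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return upstreams[digest % len(upstreams)]

def handle_request(cookie_value, upstreams):
    """Return (chosen_upstream, set_cookie_value_or_None)."""
    if cookie_value is None:
        # No cookie yet: issue one. ("new-key" stands in for a random id.)
        new_key = "new-key"
        return pick_upstream(new_key, upstreams), new_key
    # Cookie present: used as-is. There is no check that this value was
    # ever issued by the controller, hence no Set-Cookie on the response.
    return pick_upstream(cookie_value, upstreams), None
```

Under this sketch, a request carrying Cookie: INGRESSCOOKIE=foobar takes the second branch and gets no Set-Cookie back, matching the reported behaviour.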

@joushx
Author

joushx commented Oct 31, 2018

Thank you. Since we need this functionality, we have switched to Traefik in the meantime.

@joushx joushx closed this as completed Oct 31, 2018
@ElvinEfendi
Member

/reopen

@k8s-ci-robot
Contributor

@ElvinEfendi: Reopening this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot reopened this Oct 31, 2018
@ElvinEfendi
Member

@joushx if you don't mind, I'd like to keep this issue open in case someone decides to address it.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 29, 2019
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 28, 2019
@efedunin

Same problem with version 0.23.0: an affinity cookie modified by the user is not regenerated by the NGINX ingress controller.
This contradicts the statement in the docs:

If the user changes this cookie, NGINX creates a new one and redirect the user to another upstream.

Any plans to fix this bug?

@ElvinEfendi
Member

@efedunin nginx will likely pick a new upstream; however, it won't regenerate the cookie (yes, the documentation is not up to date).

What's the bug here, the misleading documentation or the fact that Nginx does not recreate the cookie?

@ElvinEfendi
Member

@joushx can you help us understand why you need to verify the validity of the cookie? The sole purpose of cookie affinity is to steer clients to the same upstream (which happens), and if a user deliberately decides to change the cookie they might end up at a different upstream.

@aledbf aledbf removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Mar 27, 2019
@joushx
Author

joushx commented Mar 27, 2019

Our problem is that upstream servers may become unavailable. In this case clients should be redirected to a different server.

@ElvinEfendi
Member

ElvinEfendi commented Mar 27, 2019

@joushx if an upstream server becomes unavailable, it is removed from the consistent-hash ring, and subsequent requests with the same cookie will be routed to a different upstream chosen from the ring.

In other words, the ingress-nginx implementation of cookie affinity does not store an upstream address/ID in the cookie; it stores a random unique key (per sticky client) that is mapped to an available upstream using a consistent-hashing algorithm. So you will not run into the issue you describe.
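The mapping described above can be sketched with a toy consistent-hash ring in Python (a simplified illustration, not the actual balancer code; upstream addresses and the point count are made up):

```python
import bisect
import hashlib

def _hash(s):
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class HashRing:
    """Toy consistent-hash ring: each upstream owns several points on the
    ring; a cookie key maps to the first point at or after its hash."""

    def __init__(self, upstreams, points_per_upstream=50):
        self._ring = sorted(
            (_hash(f"{u}#{i}"), u)
            for u in upstreams
            for i in range(points_per_upstream)
        )
        self._hashes = [h for h, _ in self._ring]

    def lookup(self, cookie_key):
        i = bisect.bisect(self._hashes, _hash(cookie_key)) % len(self._ring)
        return self._ring[i][1]

    def remove(self, upstream):
        # Simulates an endpoint disappearing: its points leave the ring,
        # and keys that mapped to it fall through to the next upstream.
        self._ring = [(h, u) for h, u in self._ring if u != upstream]
        self._hashes = [h for h, _ in self._ring]
```

Looking up the same cookie key always yields the same upstream while it is alive; after removing that upstream from the ring, the same key resolves to another one. This is why a "stale" cookie still routes somewhere valid and there is no invalid-cookie state to detect.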

However, note that until version 0.23.0 we had an issue where we were not refreshing the list of upstreams when an upstream was added or removed (became unavailable); this was fixed in #3809. So you may observe the incorrect behaviour you described in earlier versions, but as of 0.23.0 everything should work as expected.

@efedunin

What's the bug here, the misleading documentation or the fact that Nginx does not recreate the cookie?

Sorry for the disturbance. I was confused by the outdated docs.
After reading about the consistent-hashing algorithm (please note that the link to ketama on the annotations info page is dead), I understand that all cookie values are valid. 😃

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 25, 2019
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jul 25, 2019
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
