
Delay needed before calling asyncServerStop()? #31

Closed
alanchristensen opened this issue Jan 31, 2018 · 7 comments

alanchristensen commented Jan 31, 2018

I tried to use terminus for some projects at work, but during an application update some requests were getting dropped.

I was able to fix this by adding a delay before calling asyncServerStop().

Have you guys seen this as well?

I've created https://github.com/alanchristensen/kubeplayground to help you reproduce what I'm seeing.

Once you've got the app running, throw some load at it using wrk:

wrk -d 2m http://localhost:8080/ -H 'Connection: Close'

and then run the update script to rebuild the image and deploy it.

In wrk I then see:

Running 2m test @ http://localhost:8080/
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   206.54ms    6.36ms 270.10ms   91.07%
    Req/Sec    23.85      9.68    50.00     64.11%
  1378 requests in 2.00m, 199.16KB read
  Socket errors: connect 0, read 14, write 0, timeout 0
Requests/sec:     11.47
Transfer/sec:      1.66KB

The 14 read errors are the dropped requests. I think what's going on is that Kubernetes continues to send traffic to a container for a short while even after it sends the container SIGTERM.

gergelyke (Collaborator) commented Feb 1, 2018

Thanks a lot for the report, Alan!

alanchristensen (Author) commented

@gergelyke I've pushed a change to my kubeplayground repo to fix a logging statement.

I can eliminate those read errors by changing the process.on(signal, cleanup); to process.on(signal, () => { setTimeout(cleanup, 10000); });
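
For reference, a minimal sketch of that workaround (a plain Node http server is assumed here; the 10-second delay is just the value that worked in my testing, not a recommended constant):

const http = require('http');

const server = http.createServer((req, res) => res.end('ok'));
server.listen(8080);

function cleanup() {
  // Stop accepting new connections; in-flight requests can finish.
  server.close(() => process.exit(0));
}

['SIGTERM', 'SIGINT'].forEach((signal) => {
  process.on(signal, () => {
    // Delay shutdown so kube-proxy has time to stop routing
    // traffic to this pod before the server closes.
    setTimeout(cleanup, 10000);
  });
});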

gergelyke added the bug label Feb 9, 2018
JacopoDaeli (Contributor) commented

I am looking into this @alanchristensen @gergelyke.

gergelyke (Collaborator) commented

@JacopoDaeli do you have any update on this one?

JacopoDaeli (Contributor) commented

@gergelyke, unfortunately I haven't had time to look into this yet, but I will as soon as possible.

peterkuiper (Contributor) commented Jun 8, 2018

There is a long discussion here. This "lag" is caused by kube-proxy: your pod is shutting down but is still receiving traffic.

There is no synchronization between the SIGTERM handler and the health check. If I understand correctly, your health check should start failing once you receive SIGTERM, and the process should wait a little before shutting down. That way, traffic that reaches your pod after the SIGTERM is still handled; see this comment.

Some people use preStop handlers in their k8s config, which, I think, is basically what you are doing with the setTimeout. It's a bit "hacky", but it seems to work.
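
A preStop hook along those lines might look like this in the pod spec (a sketch only; the image name is a placeholder, and the sleep length is arbitrary and assumes a sleep binary exists in the container image):

containers:
  - name: app
    image: my-app:latest  # placeholder image name
    lifecycle:
      preStop:
        exec:
          # Kubernetes runs this before sending SIGTERM, so the
          # container keeps serving while endpoints are updated.
          command: ["sleep", "10"]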

A solution could be to add a flag, isShutdown or similar, and do something like this in your health check:

const terminus = require('@godaddy/terminus');
...

// inside the health check handler (isShutdown is a proposed flag,
// not an existing terminus export):
if (terminus.isShutdown) {
  // fail the health check, e.g. by throwing/rejecting
  throw new Error('server is shutting down');
}
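
Putting those pieces together, a rough end-to-end sketch (here isShutdown is a plain local flag rather than a terminus export, and the 10-second grace period is illustrative):

const http = require('http');

let isShutdown = false;

const server = http.createServer((req, res) => {
  if (req.url === '/healthcheck') {
    // Start failing the check as soon as shutdown begins so
    // Kubernetes stops routing new traffic to this pod.
    res.statusCode = isShutdown ? 503 : 200;
    return res.end();
  }
  res.end('ok');
});

server.listen(8080);

process.on('SIGTERM', () => {
  isShutdown = true;
  // Keep serving during a grace period while kube-proxy notices
  // the failing health check, then drain and close.
  setTimeout(() => server.close(() => process.exit(0)), 10000);
});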

gergelyke (Collaborator) commented

ah, thanks a lot @peterkuiper for this!

I'll put together a PR addressing this.
