Do not report problems until kube-apiserver is ready #295

Closed
yguo0905 opened this issue Jun 18, 2019 · 2 comments · Fixed by #308

Comments

yguo0905 commented Jun 18, 2019

In #288, we changed NPD to run custom plugins on startup. I hoped this would let NPD always report an event immediately after the cluster is created, no matter how long the invoke_interval is.

However, this does not always work because of how NPD interacts with kube-apiserver. Here is what I observed during cluster creation:

  1. NPD started and invoked the custom plugin immediately, and then sent an event to kube-apiserver.
  2. The event could not be sent because kube-apiserver was not running yet. The event library retries sending the event.
    Unable to write event: 'Post https://x.x.x.x/api/v1/namespaces/default/events: dial tcp 34.68.6.201:443: connect: connection refused' (may retry after sleeping)
  3. kube-apiserver started.
  4. The event was re-sent to kube-apiserver but was rejected this time without further retry because of a permission error:
    events is forbidden: User "system:node-problem-detector" cannot create resource "events" in API group "" in the namespace "default"' (will not retry!)
  5. https://github.com/kubernetes/kubernetes/blob/c8b45cd25c18e65798dde49fc7011495ea6021d5/cluster/gce/gci/configure-helper.sh#L568 was called to set up the permission.

There is a small window between (3) and (5): if the event is rejected during that interval, it is never resent.
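
To make the failure mode concrete, here is a minimal sketch of the retry behavior described in steps (1) to (5). It is not the real event library code: `sendWithRetry`, `errConnRefused`, and `errForbidden` are hypothetical names, and the only behavior assumed is what the two log lines show (connection errors are retried, a permission error is not).

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical error values standing in for the two failures seen in the logs.
var (
	errConnRefused = errors.New("connect: connection refused")
	errForbidden   = errors.New("events is forbidden")
)

// sendWithRetry calls send up to maxAttempts times. A connection error is
// retried; a permission error drops the event immediately.
func sendWithRetry(send func(attempt int) error, maxAttempts int) (delivered bool, attempts int) {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err := send(attempt)
		switch {
		case err == nil:
			return true, attempt
		case errors.Is(err, errForbidden):
			// "will not retry!": the event is dropped for good.
			return false, attempt
		default:
			// "may retry after sleeping": try again (sleep omitted in this sketch).
		}
	}
	return false, maxAttempts
}

func main() {
	// Timeline from the issue: apiserver down on attempt 1, up but without
	// the permission on attempt 2, permission granted from attempt 3 on.
	timeline := func(attempt int) error {
		switch attempt {
		case 1:
			return errConnRefused
		case 2:
			return errForbidden
		default:
			return nil
		}
	}
	delivered, attempts := sendWithRetry(timeline, 5)
	fmt.Println(delivered, attempts) // false 2
}
```

The event would have been delivered on the third attempt, but the sender gives up on the second.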

Changing the event library to always retry on permission errors may or may not make sense. What we can do in NPD is introduce a configurable initial_delay for custom plugins. In this case, I could set it to 1m while keeping invoke_interval at 6h, so the plugin would first run 1m after NPD starts.

/cc @wangzhen127 @Random-Liu

yguo0905 commented Jul 8, 2019

We've discussed 3 ways to solve the problem.

  1. Add a configurable timeout option to the K8s exporter. On startup, NPD will NOT run any plugins (and will block in K8s exporter creation) until either apiserver is ready or the timeout expires.

    • NPD metrics pipeline will be unnecessarily blocked for the timeout duration if NPD cannot connect to apiserver.
    • This doesn't solve the issue where some plugin must run with an initial delay, which is irrelevant to apiserver's availability.
  2. Similar to (1) but, instead of blocking NPD, we accumulate the events in k8s exporter.

    • Not easy to implement - we need to think about how to store the events and send them in a batch (considering QPS) when apiserver becomes ready.
    • This doesn't solve the issue where some plugin must run with an initial delay, which is irrelevant to apiserver's availability.
  3. Support a configurable initial delay in custom plugins. Instead of solving the problem in the exporter, we deal with it on the plugin side.

    • This doesn't work for built-in plugins, but we can extend it in the future if needed.
    • Solves both problems, is simple to implement, and does not affect the metrics pipeline.
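
For reference, option (2) could look roughly like the sketch below. It only illustrates the storage and batching questions raised above; `eventBuffer`, its drop-oldest policy, and the per-flush batch size are all assumptions, not anything that exists in NPD.

```go
package main

import "fmt"

// eventBuffer is a hypothetical bounded queue that holds events while
// apiserver is unreachable. When full, it drops the oldest event.
type eventBuffer struct {
	capacity int
	events   []string
}

// add appends an event, evicting the oldest one if the buffer is full.
func (b *eventBuffer) add(e string) {
	if b.capacity <= 0 {
		return
	}
	if len(b.events) == b.capacity {
		b.events = b.events[1:]
	}
	b.events = append(b.events, e)
}

// flush sends at most batchSize buffered events in order, stopping at the
// first error, and returns how many were sent. Calling it once per tick
// keeps the request rate bounded.
func (b *eventBuffer) flush(batchSize int, send func(string) error) int {
	sent := 0
	for sent < batchSize && len(b.events) > 0 {
		if err := send(b.events[0]); err != nil {
			break
		}
		b.events = b.events[1:]
		sent++
	}
	return sent
}

func main() {
	b := &eventBuffer{capacity: 2}
	for _, e := range []string{"a", "b", "c"} {
		b.add(e)
	}
	sent := b.flush(1, func(string) error { return nil })
	fmt.Println(sent, b.events) // 1 [c]
}
```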

I prefer (3), which is the easy way to solve the problem without any side effects on the existing behavior.

yguo0905 commented Jul 8, 2019

Posting the comments from @Random-Liu in our offline discussion.

I have no concern about option 3, and I think the initial delay is something we can support if it is needed in some use cases.

However, I feel like we should not use it to solve the apiserver initial connection problem, because the problem is not specific to any plugin. It is not conceptually correct to use a per-plugin config option to work around that problem, which is a hack to me. If possible, I prefer we solve the problem in the k8s exporter with either option 1 or 2.

As for the health monitor, if it needs the initial delay, we can add it as well, but that should not be used to solve the apiserver initial connection problem.

I will follow the advice and go with option (1).
