Do not write a resolvConf value in the global kubeletconfiguration, write it dynamically per node #3034

Closed
ilia1243 opened this issue Mar 4, 2024 · 6 comments · Fixed by kubernetes/kubernetes#124038
Labels: area/kubelet, help wanted, kind/feature, priority/backlog
Comments

@ilia1243

ilia1243 commented Mar 4, 2024

Is this a BUG REPORT or FEATURE REQUEST?

FEATURE REQUEST

Versions

kubeadm version (use kubeadm version): v1.29.1

Environment:

  • Kubernetes version (use kubectl version): v1.29.1
  • Cloud provider or hardware configuration: bare-metal
  • OS (e.g. from /etc/os-release): Ubuntu 22.04.1 LTS
  • Kernel (e.g. uname -a): 5.15.0-50-generic
  • Container runtime (CRI) (e.g. containerd, cri-o): containerd=1.6.12-0ubuntu1~22.04.3
  • Container networking plugin (CNI) (e.g. Calico, Cilium): calico
  • Others:

What happened?

If the first node is initialized with kubeadm init on Ubuntu 20.04 and a second node is added with kubeadm join on RHEL9, the join fails with "open /run/systemd/resolve/resolv.conf: no such file or directory" in the kubelet logs.

Workaround: use patches, or delete resolvConf from the kubelet-config ConfigMap before joining.
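
A minimal sketch of the second workaround, assuming the KubeletConfiguration has already been extracted from the kubelet-config ConfigMap as YAML text. The helper name `drop_resolv_conf` is hypothetical; it only filters text and does not talk to the cluster.

```python
def drop_resolv_conf(kubelet_yaml: str) -> str:
    """Return the YAML text without a top-level resolvConf entry."""
    kept = [
        line
        for line in kubelet_yaml.splitlines()
        # Top-level keys start in column 0, and resolvConf holds a single
        # scalar path, so dropping that one line is enough.
        if not line.startswith("resolvConf:")
    ]
    return "\n".join(kept) + "\n"


# Illustrative input only; a real KubeletConfiguration has more fields.
example = (
    "apiVersion: kubelet.config.k8s.io/v1beta1\n"
    "kind: KubeletConfiguration\n"
    "resolvConf: /run/systemd/resolve/resolv.conf\n"
    "cgroupDriver: systemd\n"
)
cleaned = drop_resolv_conf(example)
```

With the key removed, a joining node no longer inherits the path from the node that ran kubeadm init.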

What you expected to happen?

kubeadm init does not write a default resolvConf into the KubeletConfiguration stored in the kubelet-config ConfigMap. Instead, resolvConf is omitted from the ConfigMap, and the actual value in /var/lib/kubelet/config.yaml is calculated dynamically on each node, depending on whether the systemd-resolved service is active.

How to reproduce it (as minimally and precisely as possible)?

See What happened?.

Anything else we need to know?

@neolit123
Member

kubeadm init does not write default resolvConf in KubeletConfiguration. Instead, resolvConf is omitted, and real value in kubelet config.yaml is calculated dynamically depending on if systemd-resolved service is active.

this is intended.

kubeadm will only update the KubeletConfiguration.ResolverConfig field if the systemd-resolved service is active:
https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/componentconfigs/kubelet.go#L200-L213
/run/systemd/resolve/resolv.conf is a valid path if systemd-resolved is managing resolv.conf.

$ ls -l /run/systemd/resolve/resolv.conf
-rw-r--r-- 1 systemd-resolve systemd-resolve 786 Mar  4 15:24 /run/systemd/resolve/resolv.conf
$ systemctl status systemd-resolved | grep active
     Active: active (running) since Mon 2024-03-04 15:24:27 EET; 1min 17s ago

if the service is active but the file is missing, then that problem must be fixed.

if the service is not active the kubelet will default the field to /etc/resolv.conf:
https://github.com/kubernetes/kubelet/blob/master/config/v1beta1/types.go#L437-L442
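
The manual checks above can be sketched as a small per-node diagnostic. This is an illustration in Python, not the Go code linked above. The function names are hypothetical, and `systemctl is-active` is used here as an assumed stand-in for however kubeadm itself queries the service.

```python
import os
import subprocess

RESOLVED_PATH = "/run/systemd/resolve/resolv.conf"
DEFAULT_PATH = "/etc/resolv.conf"


def resolved_is_active() -> bool:
    """True if `systemctl is-active` reports systemd-resolved as active."""
    try:
        result = subprocess.run(
            ["systemctl", "is-active", "--quiet", "systemd-resolved"],
            check=False,
        )
    except FileNotFoundError:
        # systemctl was not found; treat the service as inactive.
        return False
    return result.returncode == 0


def expected_resolv_conf(active: bool, path_exists=os.path.exists) -> str:
    """Path the kubelet should end up with on this node.

    Raises if the service is active but its resolv.conf is missing,
    which is the case that has to be fixed on the node itself.
    """
    if not active:
        return DEFAULT_PATH
    if not path_exists(RESOLVED_PATH):
        raise FileNotFoundError(RESOLVED_PATH)
    return RESOLVED_PATH
```

On a node, `expected_resolv_conf(resolved_is_active())` gives the path to compare against resolvConf in /var/lib/kubelet/config.yaml.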

neolit123 added the priority/awaiting-more-evidence label Mar 4, 2024
neolit123 added this to the v1.30 milestone Mar 4, 2024
@ilia1243
Author

ilia1243 commented Mar 4, 2024

/run/systemd/resolve/resolv.conf is a valid path if systemd-resolved is managing resolv.conf

Please check the "What happened?" section. When different OSes are used, systemd-resolved is not managing resolv.conf on RHEL9, but the kubelet still tries to open /run/systemd/resolve/resolv.conf.

@neolit123
Member

again, if the service systemd-resolved is enabled the path passed to kubelet should be /run/systemd/resolve/resolv.conf.
if that is not correct on a certain distro, then it's a problem with systemd-resolved on that distro, i'd say.

@ilia1243
Author

ilia1243 commented Mar 4, 2024

In the mentioned case systemd-resolved is disabled for RHEL9. Let me rephrase the test case:

  1. Init first Kubernetes node on Ubuntu 20.04. systemd-resolved is active.

    Actual: Kubeadm writes resolvConf: /run/systemd/resolve/resolv.conf in both kubelet-config ConfigMap and in /var/lib/kubelet/config.yaml.

    Proposed: Kubeadm writes resolvConf: /run/systemd/resolve/resolv.conf only in /var/lib/kubelet/config.yaml, but omits the resolvConf property in kubelet-config ConfigMap.

  2. Join second Kubernetes node on RHEL9. systemd-resolved is inactive.

    Actual: Kubeadm writes resolvConf: /run/systemd/resolve/resolv.conf in /var/lib/kubelet/config.yaml using the kubelet-config ConfigMap and kubelet fails.

    Proposed: Since the property is absent in the ConfigMap at step 1, Kubeadm uses the default /etc/resolv.conf in /var/lib/kubelet/config.yaml.
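
The join step of this test case can be modelled in a few lines. This is a toy simulation under stated assumptions: the ConfigMap is reduced to a dict, the node's filesystem to a set of paths, and `kubelet_start` is a hypothetical stand-in for the kubelet opening the configured file.

```python
RESOLVED_PATH = "/run/systemd/resolve/resolv.conf"
DEFAULT_PATH = "/etc/resolv.conf"


def kubelet_start(config: dict, files_on_node: set) -> str:
    """Return the resolv.conf path the kubelet opens, or fail as in the kubelet logs."""
    path = config.get("resolvConf", DEFAULT_PATH)
    if path not in files_on_node:
        raise FileNotFoundError(f"open {path}: no such file or directory")
    return path


# Step 2 node: RHEL9 with systemd-resolved inactive, so only the default file exists.
rhel9_files = {DEFAULT_PATH}

# Actual: the ConfigMap written in step 1 carries the path from the Ubuntu node.
configmap_actual = {"resolvConf": RESOLVED_PATH}
try:
    kubelet_start(configmap_actual, rhel9_files)
    actual_result = "started"
except FileNotFoundError as err:
    actual_result = str(err)

# Proposed: the ConfigMap omits resolvConf, so the default applies.
configmap_proposed = {}
proposed_result = kubelet_start(configmap_proposed, rhel9_files)
```

In this model the actual case ends with the "no such file or directory" error, while the proposed case resolves to /etc/resolv.conf.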

ilia1243 changed the title from "Do not write default resolvConf in KubeletConfiguration, write default in kubelet config.yaml depending on systemd-resolved" to "Do not write default resolvConf in kubelet-config ConfigMap, write default in kubelet config.yaml depending on systemd-resolved" Mar 4, 2024
@neolit123
Member

ok, now i understand the problem. this was not clear in your description.

so first of all, most users run the same distro or distro family across a single cluster, so kubeadm's behavior is correct for them: on such clusters systemd-resolved is either enabled on every node or on none.
if some node does not work with the default kubeletconfiguration, then patches should be used. that is the correct solution.

what can be done to make kubeadm better here is to:

  1. don't write any defaults in the kubeletconfiguration about resolvConf
    (move this logic to 2)
    https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/componentconfigs/kubelet.go#L200-L213
  2. mutate the kubelet configuration for a given node after it's downloaded:
    https://github.com/kubernetes/kubernetes/blob/master/cmd/kubeadm/app/phases/kubelet/config.go#L49
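
The two steps above could look roughly like the following. This is a Python sketch of the idea, not the Go change itself. `mutate_for_node` is a hypothetical name, and leaving an already present resolvConf untouched is an assumption that the thread does not discuss.

```python
from typing import Any, Dict

RESOLVED_PATH = "/run/systemd/resolve/resolv.conf"


def mutate_for_node(downloaded: Dict[str, Any], resolved_active: bool) -> Dict[str, Any]:
    """Return a per-node copy of the downloaded KubeletConfiguration."""
    local = dict(downloaded)
    # Assumption: a value that is already present was set on purpose,
    # so it is left alone.
    if "resolvConf" not in local and resolved_active:
        local["resolvConf"] = RESOLVED_PATH
    return local


# Step 1: the global configuration carries no resolvConf default.
global_config = {"kind": "KubeletConfiguration"}

# Step 2: each node decides for itself after downloading the configuration.
ubuntu_config = mutate_for_node(global_config, resolved_active=True)
rhel9_config = mutate_for_node(global_config, resolved_active=False)
```

The node with systemd-resolved active gets the systemd path in its local file, and the other node keeps the field unset so the kubelet default of /etc/resolv.conf applies.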

@neolit123
Member

neolit123 commented Mar 4, 2024

we are close to code freeze for 1.30. this can be changed for 1.31, but i don't think it should be backported.
we also need to understand if it's going to break existing users in some way.

PRs welcome, explained above:
#3034 (comment)

neolit123 added the priority/backlog, kind/feature, and area/kubelet labels and removed the priority/awaiting-more-evidence label Mar 4, 2024
neolit123 modified the milestones: v1.30, v1.31 Mar 4, 2024
neolit123 added the help wanted label Mar 4, 2024
neolit123 changed the title from "Do not write default resolvConf in kubelet-config ConfigMap, write default in kubelet config.yaml depending on systemd-resolved" to "Do not write a resolvConf value in the global kubeletconfiguration, write it dynamically per node" Mar 4, 2024