
CNI Plugin Not Initialized on some Bottlerocket nodes #2365

Closed
shay-ul opened this issue Apr 30, 2023 · 6 comments

shay-ul commented Apr 30, 2023

What happened:
We are running Bottlerocket nodes on EKS 1.25 with Karpenter and some very basic user data:

[settings]
[settings.kubernetes]
allowed-unsafe-sysctls = ["net.core.somaxconn"]
registry-qps = 20

The cluster is configured for Custom Networking and prefix-delegation.

Occasionally (not too often) a node gets stuck NotReady. While exploring the journal on the node, we see the usual error indicating that something is not working with the VPC CNI:

"Error syncing pod, skipping" err="network is not ready: container runtime network notready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized"

We do not have an IP address shortage on our subnets.
The interesting thing is that the /var/log/aws-routed-eni directory is missing, as is the support script that should be located at /opt/cni/bin/aws-cni-support.sh.

I have read that for the CNI plugin (IPAMD) to initialize, the aws-node daemonset pod must spin up on the node first.
The aws-node pod that should be scheduled to this node is stuck Pending and has the following events:

  Type     Reason            Age                    From               Message
  ----     ------            ----                   ----               -------
  Warning  FailedScheduling  59m (x9 over 62m)      default-scheduler  0/65 nodes are available: 1 Insufficient memory.
  Warning  FailedScheduling  54m (x47 over 82m)     default-scheduler  0/74 nodes are available: 1 Insufficient memory.
  Warning  FailedScheduling  49m (x7 over 50m)      default-scheduler  0/80 nodes are available: 1 Insufficient memory.
  Warning  FailedScheduling  44m (x43 over 80m)     default-scheduler  0/79 nodes are available: 1 Insufficient memory.
  Warning  FailedScheduling  34m (x36 over 41m)     default-scheduler  0/83 nodes are available: 1 Insufficient memory.
  Warning  FailedScheduling  24m (x33 over 31m)     default-scheduler  0/78 nodes are available: 1 Insufficient memory.
  Warning  FailedScheduling  19m (x5 over 19m)      default-scheduler  0/76 nodes are available: 1 Insufficient memory.
  Warning  FailedScheduling  14m (x5 over 14m)      default-scheduler  0/72 nodes are available: 1 Insufficient memory.
  Warning  FailedScheduling  9m14s (x9 over 10m)    default-scheduler  0/69 nodes are available: 1 Insufficient memory.
  Warning  FailedScheduling  4m6s (x11 over 8m52s)  default-scheduler  0/68 nodes are available: 1 Insufficient memory.

I cannot wrap my head around why "Insufficient memory" is even a thing, since we're speaking of a c5a.2xlarge node.
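A quick sanity check (a sketch; <node-name> is a placeholder for the NotReady node) is to compare the node's allocatable memory against what the scheduler already accounts for on it, and to check which requests the aws-node daemonset declares:

# Allocatable capacity vs. already-requested resources on the node
kubectl describe node <node-name> | grep -A 10 "Allocated resources"

# Resource requests declared by the aws-node daemonset's containers
kubectl get ds aws-node -n kube-system \
  -o jsonpath='{.spec.template.spec.containers[*].resources}'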

Attach logs

What you expected to happen:

A node should spin up Ready, the /var/log/aws-routed-eni directory with ipamd logs should exist, and /opt/cni/bin/aws-cni-support.sh should also exist.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:
EKS 1.25

  • Kubernetes version (use kubectl version):
    Major:"1", Minor:"25+", GitVersion:"v1.25.6-eks-48e63af"
  • CNI Version
    v1.12.6-eksbuild.1
  • OS (e.g: cat /etc/os-release):
    Bottlerocket OS 1.13.4 (aws-k8s-1.25)
  • Kernel (e.g. uname -a):
    5.15.102
shay-ul added the bug label Apr 30, 2023

jdn5126 commented May 1, 2023

@shay-ul it is the aws-node pod that creates /var/log/aws-routed-eni and installs /opt/cni/bin/aws-cni-support.sh on initialization, so they will not be present if the aws-node pod cannot run. Do you have a memory request specified for the aws-node pod? If memory requests are specified and there is not enough memory available for a daemonset pod, then other pods should be evicted to make room. Do you have other pods on this node, and are they stuck in "Terminating"?

When you describe the node, do you see memory consumption at 100%?
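A rough sketch of commands for checking both (the node name is a placeholder):

# Pods on the node, including any stuck in Terminating
kubectl get pods --all-namespaces --field-selector spec.nodeName=<node-name> -o wide

# Current memory consumption on the node (requires metrics-server)
kubectl top node <node-name>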


shay-ul commented May 3, 2023

Thanks for the feedback. Unfortunately (or not?) I'm still waiting for this to happen again; for some reason all nodes have worked perfectly for the past couple of days.


jdn5126 commented May 3, 2023

Gotcha, so I would check /var/log/messages to see if there are any kernel logs related to memory issues. Other than that, you can install the EKS log collector script manually from https://github.com/awslabs/amazon-eks-ami/blob/master/log-collector-script/linux/eks-log-collector.sh, though it is probably already there if the nodes are running fine now. The relevant logs may have rolled over by now, but if this happens again, definitely try to run the log collector script as soon as possible.
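If useful, a minimal sketch of fetching and running the collector on the node (assuming the standard raw.githubusercontent.com path for that file; run as root):

# Download and run the EKS log collector on the affected node
curl -O https://raw.githubusercontent.com/awslabs/amazon-eks-ami/master/log-collector-script/linux/eks-log-collector.sh
sudo bash eks-log-collector.sh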


shay-ul commented May 4, 2023

@jdn5126 We're speaking about nodes that are provisioned with Karpenter and never reach a "Ready" state. Most of the nodes do not have this issue, but every once in a while a newly provisioned node gets stuck NotReady.
This happened again this morning, and this is the kubectl describe output of the NotReady node:

kubectl describe node
Name:               ip-192-168-1-160.us-east-2.compute.internal
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=c5a.2xlarge
                    beta.kubernetes.io/os=linux
                    default-provisioner-node=true
                    failure-domain.beta.kubernetes.io/region=us-east-2
                    failure-domain.beta.kubernetes.io/zone=us-east-2a
                    k8s.io/cloud-provider-aws=098bf2415525a68dfcc5f51ea16cedb4
                    karpenter.k8s.aws/instance-ami-id=ami-0ef1b4053f12bcee9
                    karpenter.k8s.aws/instance-category=c
                    karpenter.k8s.aws/instance-cpu=8
                    karpenter.k8s.aws/instance-encryption-in-transit-supported=true
                    karpenter.k8s.aws/instance-family=c5a
                    karpenter.k8s.aws/instance-generation=5
                    karpenter.k8s.aws/instance-hypervisor=nitro
                    karpenter.k8s.aws/instance-memory=16384
                    karpenter.k8s.aws/instance-network-bandwidth=2500
                    karpenter.k8s.aws/instance-pods=110
                    karpenter.k8s.aws/instance-size=2xlarge
                    karpenter.sh/capacity-type=on-demand
                    karpenter.sh/provisioner-name=default-provisioner
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=ip-192-168-1-160.us-east-2.compute.internal
                    kubernetes.io/os=linux
                    node.kubernetes.io/instance-type=c5a.2xlarge
                    topology.kubernetes.io/region=us-east-2
                    topology.kubernetes.io/zone=us-east-2a
Annotations:        alpha.kubernetes.io/provided-node-ip: 192.168.1.160
                    csi.volume.kubernetes.io/nodeid: {"efs.csi.aws.com":"i-0847a1a51766838ac"}
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Wed, 03 May 2023 18:00:49 +0300
Taints:             node.kubernetes.io/not-ready:NoExecute
                    default-provisioner-node:NoSchedule
                    node.kubernetes.io/not-ready:NoSchedule
Unschedulable:      false
Lease:
  HolderIdentity:  ip-192-168-1-160.us-east-2.compute.internal
  AcquireTime:     <unset>
  RenewTime:       Thu, 04 May 2023 09:36:12 +0300
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  Ready            False   Thu, 04 May 2023 09:35:07 +0300   Wed, 03 May 2023 18:01:03 +0300   KubeletNotReady              container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized
  MemoryPressure   False   Thu, 04 May 2023 09:35:07 +0300   Wed, 03 May 2023 18:01:03 +0300   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Thu, 04 May 2023 09:35:07 +0300   Wed, 03 May 2023 18:01:03 +0300   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Thu, 04 May 2023 09:35:07 +0300   Wed, 03 May 2023 18:01:03 +0300   KubeletHasSufficientPID      kubelet has sufficient PID available
Addresses:
  InternalIP:   192.168.1.160
  Hostname:     ip-192-168-1-160.us-east-2.compute.internal
  InternalDNS:  ip-192-168-1-160.us-east-2.compute.internal
Capacity:
  attachable-volumes-aws-ebs:  25
  cpu:                         8
  ephemeral-storage:           61904460Ki
  hugepages-1Gi:               0
  hugepages-2Mi:               0
  memory:                      16173012Ki
  pods:                        110
Allocatable:
  attachable-volumes-aws-ebs:  25
  cpu:                         7910m
  ephemeral-storage:           55977408418
  hugepages-1Gi:               0
  hugepages-2Mi:               0
  memory:                      14570452Ki
  pods:                        110
System Info:
  Machine ID:                 ec2b8855586ec98db3f88d9a1761571d
  System UUID:                ec2b8855-586e-c98d-b3f8-8d9a1761571d
  Boot ID:                    15d9d4d5-75ce-4214-a4ab-b6d673674277
  Kernel Version:             5.15.102
  OS Image:                   Bottlerocket OS 1.13.5 (aws-k8s-1.25)
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  containerd://1.6.19+bottlerocket
  Kubelet Version:            v1.25.6-eks-232056e
  Kube-Proxy Version:         v1.25.6-eks-232056e
ProviderID:                   aws:///us-east-2a/i-0847a1a51766838ac
Non-terminated Pods:          (2 in total)
  Namespace                   Name                      CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                   ----                      ------------  ----------  ---------------  -------------  ---
  datadog                     prod-datadog-pg8db    0 (0%)        0 (0%)      0 (0%)           0 (0%)         15h
  kube-system                 efs-csi-node-7g66f        0 (0%)        0 (0%)      0 (0%)           0 (0%)         15h
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                    Requests  Limits
  --------                    --------  ------
  cpu                         0 (0%)    0 (0%)
  memory                      0 (0%)    0 (0%)
  ephemeral-storage           0 (0%)    0 (0%)
  hugepages-1Gi               0 (0%)    0 (0%)
  hugepages-2Mi               0 (0%)    0 (0%)
  attachable-volumes-aws-ebs  0         0
Events:
  Type    Reason                 Age                    From       Message
  ----    ------                 ----                   ----       -------
  Normal  DeprovisioningBlocked  42s (x162 over 5h57m)  karpenter  can't deprovision node due to NotInitialized

This is the output of kubectl describe for the aws-node pod:

kubectl describe pod aws-node
Name:                 aws-node-szbjq
Namespace:            kube-system
Priority:             2000001000
Priority Class Name:  system-node-critical
Service Account:      aws-node
Node:                 <none>
Labels:               app.kubernetes.io/instance=aws-vpc-cni
                      app.kubernetes.io/name=aws-node
                      controller-revision-hash=57d54b68b5
                      k8s-app=aws-node
                      pod-template-generation=11
Annotations:          <none>
Status:               Pending
IP:                   
IPs:                  <none>
Controlled By:        DaemonSet/aws-node
NominatedNodeName:    ip-192-168-1-160.us-east-2.compute.internal
Init Containers:
  aws-vpc-cni-init:
    Image:      602401143452.dkr.ecr.us-east-2.amazonaws.com/amazon-k8s-cni-init:v1.12.6-eksbuild.1
    Port:       <none>
    Host Port:  <none>
    Environment:
      DISABLE_TCP_EARLY_DEMUX:             false
      ENABLE_IPv6:                         false
      AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG:  true
      ENABLE_PREFIX_DELEGATION:            true
      WARM_PREFIX_TARGET:                  1
      WARM_IP_TARGET:                      5
      MINIMUM_IP_TARGET:                   2
      ENI_CONFIG_LABEL_DEF:                failure-domain.beta.kubernetes.io/zone
      AWS_STS_REGIONAL_ENDPOINTS:          regional
      AWS_DEFAULT_REGION:                  us-east-2
      AWS_REGION:                          us-east-2
      AWS_ROLE_ARN:                        arn:aws:iam::<account_id>:role/AmazonEKSVPCCNIRole
      AWS_WEB_IDENTITY_TOKEN_FILE:         /var/run/secrets/eks.amazonaws.com/serviceaccount/token
    Mounts:
      /host/opt/cni/bin from cni-bin-dir (rw)
      /var/run/secrets/eks.amazonaws.com/serviceaccount from aws-iam-token (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-w6h5l (ro)
Containers:
  aws-node:
    Image:      602401143452.dkr.ecr.us-east-2.amazonaws.com/amazon-k8s-cni:v1.12.6-eksbuild.1
    Port:       61678/TCP
    Host Port:  61678/TCP
    Requests:
      cpu:      25m
    Liveness:   exec [/app/grpc-health-probe -addr=:50051 -connect-timeout=5s -rpc-timeout=5s] delay=60s timeout=10s period=10s #success=1 #failure=3
    Readiness:  exec [/app/grpc-health-probe -addr=:50051 -connect-timeout=5s -rpc-timeout=5s] delay=1s timeout=10s period=10s #success=1 #failure=3
    Environment:
      ADDITIONAL_ENI_TAGS:                    {}
      ANNOTATE_POD_IP:                        false
      AWS_VPC_CNI_NODE_PORT_SUPPORT:          true
      AWS_VPC_ENI_MTU:                        9001
      AWS_VPC_K8S_CNI_CONFIGURE_RPFILTER:     false
      AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG:     true
      AWS_VPC_K8S_CNI_EXTERNALSNAT:           false
      AWS_VPC_K8S_CNI_LOGLEVEL:               DEBUG
      AWS_VPC_K8S_CNI_LOG_FILE:               /host/var/log/aws-routed-eni/ipamd.log
      AWS_VPC_K8S_CNI_RANDOMIZESNAT:          prng
      AWS_VPC_K8S_CNI_VETHPREFIX:             eni
      AWS_VPC_K8S_PLUGIN_LOG_FILE:            /var/log/aws-routed-eni/plugin.log
      AWS_VPC_K8S_PLUGIN_LOG_LEVEL:           DEBUG
      CLUSTER_ENDPOINT:                       https://<endpoint>
      CLUSTER_NAME:                           prod
      DISABLE_INTROSPECTION:                  false
      DISABLE_METRICS:                        false
      DISABLE_NETWORK_RESOURCE_PROVISIONING:  false
      ENABLE_IPv4:                            true
      ENABLE_IPv6:                            false
      ENABLE_POD_ENI:                         false
      ENABLE_PREFIX_DELEGATION:               true
      VPC_ID:                                 <vpc>
      WARM_ENI_TARGET:                        1
      WARM_PREFIX_TARGET:                     1
      WARM_IP_TARGET:                         5
      MINIMUM_IP_TARGET:                      2
      ENI_CONFIG_LABEL_DEF:                   failure-domain.beta.kubernetes.io/zone
      MY_NODE_NAME:                            (v1:spec.nodeName)
      MY_POD_NAME:                            aws-node-szbjq (v1:metadata.name)
      AWS_STS_REGIONAL_ENDPOINTS:             regional
      AWS_DEFAULT_REGION:                     us-east-2
      AWS_REGION:                             us-east-2
      AWS_ROLE_ARN:                           arn:aws:iam::<aws_account>:role/AmazonEKSVPCCNIRole
      AWS_WEB_IDENTITY_TOKEN_FILE:            /var/run/secrets/eks.amazonaws.com/serviceaccount/token
    Mounts:
      /host/etc/cni/net.d from cni-net-dir (rw)
      /host/opt/cni/bin from cni-bin-dir (rw)
      /host/var/log/aws-routed-eni from log-dir (rw)
      /run/xtables.lock from xtables-lock (rw)
      /var/run/aws-node from run-dir (rw)
      /var/run/dockershim.sock from dockershim (rw)
      /var/run/secrets/eks.amazonaws.com/serviceaccount from aws-iam-token (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-w6h5l (ro)
Conditions:
  Type           Status
  PodScheduled   False 
Volumes:
  aws-iam-token:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  86400
  cni-bin-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /opt/cni/bin
    HostPathType:  
  cni-net-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/cni/net.d
    HostPathType:  
  dockershim:
    Type:          HostPath (bare host directory volume)
    Path:          /var/run/dockershim.sock
    HostPathType:  
  log-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/log/aws-routed-eni
    HostPathType:  DirectoryOrCreate
  run-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/run/aws-node
    HostPathType:  DirectoryOrCreate
  xtables-lock:
    Type:          HostPath (bare host directory volume)
    Path:          /run/xtables.lock
    HostPathType:  
  kube-api-access-w6h5l:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 op=Exists
                             node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason            Age                   From               Message
  ----     ------            ----                  ----               -------
  Warning  FailedScheduling  51m (x27 over 56m)    default-scheduler  0/76 nodes are available: 1 Insufficient memory.
  Warning  FailedScheduling  45m (x2 over 49m)     default-scheduler  0/75 nodes are available: 1 Insufficient memory.
  Warning  FailedScheduling  40m (x12 over 43m)    default-scheduler  0/61 nodes are available: 1 Insufficient memory.
  Warning  FailedScheduling  35m (x15 over 38m)    default-scheduler  0/39 nodes are available: 1 Insufficient memory.
  Warning  FailedScheduling  11m (x96 over 32m)    default-scheduler  0/20 nodes are available: 1 Insufficient memory.
  Warning  FailedScheduling  55s (x29 over 7m35s)  default-scheduler  0/17 nodes are available: 1 Insufficient memory.

As you can see, this situation is very strange: the node reports no allocated requests, so why is the default-scheduler declaring insufficient memory?
Running free -h on the node returns:

               total        used        free      shared  buff/cache   available
Mem:            15Gi       416Mi        12Gi       1.0Mi       2.9Gi        14Gi

Running journalctl -u kubelet returns many of these log lines:

May 04 06:53:19 ip-192-168-1-160.us-east-2.compute.internal kubelet[1586]: E0504 06:53:19.163785    1586 pod_workers.go:965] "Error syncing pod, skipping" err="network is not ready: container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized" pod="datadog/prod-datadog-pg8db" podUID=d8960bfb-465f-4b05-8c16-3f80b5bba381
May 04 06:53:21 ip-192-168-1-160.us-east-2.compute.internal kubelet[1586]: E0504 06:53:21.163896    1586 pod_workers.go:965] "Error syncing pod, skipping" err="network is not ready: container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized" pod="datadog/prod-datadog-pg8db" podUID=d8960bfb-465f-4b05-8c16-3f80b5bba381

There is no memory request for the aws-node daemonset, only a CPU request. I manually added a memory request for the pod, but the pod is still stuck Pending; a sketch of that change is below.
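For reference, a memory request can be added to the daemonset with something along these lines (the 64Mi value is an arbitrary example, not necessarily the value used):

# Add a memory request to the aws-node container of the daemonset
kubectl set resources daemonset aws-node -n kube-system \
  --containers=aws-node --requests=memory=64Mi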
The kube-proxy daemonset pod on the node is in the same state, which leads me to believe this is not a vpc-cni related problem.

Update:
I created a new issue for Bottlerocket - bottlerocket-os/bottlerocket#3076
I'm closing this since I don't believe this is a vpc-cni related issue.

Thanks!

shay-ul closed this as completed May 4, 2023

github-actions bot commented May 4, 2023

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.


jdn5126 commented May 4, 2023

@shay-ul Got it, I subscribed to bottlerocket-os/bottlerocket#3076 so that I can follow along. One note: it looks like your aws-node pod is still mounting the dockershim socket, /var/run/dockershim.sock. This was a known issue we had in the past when upgrading to v1.12.0+ (as opposed to a fresh install). It does not hurt anything, but you can manually remove that mount from the daemonset with kubectl edit ds aws-node -n kube-system.
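A sketch of that cleanup (the entry names below match the volumes shown in the pod describe above):

kubectl edit ds aws-node -n kube-system
# In the editor, remove the "dockershim" entry under spec.template.spec.volumes
# and the matching /var/run/dockershim.sock entry under the aws-node container's
# volumeMounts, then save; the daemonset will roll out the change.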
