Some Bottlerocket nodes stuck "NotReady" #3076

Closed
shay-ul opened this issue May 4, 2023 · 4 comments
Labels
status/research (This issue is being researched), type/bug (Something isn't working)

Comments


shay-ul commented May 4, 2023

Image I'm using:
v1.25.6-eks-232056e

What I expected to happen:
During peak hours in our environment, Karpenter provisions many Bottlerocket nodes and deletes them when the workload scales down.
Most of the Bottlerocket nodes reach the "Ready" state quickly. However, every once in a while, a newly provisioned node gets stuck in "NotReady".

What actually happened:
When describing the node, we can see that the main issue is container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized.
Here is the full description of such node:

kubectl describe node
Name:               ip-192-168-1-160.us-east-2.compute.internal
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=c5a.2xlarge
                    beta.kubernetes.io/os=linux
                    default-provisioner-node=true
                    failure-domain.beta.kubernetes.io/region=us-east-2
                    failure-domain.beta.kubernetes.io/zone=us-east-2a
                    k8s.io/cloud-provider-aws=098bf2415525a68dfcc5f51ea16cedb4
                    karpenter.k8s.aws/instance-ami-id=ami-0ef1b4053f12bcee9
                    karpenter.k8s.aws/instance-category=c
                    karpenter.k8s.aws/instance-cpu=8
                    karpenter.k8s.aws/instance-encryption-in-transit-supported=true
                    karpenter.k8s.aws/instance-family=c5a
                    karpenter.k8s.aws/instance-generation=5
                    karpenter.k8s.aws/instance-hypervisor=nitro
                    karpenter.k8s.aws/instance-memory=16384
                    karpenter.k8s.aws/instance-network-bandwidth=2500
                    karpenter.k8s.aws/instance-pods=110
                    karpenter.k8s.aws/instance-size=2xlarge
                    karpenter.sh/capacity-type=on-demand
                    karpenter.sh/provisioner-name=default-provisioner
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=ip-192-168-1-160.us-east-2.compute.internal
                    kubernetes.io/os=linux
                    node.kubernetes.io/instance-type=c5a.2xlarge
                    topology.kubernetes.io/region=us-east-2
                    topology.kubernetes.io/zone=us-east-2a
Annotations:        alpha.kubernetes.io/provided-node-ip: 192.168.1.160
                    csi.volume.kubernetes.io/nodeid: {"efs.csi.aws.com":"i-0847a1a51766838ac"}
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Wed, 03 May 2023 18:00:49 +0300
Taints:             node.kubernetes.io/not-ready:NoExecute
                    default-provisioner-node:NoSchedule
                    node.kubernetes.io/not-ready:NoSchedule
Unschedulable:      false
Lease:
  HolderIdentity:  ip-192-168-1-160.us-east-2.compute.internal
  AcquireTime:     <unset>
  RenewTime:       Thu, 04 May 2023 09:36:12 +0300
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  Ready            False   Thu, 04 May 2023 09:35:07 +0300   Wed, 03 May 2023 18:01:03 +0300   KubeletNotReady              container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized
  MemoryPressure   False   Thu, 04 May 2023 09:35:07 +0300   Wed, 03 May 2023 18:01:03 +0300   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Thu, 04 May 2023 09:35:07 +0300   Wed, 03 May 2023 18:01:03 +0300   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Thu, 04 May 2023 09:35:07 +0300   Wed, 03 May 2023 18:01:03 +0300   KubeletHasSufficientPID      kubelet has sufficient PID available
Addresses:
  InternalIP:   192.168.1.160
  Hostname:     ip-192-168-1-160.us-east-2.compute.internal
  InternalDNS:  ip-192-168-1-160.us-east-2.compute.internal
Capacity:
  attachable-volumes-aws-ebs:  25
  cpu:                         8
  ephemeral-storage:           61904460Ki
  hugepages-1Gi:               0
  hugepages-2Mi:               0
  memory:                      16173012Ki
  pods:                        110
Allocatable:
  attachable-volumes-aws-ebs:  25
  cpu:                         7910m
  ephemeral-storage:           55977408418
  hugepages-1Gi:               0
  hugepages-2Mi:               0
  memory:                      14570452Ki
  pods:                        110
System Info:
  Machine ID:                 ec2b8855586ec98db3f88d9a1761571d
  System UUID:                ec2b8855-586e-c98d-b3f8-8d9a1761571d
  Boot ID:                    15d9d4d5-75ce-4214-a4ab-b6d673674277
  Kernel Version:             5.15.102
  OS Image:                   Bottlerocket OS 1.13.5 (aws-k8s-1.25)
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  containerd://1.6.19+bottlerocket
  Kubelet Version:            v1.25.6-eks-232056e
  Kube-Proxy Version:         v1.25.6-eks-232056e
ProviderID:                   aws:///us-east-2a/i-0847a1a51766838ac
Non-terminated Pods:          (2 in total)
  Namespace                   Name                      CPU Requests  CPU Limits  Memory Requests  Memory Limits  Age
  ---------                   ----                      ------------  ----------  ---------------  -------------  ---
  datadog                     prod-datadog-pg8db    0 (0%)        0 (0%)      0 (0%)           0 (0%)         15h
  kube-system                 efs-csi-node-7g66f        0 (0%)        0 (0%)      0 (0%)           0 (0%)         15h
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                    Requests  Limits
  --------                    --------  ------
  cpu                         0 (0%)    0 (0%)
  memory                      0 (0%)    0 (0%)
  ephemeral-storage           0 (0%)    0 (0%)
  hugepages-1Gi               0 (0%)    0 (0%)
  hugepages-2Mi               0 (0%)    0 (0%)
  attachable-volumes-aws-ebs  0         0
Events:
  Type    Reason                 Age                    From       Message
  ----    ------                 ----                   ----       -------
  Normal  DeprovisioningBlocked  42s (x162 over 5h57m)  karpenter  can't deprovision node due to NotInitialized

When exploring further, I found that the pod that should initialize the VPC CNI plugin (the aws-node pod) is stuck in "Pending" due to "1 Insufficient memory":

kubectl describe pod aws-node
Name:                 aws-node-szbjq
Namespace:            kube-system
Priority:             2000001000
Priority Class Name:  system-node-critical
Service Account:      aws-node
Node:                 <none>
Labels:               app.kubernetes.io/instance=aws-vpc-cni
                      app.kubernetes.io/name=aws-node
                      controller-revision-hash=57d54b68b5
                      k8s-app=aws-node
                      pod-template-generation=11
Annotations:          <none>
Status:               Pending
IP:                   
IPs:                  <none>
Controlled By:        DaemonSet/aws-node
NominatedNodeName:    ip-192-168-1-160.us-east-2.compute.internal
Init Containers:
  aws-vpc-cni-init:
    Image:      602401143452.dkr.ecr.us-east-2.amazonaws.com/amazon-k8s-cni-init:v1.12.6-eksbuild.1
    Port:       <none>
    Host Port:  <none>
    Environment:
      DISABLE_TCP_EARLY_DEMUX:             false
      ENABLE_IPv6:                         false
      AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG:  true
      ENABLE_PREFIX_DELEGATION:            true
      WARM_PREFIX_TARGET:                  1
      WARM_IP_TARGET:                      5
      MINIMUM_IP_TARGET:                   2
      ENI_CONFIG_LABEL_DEF:                failure-domain.beta.kubernetes.io/zone
      AWS_STS_REGIONAL_ENDPOINTS:          regional
      AWS_DEFAULT_REGION:                  us-east-2
      AWS_REGION:                          us-east-2
      AWS_ROLE_ARN:                        arn:aws:iam::<account_id>:role/AmazonEKSVPCCNIRole
      AWS_WEB_IDENTITY_TOKEN_FILE:         /var/run/secrets/eks.amazonaws.com/serviceaccount/token
    Mounts:
      /host/opt/cni/bin from cni-bin-dir (rw)
      /var/run/secrets/eks.amazonaws.com/serviceaccount from aws-iam-token (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-w6h5l (ro)
Containers:
  aws-node:
    Image:      602401143452.dkr.ecr.us-east-2.amazonaws.com/amazon-k8s-cni:v1.12.6-eksbuild.1
    Port:       61678/TCP
    Host Port:  61678/TCP
    Requests:
      cpu:      25m
    Liveness:   exec [/app/grpc-health-probe -addr=:50051 -connect-timeout=5s -rpc-timeout=5s] delay=60s timeout=10s period=10s #success=1 #failure=3
    Readiness:  exec [/app/grpc-health-probe -addr=:50051 -connect-timeout=5s -rpc-timeout=5s] delay=1s timeout=10s period=10s #success=1 #failure=3
    Environment:
      ADDITIONAL_ENI_TAGS:                    {}
      ANNOTATE_POD_IP:                        false
      AWS_VPC_CNI_NODE_PORT_SUPPORT:          true
      AWS_VPC_ENI_MTU:                        9001
      AWS_VPC_K8S_CNI_CONFIGURE_RPFILTER:     false
      AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG:     true
      AWS_VPC_K8S_CNI_EXTERNALSNAT:           false
      AWS_VPC_K8S_CNI_LOGLEVEL:               DEBUG
      AWS_VPC_K8S_CNI_LOG_FILE:               /host/var/log/aws-routed-eni/ipamd.log
      AWS_VPC_K8S_CNI_RANDOMIZESNAT:          prng
      AWS_VPC_K8S_CNI_VETHPREFIX:             eni
      AWS_VPC_K8S_PLUGIN_LOG_FILE:            /var/log/aws-routed-eni/plugin.log
      AWS_VPC_K8S_PLUGIN_LOG_LEVEL:           DEBUG
      CLUSTER_ENDPOINT:                       https://<endpoint>
      CLUSTER_NAME:                           prod
      DISABLE_INTROSPECTION:                  false
      DISABLE_METRICS:                        false
      DISABLE_NETWORK_RESOURCE_PROVISIONING:  false
      ENABLE_IPv4:                            true
      ENABLE_IPv6:                            false
      ENABLE_POD_ENI:                         false
      ENABLE_PREFIX_DELEGATION:               true
      VPC_ID:                                 <vpc>
      WARM_ENI_TARGET:                        1
      WARM_PREFIX_TARGET:                     1
      WARM_IP_TARGET:                         5
      MINIMUM_IP_TARGET:                      2
      ENI_CONFIG_LABEL_DEF:                   failure-domain.beta.kubernetes.io/zone
      MY_NODE_NAME:                            (v1:spec.nodeName)
      MY_POD_NAME:                            aws-node-szbjq (v1:metadata.name)
      AWS_STS_REGIONAL_ENDPOINTS:             regional
      AWS_DEFAULT_REGION:                     us-east-2
      AWS_REGION:                             us-east-2
      AWS_ROLE_ARN:                           arn:aws:iam::<aws_account>:role/AmazonEKSVPCCNIRole
      AWS_WEB_IDENTITY_TOKEN_FILE:            /var/run/secrets/eks.amazonaws.com/serviceaccount/token
    Mounts:
      /host/etc/cni/net.d from cni-net-dir (rw)
      /host/opt/cni/bin from cni-bin-dir (rw)
      /host/var/log/aws-routed-eni from log-dir (rw)
      /run/xtables.lock from xtables-lock (rw)
      /var/run/aws-node from run-dir (rw)
      /var/run/dockershim.sock from dockershim (rw)
      /var/run/secrets/eks.amazonaws.com/serviceaccount from aws-iam-token (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-w6h5l (ro)
Conditions:
  Type           Status
  PodScheduled   False 
Volumes:
  aws-iam-token:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  86400
  cni-bin-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /opt/cni/bin
    HostPathType:  
  cni-net-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/cni/net.d
    HostPathType:  
  dockershim:
    Type:          HostPath (bare host directory volume)
    Path:          /var/run/dockershim.sock
    HostPathType:  
  log-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/log/aws-routed-eni
    HostPathType:  DirectoryOrCreate
  run-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /var/run/aws-node
    HostPathType:  DirectoryOrCreate
  xtables-lock:
    Type:          HostPath (bare host directory volume)
    Path:          /run/xtables.lock
    HostPathType:  
  kube-api-access-w6h5l:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 op=Exists
                             node.kubernetes.io/disk-pressure:NoSchedule op=Exists
                             node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                             node.kubernetes.io/network-unavailable:NoSchedule op=Exists
                             node.kubernetes.io/not-ready:NoExecute op=Exists
                             node.kubernetes.io/pid-pressure:NoSchedule op=Exists
                             node.kubernetes.io/unreachable:NoExecute op=Exists
                             node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
  Type     Reason            Age                   From               Message
  ----     ------            ----                  ----               -------
  Warning  FailedScheduling  51m (x27 over 56m)    default-scheduler  0/76 nodes are available: 1 Insufficient memory.
  Warning  FailedScheduling  45m (x2 over 49m)     default-scheduler  0/75 nodes are available: 1 Insufficient memory.
  Warning  FailedScheduling  40m (x12 over 43m)    default-scheduler  0/61 nodes are available: 1 Insufficient memory.
  Warning  FailedScheduling  35m (x15 over 38m)    default-scheduler  0/39 nodes are available: 1 Insufficient memory.
  Warning  FailedScheduling  11m (x96 over 32m)    default-scheduler  0/20 nodes are available: 1 Insufficient memory.
  Warning  FailedScheduling  55s (x29 over 7m35s)  default-scheduler  0/17 nodes are available: 1 Insufficient memory.

As you can see, it's hard to figure out why the default-scheduler reports "Insufficient memory", since there is no indication of such an issue in the node's description.
It's important to note that more daemonset pods are stuck in the same state (kube-proxy and ebs-csi-node). However, efs-csi-node and the Datadog agent appear to be in the "Running" state.
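
One way to dig into the "Insufficient memory" report (a diagnostic sketch, not something run in this thread; the node name comes from the describe output above) is to compare the node's allocatable memory against the memory requests of the pods already bound to it:

NODE=ip-192-168-1-160.us-east-2.compute.internal
# Allocatable memory as the scheduler sees it
kubectl get node "$NODE" -o jsonpath='{.status.allocatable.memory}{"\n"}'
# Memory requests of every pod already assigned to the node
kubectl get pods --all-namespaces --field-selector spec.nodeName="$NODE" \
  -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name}{"\t"}{.spec.containers[*].resources.requests.memory}{"\n"}{end}'

If the summed requests don't come anywhere near the allocatable figure, the scheduler's view of the node may differ from what kubectl describe node shows.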

How to reproduce the problem:
No idea.
Can you please help me figure out which logs or metrics I should look for?
Thanks!

zmrow (Contributor) commented May 4, 2023

Thanks for the report, @shay-ul! That's an odd failure case.

Is there anything different about the nodes that get stuck in "NotReady"? Are all of the nodes being launched the same instance type? What instance type are they?

Also, if you're able to get onto any of the nodes, it would be helpful to jump on and look for any offending messages in the journal. /var/log/aws-routed-eni also contains logs for ipamd.

Just in case you hadn't seen it, the amazon-vpc-cni-k8s page has troubleshooting docs as well.
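
For reference, one way to reach the journal on a Bottlerocket node (a sketch assuming the admin host container is enabled and the node is reachable over SSM or SSH; these are not steps taken in this thread):

# From the control container (SSM session) on the affected node:
enter-admin-container
# Inside the admin container, get a root shell in the host's namespaces:
sudo sheltie
# Inspect the kubelet journal and the CNI log directory:
journalctl -u kubelet.service --no-pager | tail -n 200
ls -l /var/log/aws-routed-eni/
# logdog collects a tarball of host logs, useful for attaching to a support case:
logdog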

zmrow added the status/research label and removed the status/needs-triage label on May 4, 2023
shay-ul (Author) commented May 5, 2023

@zmrow The failure is not limited to a specific instance type. In the example above it's a c5a.2xlarge instance, but this morning the same thing happened on a smaller node, a c5a.xlarge.
As for the VPC CNI, I initially created a GitHub issue for vpc-cni (aws/amazon-vpc-cni-k8s#2365) since the logs indicated an issue with the CNI, but the CNI can never be initialized if the aws-node pod never comes up in the first place.
The CNI support script and logs directory are missing, since the CNI is not initialized.
As I said, this behaviour is not specific to the aws-node pod, since kube-proxy and ebs-csi-node have the same issue, which led me to believe this is not a CNI-related problem.
The journal is swamped with kubelet logs such as this:
"Error syncing pod, skipping" err="network is not ready: container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized"

I couldn't find more interesting logs in the journal. I can download the full journal and upload it to a support case if that can be of any help.
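
A quick way to confirm that kube-proxy and ebs-csi-node are failing the same way (a sketch using the node name from the report above) is to list the Pending pods and check which ones name this node as their nominated node:

# NOMINATED NODE is shown in the wide output for Pending pods
kubectl get pods --all-namespaces --field-selector status.phase=Pending -o wide \
  | grep ip-192-168-1-160.us-east-2.compute.internal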

zmrow (Contributor) commented May 5, 2023

Thanks! I missed asking yesterday: which version of Bottlerocket are you using? 1.13.4 had a runc issue causing high memory utilization.

1.13.5 was cut, which reverted the runc change and fixed the issue.

shay-ul (Author) commented May 7, 2023

Update: we are currently investigating possible network/routing/security-group-related issues, since we figured out that whenever we get such a NotReady node, it has always been provisioned with the same IP address and hostname.
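
If the reused IP/hostname theory pans out, a couple of hedged checks (node name and IP taken from the describe output above) are to see whether an older Node object, instance, or ENI is still holding on to the same name or private IP:

NODE=ip-192-168-1-160.us-east-2.compute.internal
# Does an older Node object with this name still exist, and which instance does it point at?
kubectl get node "$NODE" -o jsonpath='{.metadata.creationTimestamp}{"\t"}{.spec.providerID}{"\n"}'
# Which EC2 instances currently hold this private IP?
aws ec2 describe-instances --filters "Name=private-ip-address,Values=192.168.1.160" \
  --query 'Reservations[].Instances[].[InstanceId,State.Name,LaunchTime]' --output table
# Any leftover ENIs still attached to the address?
aws ec2 describe-network-interfaces --filters "Name=addresses.private-ip-address,Values=192.168.1.160" \
  --query 'NetworkInterfaces[].[NetworkInterfaceId,Status,Attachment.InstanceId]' --output table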

shay-ul closed this as completed on May 11, 2023