Add retries in pluto to handle eventually consistent EC2 private DNS names
#3363
Labels: area/kubernetes (K8s including EKS, EKS-A, and including VMW), status/needs-triage (Pending triage or re-evaluation)
In rare cases, a call to EC2 DescribeInstances returns an empty string for the private DNS name of a newly launched instance. The API response eventually settles once the instance's private DNS name becomes consistent. This poses a problem when pluto determines the node name from the private DNS name of the instance.
This issue is described in more detail in kubernetes/cloud-provider-aws#635. The fix for the in-tree cloud provider is in kubernetes/kubernetes#118421, which would cover all pre-1.27 clusters.
However, for 1.27+ clusters we no longer use the in-tree cloud provider and instead depend on the external cloud provider (see #3033). Therefore we need to add similar retry logic to pluto to fix this issue.
See awslabs/amazon-eks-ami#1383 for the corresponding fix on the EKS optimized AMI side.
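For illustration, the retry could look roughly like the sketch below. It is not pluto's actual code: the function name `retry_private_dns_name`, the `fetch` closure (standing in for whatever performs the DescribeInstances call and extracts the private DNS name), and the attempt count and delay parameters are all assumptions made for this example. An empty name is treated as "not yet consistent" and retried, while a real error from `fetch` is returned immediately.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical helper: calls `fetch` until it yields a non-empty private
/// DNS name or `max_attempts` is exhausted, sleeping `delay` between tries.
///
/// Returns `Ok(Some(name))` on success, `Ok(None)` if every attempt returned
/// an empty string, and `Err(e)` as soon as `fetch` itself fails.
fn retry_private_dns_name<F, E>(
    mut fetch: F,
    max_attempts: u32,
    delay: Duration,
) -> Result<Option<String>, E>
where
    F: FnMut() -> Result<String, E>,
{
    for attempt in 1..=max_attempts {
        // Propagate genuine API errors; only an empty name triggers a retry.
        let name = fetch()?;
        if !name.is_empty() {
            return Ok(Some(name));
        }
        // Do not sleep after the final attempt.
        if attempt < max_attempts {
            sleep(delay);
        }
    }
    Ok(None)
}
```

A caller would pass a closure that wraps the real API call and decide what to do with `Ok(None)`, for example failing with a clear error rather than registering a node with an empty name. A fixed delay keeps the sketch short; exponential backoff would be a reasonable alternative.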