Credential provider sometimes identifies private repo as public repo #737
Labels: kind/bug, needs-triage
The credential provider sometimes identifies a private repo as a public repo when the image name was copied from a public registry to a private one. Our use case is supporting application deployments in "air-gapped" environments, where all dependencies are prepackaged and installed in private image repos. When the credential provider ran in our private AWS environment, it failed to fetch the ECR credentials because of how we had retagged the images, so our EKS deployments failed.
The existing credential provider implementation works in the majority of cases, so this may just be a violation of the principle of least surprise in cases such as ours. See the "Anything else we need to know?" section.
What happened:
When we bring over the ebs-csi driver images from the public AWS registry, we prepend our private registry name to the original public image name. For example, if the original image is "public.ecr.aws/ebs-csi-driver/aws-ebs-csi-driver", we retag it and push it to our private ECR repo as "11111111111.dkr.ecr.us-gov-west-1.amazonaws.com/public.ecr.aws/ebs-csi-driver/aws-ebs-csi-driver". This gives us some traceability of the source.
What you expected to happen:
For EKS deployments in our private environment, we expect the pods to pull images successfully from ECR after the credential provider fetches the configured private repo credentials.
Note that when specifying EKS 1.25, and using compatible k8s binaries, the images pull successfully in our private env. I believe the kubelet at version 1.25 uses the in-tree credential provider, while EKS 1.27 / kubelet 1.27 is the first iteration to require the external credential provider.
How to reproduce it (as minimally and precisely as possible):
Using any image from the ECR Public Gallery (public.ecr.aws), retag the image when pushing to a private ECR repository. The new name should preserve the original image name and only add a prefix. For example:
Original image: "public.ecr.aws/ebs-csi-driver/aws-ebs-csi-driver"
Private retagged image: "11111111111.dkr.ecr.us-gov-west-1.amazonaws.com/public.ecr.aws/ebs-csi-driver/aws-ebs-csi-driver"
Use kubectl to apply a deployment manifest that uses an "image:" from the private repository.
Our EKS version is 1.27.x, and kubelet v1.27.5-eks-43840fb is configured correctly to use the external credential provider config file and binary v1.27.2.
Anything else we need to know?:
There's a strings.Contains statement (see here) that sees public.ecr.aws in the middle of our image name, mistakenly treats the image as coming from a public registry, and therefore fails to get the credentials. You can verify this by running the ecr-credential-provider binary on one of the worker nodes.
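To illustrate the behavior described above, here is a minimal Go sketch that contrasts a substring check with a check on the registry host only. It is not the provider's actual code: the function names `isPublicBySubstring`, `registryHost`, and `isPublicByHost` are hypothetical, and the host-only comparison is just one possible alternative, assuming the registry host is everything before the first "/".

```go
package main

import (
	"fmt"
	"strings"
)

// publicRegistryHost is the ECR Public Gallery hostname named in this issue.
const publicRegistryHost = "public.ecr.aws"

// isPublicBySubstring (hypothetical) mimics a substring-style check:
// it reports true if the hostname appears anywhere in the image reference.
func isPublicBySubstring(image string) bool {
	return strings.Contains(image, publicRegistryHost)
}

// registryHost (hypothetical) returns the part of the image reference
// before the first "/", which this sketch assumes is the registry host.
func registryHost(image string) string {
	host, _, _ := strings.Cut(image, "/")
	return host
}

// isPublicByHost (hypothetical alternative) compares only the registry host.
func isPublicByHost(image string) bool {
	return registryHost(image) == publicRegistryHost
}

func main() {
	images := []string{
		"public.ecr.aws/ebs-csi-driver/aws-ebs-csi-driver",
		"11111111111.dkr.ecr.us-gov-west-1.amazonaws.com/public.ecr.aws/ebs-csi-driver/aws-ebs-csi-driver",
		"11111111111.dkr.ecr.us-gov-west-1.amazonaws.com/busybox",
	}
	for _, img := range images {
		fmt.Printf("substring=%t host=%t image=%s\n",
			isPublicBySubstring(img), isPublicByHost(img), img)
	}
}
```

In this sketch the retagged private image is flagged as public only by the substring check, which matches the failure we see.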
The following command will fail because the repository name contains public.ecr.aws:
echo '{"kind": "CredentialProviderRequest", "apiVersion": "credentialprovider.kubelet.k8s.io/v1", "image": "11111111111.dkr.ecr.us-gov-east-1.amazonaws.com/public.ecr.aws/ebs-csi-driver/aws-ebs-csi-driver"}' | ./ecr-credential-provider
However, if we reference a different image without the public.ecr.aws, it succeeds:
echo '{"kind": "CredentialProviderRequest", "apiVersion": "credentialprovider.kubelet.k8s.io/v1", "image": "11111111111.dkr.ecr.us-gov-west-1.amazonaws.com/busybox"}' | ./ecr-credential-provider
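For testing several image names, the JSON payloads used in the commands above can be generated rather than typed by hand. The following is a rough sketch: the `credentialProviderRequest` struct and `buildRequest` helper are hypothetical local definitions that only carry the three fields shown in the echo commands, not the upstream Kubernetes type.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// credentialProviderRequest is a hypothetical local struct holding only
// the three fields that appear in the echo commands in this issue.
type credentialProviderRequest struct {
	Kind       string `json:"kind"`
	APIVersion string `json:"apiVersion"`
	Image      string `json:"image"`
}

// buildRequest (hypothetical helper) returns the JSON payload for one image.
func buildRequest(image string) (string, error) {
	req := credentialProviderRequest{
		Kind:       "CredentialProviderRequest",
		APIVersion: "credentialprovider.kubelet.k8s.io/v1",
		Image:      image,
	}
	b, err := json.Marshal(req)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	images := []string{
		"11111111111.dkr.ecr.us-gov-east-1.amazonaws.com/public.ecr.aws/ebs-csi-driver/aws-ebs-csi-driver",
		"11111111111.dkr.ecr.us-gov-west-1.amazonaws.com/busybox",
	}
	for _, img := range images {
		payload, err := buildRequest(img)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		// Print one payload per line so each can be piped to the binary.
		fmt.Println(payload)
	}
}
```

Each printed line can be piped to ./ecr-credential-provider on a worker node in the same way as the echo commands.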
Our workaround is to also change the string "public.ecr.aws" to "pub.ecr.aws" when we retag the image. A complete example would be: "11111111111.dkr.ecr.us-gov-east-1.amazonaws.com/pub.ecr.aws/ebs-csi-driver/aws-ebs-csi-driver"
Environment:
Kubernetes version (use kubectl version): 1.27.x
Kernel (e.g. uname -a):
/kind bug