Credential provider sometimes identifies private repo as public repo #737

Closed
thecodebeneath opened this issue Nov 8, 2023 · 5 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.

Comments

@thecodebeneath

Credential provider sometimes identifies private repo as public repo, based on an image name that was copied from public to private. Our use case for this is to support application deployments in "air-gapped" environments, where all dependencies are prepackaged and installed in private image repos. When the credential provider ran in our private AWS environment, it failed to fetch the ECR credentials based on how we retagged the images, therefore our EKS deployments failed.

The existing credential provider implementation certainly works in the majority of cases, so this may just be a violation of the principle of least surprise in cases such as ours. See the "Anything else we need to know?" section.

What happened:
When we bring over the ebs-csi driver images from the public AWS registry, we just prepend our private registry name to the original public image name. For example, if the original image is "public.ecr.aws/ebs-csi-driver/aws-ebs-csi-driver", we retag it and push it to our private ECR repo as "11111111111.dkr.ecr.us-gov-west-1.amazonaws.com/public.ecr.aws/ebs-csi-driver/aws-ebs-csi-driver". This gives us some traceability of the source.

What you expected to happen:
For EKS deployments in our private environment, we expect the pods to pull images successfully from ECR after the credential provider fetches the configured private repo credentials.

Note that when specifying EKS 1.25, and using compatible k8s binaries, the images pull successfully in our private env. I believe the kubelet at version 1.25 uses the in-tree credential provider, while EKS 1.27 / kubelet 1.27 is the first iteration to require the external credential provider.

How to reproduce it (as minimally and precisely as possible):
Take any image from the ECR Public Gallery (public.ecr.aws) and retag it when pushing to a private ECR repository. The new name should keep the original image name intact and only add the private registry as a prefix. For example:
Original image: "public.ecr.aws/ebs-csi-driver/aws-ebs-csi-driver"
Private retagged image: "11111111111.dkr.ecr.us-gov-west-1.amazonaws.com/public.ecr.aws/ebs-csi-driver/aws-ebs-csi-driver"

Use kubectl to apply a deployment manifest that uses an "image:" from the private repository.

Our EKS version is 1.27.x, and kubelet v1.27.5-eks-43840fb is configured correctly to use the external credential provider config file and binary v1.27.2.

Anything else we need to know?:
There's a strings.Contains statement (see here) that sees public.ecr.aws in the middle of our image name, mistakenly treats the image as coming from a public registry, and thus fails to get the credentials. You can verify this by running the ecr-credential-provider binary on one of the worker nodes.
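To make the failure mode concrete, here is a minimal Go sketch of the difference between a substring match and a match on the registry host only. It is not the provider's actual code and not necessarily how the fix was implemented; the constant and function names (ecrPublicHost, looksPublicBySubstring, registryHost, looksPublicByHost) are hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// ecrPublicHost is a hypothetical constant for this sketch; it is not
// copied from the credential provider source.
const ecrPublicHost = "public.ecr.aws"

// looksPublicBySubstring mimics the kind of check described above: it
// reports true whenever the marker appears anywhere in the image name.
func looksPublicBySubstring(image string) bool {
	return strings.Contains(image, ecrPublicHost)
}

// registryHost returns the part of the image reference before the first "/".
func registryHost(image string) string {
	host, _, _ := strings.Cut(image, "/")
	return host
}

// looksPublicByHost compares only the registry host, so a repository path
// that happens to contain "public.ecr.aws" is not misclassified.
func looksPublicByHost(image string) bool {
	return registryHost(image) == ecrPublicHost
}

func main() {
	images := []string{
		"public.ecr.aws/ebs-csi-driver/aws-ebs-csi-driver",
		"11111111111.dkr.ecr.us-gov-west-1.amazonaws.com/public.ecr.aws/ebs-csi-driver/aws-ebs-csi-driver",
		"11111111111.dkr.ecr.us-gov-west-1.amazonaws.com/busybox",
	}
	for _, img := range images {
		fmt.Printf("%s\n  substring match: %t, host match: %t\n",
			img, looksPublicBySubstring(img), looksPublicByHost(img))
	}
}
```

With the retagged private image, the substring check reports "public" while the host check does not, which matches the behavior we see on our worker nodes.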

The following command will fail because the repository name contains public.ecr.aws:

echo '{"kind": "CredentialProviderRequest", "apiVersion": "credentialprovider.kubelet.k8s.io/v1", "image": "11111111111.dkr.ecr.us-gov-east-1.amazonaws.com/public.ecr.aws/ebs-csi-driver/aws-ebs-csi-driver"}' | ./ecr-credential-provider

However, if we reference a different image whose name does not contain public.ecr.aws, it succeeds:

echo '{"kind": "CredentialProviderRequest", "apiVersion": "credentialprovider.kubelet.k8s.io/v1", "image": "11111111111.dkr.ecr.us-gov-west-1.amazonaws.com/busybox"}' | ./ecr-credential-provider

Our workaround is to also change the string "public.ecr.aws" to "pub.ecr.aws" when we retag the image. A complete example would be: "11111111111.dkr.ecr.us-gov-east-1.amazonaws.com/pub.ecr.aws/ebs-csi-driver/aws-ebs-csi-driver"

Environment:

  • Kubernetes version (use kubectl version): 1.27.x
  • Cloud provider or hardware configuration: AWS GovCloud (US)
  • OS (e.g. from /etc/os-release): RHEL8
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:

/kind bug

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Nov 8, 2023
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If cloud-provider-aws contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@cartermckinnon
Contributor

This was fixed in #667 and was cherry-picked to release-1.28 in #681. The bug doesn't exist in earlier versions of the ecr-credential-provider, which have no support for ECR Public.

@cartermckinnon
Contributor

Correction: the commit that introduced this support (#603) was mistakenly cut into v1.27.2 (that release was tagged incorrectly). It's fixed in v1.27.3.

@cartermckinnon
Contributor

/close

@k8s-ci-robot
Contributor

@cartermckinnon: Closing this issue.

In response to this:

/close



3 participants