(feat): add workload identity in capz #3583

Merged

Contributor

@sonasingh46 sonasingh46 commented May 21, 2023

What type of PR is this?
This PR enables Workload Identity capability in capz.

What this PR does / why we need it:
Implementation of proposal #2814
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #3588
Partly fixes #2205

Special notes for your reviewer:

  • cherry-pick candidate

TODOs:

  • squashed commits
  • includes documentation
  • adds unit tests

Release note:

add support for workload identity in capz

@k8s-ci-robot k8s-ci-robot added do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels May 21, 2023
@k8s-ci-robot k8s-ci-robot requested review from devigned and Jont828 May 21, 2023 12:08
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels May 21, 2023
@sonasingh46 sonasingh46 force-pushed the add_workload_identity_feature branch from 7b69287 to 093f50e Compare May 21, 2023 12:13
@codecov-commenter

codecov-commenter commented May 21, 2023

Codecov Report

Patch coverage: 17.39% and project coverage change: +0.11 🎉

Comparison is base (54d68db) 53.74% compared to head (7cc5115) 53.86%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3583      +/-   ##
==========================================
+ Coverage   53.74%   53.86%   +0.11%     
==========================================
  Files         185      186       +1     
  Lines       18595    18768     +173     
==========================================
+ Hits         9994    10109     +115     
- Misses       8059     8116      +57     
- Partials      542      543       +1     
Impacted Files Coverage Δ
api/v1beta1/types.go 60.71% <ø> (ø)
azure/scope/identity.go 37.20% <0.00%> (-1.63%) ⬇️
azure/scope/workload_identity.go 0.00% <0.00%> (ø)
api/v1beta1/azureclusteridentity_webhook.go 71.42% <100.00%> (+51.42%) ⬆️

... and 7 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@sonasingh46 sonasingh46 force-pushed the add_workload_identity_feature branch from 093f50e to 4bca1fc Compare May 21, 2023 12:26
@sonasingh46
Contributor Author

/test pull-cluster-api-provider-azure-e2e-optional

@sonasingh46
Contributor Author

/test pull-cluster-api-provider-azure-e2e-optional

@sonasingh46 sonasingh46 force-pushed the add_workload_identity_feature branch 2 times, most recently from 6675f0c to 65bd725 Compare May 22, 2023 22:45
@sonasingh46
Contributor Author

/test pull-cluster-api-provider-azure-e2e-optional

Contributor

@CecileRobertMichon CecileRobertMichon left a comment

overall lgtm, thank you for splitting this out!

@sonasingh46 sonasingh46 force-pushed the add_workload_identity_feature branch from 65bd725 to 49e1b0b Compare May 24, 2023 04:17
}
w.TokenFilePath = tokenFilePath

// Fallback to using client ID from env variable if not set.
Contributor

Is this the workaround for #3409?

I'm looking at that but not sure what needs to be fixed (yet).

Contributor Author

No.
The client ID and tenant ID fields on the AzureClusterIdentity object may still be left empty, so this is a fallback for cases where those fields are not set on AzureClusterIdentity.

So the behaviour is like this:

  • CAPZ uses the tenant ID and the client ID as specified on AzureClusterIdentity.
  • If client ID is not specified, it tries to read from the env.
  • If tenant ID is not specified, it tries to read from the env.
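The precedence described in this list could be sketched in Go roughly as follows. This is a minimal illustration, not the code from this PR: the `identitySpec` type, the `resolveIDs` helper, and the environment variable names `AZURE_CLIENT_ID` and `AZURE_TENANT_ID` are assumptions made for the example, since the thread only says the values are read "from the env".

```go
package main

import (
	"fmt"
	"os"
)

// identitySpec is a hypothetical stand-in for the client ID and tenant ID
// fields on AzureClusterIdentity; it is not the real CAPZ type.
type identitySpec struct {
	ClientID string
	TenantID string
}

// Environment variable names assumed for this sketch.
const (
	clientIDEnv = "AZURE_CLIENT_ID"
	tenantIDEnv = "AZURE_TENANT_ID"
)

// firstNonEmpty returns specValue when it is set, otherwise the value
// that getenv reports for envKey.
func firstNonEmpty(specValue, envKey string, getenv func(string) string) string {
	if specValue != "" {
		return specValue
	}
	return getenv(envKey)
}

// resolveIDs prefers the IDs on the spec and falls back to the environment
// for any ID that is left empty. It fails if an ID is found in neither place.
func resolveIDs(spec identitySpec, getenv func(string) string) (clientID, tenantID string, err error) {
	clientID = firstNonEmpty(spec.ClientID, clientIDEnv, getenv)
	tenantID = firstNonEmpty(spec.TenantID, tenantIDEnv, getenv)
	if clientID == "" {
		return "", "", fmt.Errorf("client ID not set on the identity or in %s", clientIDEnv)
	}
	if tenantID == "" {
		return "", "", fmt.Errorf("tenant ID not set on the identity or in %s", tenantIDEnv)
	}
	return clientID, tenantID, nil
}

func main() {
	// Only the tenant ID is set on the spec, so the client ID is looked up
	// in the process environment.
	clientID, tenantID, err := resolveIDs(identitySpec{TenantID: "tenant-from-spec"}, os.Getenv)
	fmt.Println(clientID, tenantID, err)
}
```

Passing `getenv` as a parameter keeps the lookup testable without mutating the process environment.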

Member

Is it not possible to make client ID and tenant ID mandatory if identity type is WorkloadIdentity? IMO having fallbacks to env vars could cause confusion and requires additional documentation on the behavior when both CRD and env vars are configured.
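
The stricter alternative raised in this comment might look something like the validation check below. It is only a sketch under stated assumptions: `clusterIdentity`, `identityType`, and `validateIdentity` are hypothetical names for illustration and do not reflect the actual CAPZ webhook code.

```go
package main

import (
	"fmt"
	"strings"
)

// identityType and clusterIdentity are hypothetical stand-ins for the
// corresponding AzureClusterIdentity fields, used only in this sketch.
type identityType string

const workloadIdentity identityType = "WorkloadIdentity"

type clusterIdentity struct {
	Type     identityType
	ClientID string
	TenantID string
}

// validateIdentity rejects a WorkloadIdentity spec that leaves the client ID
// or tenant ID empty, so no environment fallback is needed for that type.
// Other identity types are not checked here.
func validateIdentity(id clusterIdentity) error {
	if id.Type != workloadIdentity {
		return nil
	}
	var missing []string
	if id.ClientID == "" {
		missing = append(missing, "clientID")
	}
	if id.TenantID == "" {
		missing = append(missing, "tenantID")
	}
	if len(missing) > 0 {
		return fmt.Errorf("type %s requires: %s", workloadIdentity, strings.Join(missing, ", "))
	}
	return nil
}

func main() {
	// A WorkloadIdentity spec without a tenant ID fails validation.
	err := validateIdentity(clusterIdentity{Type: workloadIdentity, ClientID: "some-client-id"})
	fmt.Println(err)
}
```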

Contributor Author

@sonasingh46 sonasingh46 May 24, 2023

Couple of things here:

  1. AzureClusterIdentity is referenced by AzureCluster.
  2. So if I want to create a workload cluster by referencing an AzureClusterIdentity, CAPZ should always use the IDs from it.

Also, I do not know why client ID and tenant ID are optional rather than strictly required on AzureClusterIdentity.

So I am not sure if we should make a special case for workload identity, though it is possible to do so.

Having said that, even leaving the workload identity discussion aside, it still makes more sense to use the IDs present on AzureClusterIdentity (e.g., ManualServicePrincipal).

I am also thinking about the following in deciding the priority:

  1. Read IDs from AzureClusterIdentity and fall back to reading them from env if they are not present on it. Or
  2. Read IDs from env and fall back to using IDs present on AzureClusterIdentity.

Here [2] looks semantically confusing/incorrect to me.

Contributor

I think it would make sense to make client ID and tenant ID required; AFAIK there is no identity type that doesn't need them. Can we create an issue and do this as a follow-up?

@sonasingh46 sonasingh46 force-pushed the add_workload_identity_feature branch from 49e1b0b to 26975ea Compare June 20, 2023 08:50
@sonasingh46
Contributor Author

/test pull-cluster-api-provider-azure-e2e-optional

Contributor

@CecileRobertMichon CecileRobertMichon left a comment

/cc @mboersma

@k8s-ci-robot k8s-ci-robot requested a review from mboersma June 20, 2023 18:48
@mboersma
Contributor

mboersma commented Jul 6, 2023

/milestone v1.10

@k8s-ci-robot k8s-ci-robot added this to the v1.10 milestone Jul 6, 2023
@@ -169,7 +169,11 @@ E2E_CONF_FILE ?= $(ROOT_DIR)/test/e2e/config/azure-dev.yaml
E2E_CONF_FILE_ENVSUBST := $(ROOT_DIR)/test/e2e/config/azure-dev-envsubst.yaml
SKIP_CLEANUP ?= false
SKIP_LOG_COLLECTION ?= false
SKIP_CREATE_MGMT_CLUSTER ?= false
# @sonasingh46: Skip creating mgmt cluster for ci as workload identity needs kind cluster
# to be created with extra mounts for key pairs which is not yet supported
Contributor

with extra mounts for key pairs which is not yet supported by existing e2e framework

did you open an issue in CAPI to add this support?

Contributor Author

I have not filed it yet; I will do so.

Contributor

@CecileRobertMichon CecileRobertMichon left a comment

reminder to squash before merge

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 6, 2023
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 5f307cb6c4ca69db3a5c37049b9193be98a62d5f

@mboersma
Contributor

mboersma commented Jul 7, 2023

Same here, looks good but please squash the commits.

@sonasingh46 sonasingh46 force-pushed the add_workload_identity_feature branch from 7cc5115 to 1e8894c Compare July 7, 2023 19:19
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 7, 2023
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 8, 2023
@sonasingh46 sonasingh46 force-pushed the add_workload_identity_feature branch from 1e8894c to 5f39919 Compare July 10, 2023 12:21
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 10, 2023
@sonasingh46
Contributor Author

/test pull-cluster-api-provider-azure-e2e-optional

Contributor

@mboersma mboersma left a comment

/lgtm

Let's create the couple of GitHub issues mentioned in review comments, then I'm happy to approve this.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 10, 2023
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: ee587a12aa7c6454ed6db4650935832ffec7feec

@sonasingh46
Contributor Author

@mboersma -- I created one here
kubernetes-sigs/cluster-api#8983

@CecileRobertMichon
Contributor

/lgtm
/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: CecileRobertMichon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 10, 2023
@mboersma
Contributor

/override coverage

@k8s-ci-robot
Contributor

@mboersma: /override requires failed status contexts, check run or a prowjob name to operate on.
The following unknown contexts/checkruns were given:

  • coverage

Only the following failed contexts/checkruns were expected:

  • EasyCLA
  • deploy/netlify
  • pull-cluster-api-provider-azure-apidiff
  • pull-cluster-api-provider-azure-apiversion-upgrade
  • pull-cluster-api-provider-azure-build
  • pull-cluster-api-provider-azure-capi-e2e
  • pull-cluster-api-provider-azure-ci-entrypoint
  • pull-cluster-api-provider-azure-conformance
  • pull-cluster-api-provider-azure-conformance-with-ci-artifacts
  • pull-cluster-api-provider-azure-e2e
  • pull-cluster-api-provider-azure-e2e-aks
  • pull-cluster-api-provider-azure-e2e-optional
  • pull-cluster-api-provider-azure-test
  • pull-cluster-api-provider-azure-verify
  • pull-cluster-api-provider-azure-windows-containerd-upstream-with-ci-artifacts
  • tide

If you are trying to override a checkrun that has a space in it, you must put a double quote on the context.

In response to this:

/override coverage

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@CecileRobertMichon
Contributor

@mboersma the PR might need a rebase to get past the codecov changes since we moved to the app now? cc @willie-yao

@mboersma
Contributor

mboersma commented Jul 10, 2023

@sonasingh46 force-pushed it 6 hours ago, was it not rebased on main? I agree that might be the only way to get prow to merge this.

@mboersma
Contributor

mboersma commented Jul 10, 2023

There's no rebasing to be done here AFAICT, it's already based on the last commit from July 9 (yesterday).

Edit: I just had to run the codecov action again. 🤦🏻

Congratulations @sonasingh46!! Great work, and thanks for your dedication!

Labels
  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • cncf-cla: yes: Indicates the PR's author has signed the CNCF CLA.
  • lgtm: "Looks good to me", indicates that a PR is ready to be merged.
  • release-note: Denotes a PR that will be considered when it comes time to generate release notes.
  • size/XL: Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

  • add workload identity feature in capz
  • Migrate from AAD pod identity to Azure Workload Identity
6 participants