-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CodePipeline: still getting 'Maximum policy size of 10240 bytes exceeded for role' for cross-account pipeline #19835
Comments
@rix0rrr you want to take a look at this one? |
To work around this, we had to do the following:
Just in case it helps anyone else, here is roughly what we ended up with: const pipelineRolePolicyName = `pipelinepolicy-${pipelineName}`;
const pipelineRolePolicy = new iam.Policy(scope, pipelineRolePolicyName, {
policyName: pipelineRolePolicyName,
// These policy statements were built up by hand after examining the
// statements in the CDK-generated policy for several pipelines.
statements: [
// Grants the pipeline access to the s3 artifact buckets.
new iam.PolicyStatement({
effect: iam.Effect.ALLOW,
actions: [
's3:Abort*',
's3:DeleteObject*',
's3:GetBucket*',
's3:GetObject*',
's3:List*',
's3:PutObject',
's3:PutObjectLegalHold',
's3:PutObjectRetention',
's3:PutObjectTagging',
's3:PutObjectVersionTagging',
],
// Note that these bucket names correspond to our self-managed replication buckets, we are
// opting-out of CDK managing those for us.
resources: [`arn:aws:s3:::pipe-artifacts-${pipelineAccount}-*`],
}),
// Grants the pipeline AssumeRole permissions into the various CDK-generated roles, for
// cross-account / cross-region deploys etc.
new iam.PolicyStatement({
effect: iam.Effect.ALLOW,
actions: ['sts:AssumeRole'],
resources: targetAccounts.map(targetAccount => {
// Here we are granting permissions to assume some roles that are generated by CDK
// for use in pipelines. We need to wild-card them so that the policy doesn't get
// too large for IAM. The roles will begin with a string like `cacheadmin-pipeline`,
// unless the name of the pipeline is too long; CDK will truncate the prefix portion
// after 25 chars, and the suffix varies per role, so we just use the first 25 chars.
// NOTE: your pipeline names must all be lower case!
const pipelineRoleNamePrefix = `${pipelineName}-pipeline`
.substr(0, 25);
return `arn:aws:iam::${targetAccount}:role/${pipelineRoleNamePrefix}*`;
}),
}),
],
});
const pipeline = new codepipeline.Pipeline(
this,
`pipeline-${props.pipelineName}`,
{
pipelineName: props.pipelineName,
crossAccountKeys: true,
reuseCrossRegionSupportStacks: true,
// passing the role via `withoutPolicyUpdates` prevents CDK from trying to manually manage the policy. See:
//
// https://github.com/aws/aws-cdk/issues/19835
// https://github.com/aws/aws-cdk/issues/16244#issuecomment-906738231
// https://docs.aws.amazon.com/cdk/api/v1/docs/aws-iam-readme.html#opting-out-of-automatic-permissions-management
role: pipelineRole.withoutPolicyUpdates(),
crossRegionReplicationBuckets: getCrossRegionReplicationBuckets(
this,
supportStackInfoForAllRegions
),
}
);
// We need to enforce a `DependsOn` from the Pipeline to the Policy,
// otherwise CFN will try to create them at the same time, and the Pipeline
// will fail due to missing permissions. However, if we just call the
// main CDK `addDependency` function, it will add our policy as a dependency
// to every resource that the pipeline creates, which will cause circular
// dependency failures. We need to use the "escape hatch" to access the CfnResources
// directly and then add the dependency that way, so that it only gets added to the
// Pipeline and no other resources.
const pipelineCfnResource = pipeline.node.defaultChild as CfnResource;
pipelineCfnResource.addDependsOn(
pipelinePolicy.node.defaultChild as CfnResource
); |
Can you give me a sense of how many stacks we’re talking about? |
Our biggest pipeline right now has 4 stacks, all of which are deployed to 4 different accounts (so, 16 stack deploys). We have about 5 actions per deploy (includes generating ChangeSet, and manual approval). So that's about 80 actions. In the generated policy, the AssumeRole policy statement contains one resource for each action. When we tipped the scales and started hitting the IAM policy size limit there were 97 resources in that policy statement. |
Best solution I can come up with here is to start reusing the same action role for multiple actions. |
Wouldn't it also be possible to split the policy into multiple separate policies, and attach multiple policies to the role? |
The limit is about the final policy document on the Role. Those fragments will all add up to the same total. |
Ah. Then perhaps providing an opt-in flag that would switch the AssumeRole policies to use a wildcard that will match all of the Action roles? |
Two changes: - Collapse CodeBuild action Roles: each CodeBuild step used to create a fresh Role to run the CodeBuild action. Change to use one Role for all CodeBuild actions. This saves a lot of resources and policy space when using a lot of CodeBuild steps, and doesn't appreciably change the security posture of the Pipeline (note: this is *not* about the Execution Role of the CodeBuild projects, this is about the Role assumed by the Pipeline to initiate execution of the Project). - If inline policies grow bigger than 10k, split additional statements off into ManagedPolicies. Since we want to do the splitting post-merging (to get the most bang for our buck), we now need to do statement merging during the `prepare` phase (that is, pre-rendering, instead of post-rendering). That means it had to be modified to work on `PolicyStatement` objects, instead of on raw IAM JSON documents. Closes #19276, closes #19939, closes #19835.
Two changes: - Collapse CodeBuild action Roles: each CodeBuild step used to create a fresh Role to run the CodeBuild action. Change to use one Role for all CodeBuild actions. This saves a lot of resources and policy space when using a lot of CodeBuild steps, and doesn't appreciably change the security posture of the Pipeline (note: this is *not* about the Execution Role of the CodeBuild projects, this is about the Role assumed by the Pipeline to initiate execution of the Project). - If inline policies grow bigger than 10k, split additional statements off into ManagedPolicies. Since we want to do the splitting post-merging (to get the most bang for our buck), we now need to do statement merging during the `prepare` phase (that is, pre-rendering, instead of post-rendering). That means it had to be modified to work on `PolicyStatement` objects, instead of on raw IAM JSON documents. Closes #19276, closes #19939, closes #19835.
(This change has been split off from #20189 because that PR was growing too big) Collapse CodeBuild action Roles: each CodeBuild step used to create a fresh Role to run the CodeBuild action. Change to use one Role for all CodeBuild actions. This saves a lot of resources and policy space when using many CodeBuild steps, and doesn't appreciably change the security posture of the Pipeline (note: this is not about the Execution Role of the CodeBuild projects, this is about the Role assumed by the Pipeline to initiate execution of the Project). Relates to #19276, #19939, #19835.
We add all Role policy statements to the Inline policy, which has a maximums size of 10k. Especially when creating CDK Pipelines that deploy to a lot of environments, the list of Role ARNs the Pipeline should be allowed to assume exceeds this size. Roles also have the ability to have Managed Policies attached (10 by default, 20 with a quota increase), each of them can be 6k in size. By spilling over from inline policies into Managed Policies we can get a total of 70k of statements attached to reach Role. This PR introduces `IComparablePrincipal` to be able to value-compare two principals: since we want to merge first before we split (to get the most bang for our buck), we now need to do statement merging during the prepare phase, while we are still working on the object graph (instead of the rendered CloudFormation template). * That means statement merging had to be modified to work on PolicyStatement objects, which requires being able to compare Principal objects. Closes #19276, closes #19939, closes #19835.
(This change has been split off from #20189 because that PR was growing too big) Collapse CodeBuild action Roles: each CodeBuild step used to create a fresh Role to run the CodeBuild action. Change to use one Role for all CodeBuild actions. This saves a lot of resources and policy space when using many CodeBuild steps, and doesn't appreciably change the security posture of the Pipeline (note: this is not about the Execution Role of the CodeBuild projects, this is about the Role assumed by the Pipeline to initiate execution of the Project). Relates to #19276, #19939, #19835. ---- ### All Submissions: * [ ] Have you followed the guidelines in our [Contributing guide?](https://github.com/aws/aws-cdk/blob/master/CONTRIBUTING.md) ### Adding new Unconventional Dependencies: * [ ] This PR adds new unconventional dependencies following the process described [here](https://github.com/aws/aws-cdk/blob/master/CONTRIBUTING.md/#adding-new-unconventional-dependencies) ### New Features * [ ] Have you added the new feature to an [integration test](https://github.com/aws/aws-cdk/blob/master/INTEGRATION_TESTS.md)? * [ ] Did you use `yarn integ` to deploy the infrastructure and generate the snapshot (i.e. `yarn integ` without `--dry-run`)? *By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
|
Describe the bug
We recently filed an issue about how CodePipeline was failing due to the generation of extremely large IAM policy documents:
#19243
Our issue was closed and it was suggested that this PR might fix it:
#19114
This tracking issue was closed via that PR:
#19276
We have since upgraded our CDK version to
2.19.0
and set@aws-cdk/aws-iam:minimizePolicies
totrue
.This did seem to decrease the size of some IAM policies, but we are still hitting the same failure.
Expected Behavior
Expected #19114 to reduce generated IAM policy sizes so that our pipeline would deploy.
Current Behavior
Pipeline still fails to deploy with
Maximum policy size of 10240 bytes exceeded for role xxx
.See attachment for example of the offending IAM policy. The policy contains three statements, and the third one is the offending one. It is granting
AssumeRole
to all of the roles for all of the actions in all of the accounts that this pipeline is trying to deploy to.See attachment for example (sanitized) policy.
pipelineofpipelinesstackcoreinfrastructurepipeline03AEF9CA.template.JUST_OFFENDING_POLICY.sanitized.json.gz
There are 97 resources in that statement, they look like this:
and this:
Reproduction Steps
Create a CodePipeline that creates additional Pipelines. In one of those additional pipelines, create a large number of actions that target multiple AWS accounts.
It would take some non-trivial effort for me to extract a concise reproducer from our current (private) code, but pending the responses to the questions above about whether this is still expected behavior, how to override the Role, etc., I am willing to try to put one together!
Possible Solution
It's not clear to me whether this behavior is still expected after the fix in #19114 was released.
We are looking for guidance on:
Also, there seem to be a ton of other tickets related to this, and in several of them it is hinted that users with advanced use cases like ours may need to opt out of this automatic policy generation, and override the Role with one that we manage ourselves (which might use wildcards or multiple policy attachments, etc.). If that is the current prescription, it would be wonderful if there were some official docs or examples that illustrated how to do this; I haven't really seen anything so far that gives me a concrete idea of how to even attempt it.
Additional Information/Context
No response
CDK CLI Version
2.19.0
Framework Version
2.19.0
Node.js Version
v16.5.0
OS
Amazon Linux 2
Language
Typescript
Language Version
TypeScript 4.4.2
Other information
No response
The text was updated successfully, but these errors were encountered: