
(aws-eks): IPv6 support for EKS Clusters #18423

Closed
1 of 2 tasks
youngjeong46 opened this issue Jan 14, 2022 · 2 comments · Fixed by #25819
Labels
@aws-cdk/aws-eks (Related to Amazon Elastic Kubernetes Service) · effort/small (Small work item – less than a day of effort) · feature-request (A feature should be added or improved.) · p2

Comments


youngjeong46 commented Jan 14, 2022

Description

I'm requesting IPv6 support for EKS Clusters provisioned with CDK.

Use Case

AWS recently announced support for IPv6 on EKS, aimed at the IP exhaustion problem: the limited size of the IPv4 address space becomes a significant constraint as teams scale their EKS applications. eksctl already supports cluster creation with IPv6.

Proposed Solution

The current eks.Cluster construct props include a serviceIpv4Cidr? parameter. Replace that parameter with these:

  • ipv4: boolean - if true, use IPv4; if false, use IPv6
  • serviceCidr? - the CIDR block to assign Kubernetes service IP addresses from. Validation should reject mismatched combinations: an IPv4 range must accompany ipv4: true, and an IPv6 range must accompany ipv4: false.
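As a rough illustration of the validation rule above, the family check might look like the sketch below. The names `ClusterNetworkProps` and `validateNetworkProps` are hypothetical, not actual CDK API.

```typescript
// Hypothetical prop shape for the proposal above; not the real CDK interface.
interface ClusterNetworkProps {
  ipv4?: boolean;       // true (default) for IPv4, false for IPv6
  serviceCidr?: string; // CIDR block for Kubernetes service IPs
}

// Crude family detection: IPv6 CIDRs contain ':', IPv4 CIDRs do not.
function validateNetworkProps(props: ClusterNetworkProps): void {
  if (props.serviceCidr === undefined) {
    return; // nothing to validate without a CIDR
  }
  const wantsIpv4 = props.ipv4 ?? true;
  const cidrIsIpv6 = props.serviceCidr.includes(':');
  if (wantsIpv4 && cidrIsIpv6) {
    throw new Error(`ipv4: true requires an IPv4 serviceCidr, got ${props.serviceCidr}`);
  }
  if (!wantsIpv4 && !cidrIsIpv6) {
    throw new Error(`ipv4: false requires an IPv6 serviceCidr, got ${props.serviceCidr}`);
  }
}
```

A real implementation would want proper CIDR parsing rather than the `':'` heuristic, but the shape of the check is the same.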

Other information

No response

Acknowledge

  • I may be able to implement this feature request
  • This feature might incur a breaking change
@youngjeong46 youngjeong46 added feature-request A feature should be added or improved. needs-triage This issue or PR still needs to be triaged. labels Jan 14, 2022
@github-actions github-actions bot added the @aws-cdk/aws-eks Related to Amazon Elastic Kubernetes Service label Jan 14, 2022
@otaviomacedo otaviomacedo added effort/small Small work item – less than a day of effort p2 and removed needs-triage This issue or PR still needs to be triaged. labels Jan 24, 2022
@otaviomacedo otaviomacedo removed their assignment Jan 24, 2022
@jagu-sayan

Hello :)

Maybe do the same as the EKS API and CloudFormation?

According to the API, we can't specify a custom IPv6 CIDR block.

Proposed Solution:

  /**
   * Specify which IP family is used to assign Kubernetes pod and service IP addresses.
   *
   * @default - ipv4
   * @see https://docs.aws.amazon.com/eks/latest/APIReference/API_KubernetesNetworkConfigRequest.html#AmazonEKS-Type-KubernetesNetworkConfigRequest-ipFamily
   */
  readonly ipFamily?: string;

If both ipFamily and serviceIpv4Cidr are defined, we throw an error.
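A minimal sketch of that guard, with illustrative names only (not the actual construct code):

```typescript
// Hypothetical slice of the cluster props relevant to this check.
interface NetworkConfig {
  ipFamily?: 'ipv4' | 'ipv6';
  serviceIpv4Cidr?: string;
}

// Reject configurations that set both the new and the legacy property,
// since the EKS API does not accept a custom IPv6 service CIDR.
function checkNetworkConfig(config: NetworkConfig): void {
  if (config.ipFamily !== undefined && config.serviceIpv4Cidr !== undefined) {
    throw new Error('Only one of ipFamily and serviceIpv4Cidr may be specified');
  }
}
```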

wlami pushed a commit to wlami/aws-cdk that referenced this issue Mar 13, 2022
@mergify mergify bot closed this as completed in #25819 Jun 8, 2023
mergify bot pushed a commit that referenced this issue Jun 8, 2023
## Description

This change enables IPv6 for EKS clusters

## Reasoning

* IPv6-based EKS clusters will enable service owners to minimize or even eliminate the perils of IPv4 CIDR micromanagement
* IPv6 will enable very-large-scale EKS clusters
* My working group (Amazon SDO/ECST) recently attempted to enable IPv6 using the L1 CloudFormation EKS constructs, but failed after discovering a CDKv2 issue that results in a master-less EKS cluster. Rather than investing in fixing that interaction, we agreed to contribute IPv6 support to aws-eks (this PR)

## Design

* This change treats IPv4 as the default networking configuration
* A new enum `IpFamily` is introduced so users can specify `IP_V4` or `IP_V6`
* ~~This change adds a new Sam layer dependency~~ Dependency removed after validating it was no longer necessary

## Testing

I consulted with some team members about how best to approach testing this change, and I concluded that I should duplicate the eks-cluster test definition. I decided this was a better approach than redefining the existing cluster test to use IPv6, for a few reasons:

1. EKS still requires IPv4 under the hood
2. IPv6 CIDR and subnet association isn't exactly straightforward.  My example in eks-cluster-ipv6 is the simplest one I could come up with
3. There are additional permissions and routing configurations necessary to get the cluster tests to succeed. The differences were sufficient, in my opinion, to motivate splitting out the test.

I ran into several issues running the test suite, primarily out-of-memory conditions that no amount of RAM appeared to help: `NODE_OPTIONS=--max-old-space-size=8192` did not improve matters, nor did increasing it to 12 GB. Edit: this ended up being a simple fix, but annoying to dig out. The fix is `export NODE_OPTIONS=--max-old-space-size=8192`; setting it in my .rc file did not stick, either. macOS Ventura, for those keeping score at home.
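For anyone hitting the same out-of-memory failures, the working setup was simply exporting the variable in the shell that runs the build:

```shell
# Raise Node's old-generation heap limit to 8 GB for the build/test run.
# Must be exported in the invoking shell; an .rc entry did not stick for me.
export NODE_OPTIONS=--max-old-space-size=8192
```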

The bulk of my testing was performed using a sample stack definition (below), but I was unable to run the manual testing described in `aws-eks/test/MANUAL_TEST.md` because I had no access to the underlying node instances. Edit: I can run the MANUAL_TESTS now if that's deemed necessary.

Updated: This sample stack creates an IPv6-enabled cluster with an example nginx service running.

Sample:

```ts
import {
  App, Duration, Fn, Stack,
  aws_ec2 as ec2,
  aws_eks as eks,
  aws_iam as iam,
} from 'aws-cdk-lib';
import { getClusterVersionConfig } from './integ-tests-kubernetes-version';

const app = new App();
const env = { region: 'us-east-1', account: '' };
const stack = new Stack(app, 'my-v6-test-stack-1', { env });

const vpc = new ec2.Vpc(stack, 'Vpc', { maxAzs: 3, natGateways: 1, restrictDefaultSecurityGroup: false });
const ipv6cidr = new ec2.CfnVPCCidrBlock(stack, 'CIDR6', {
  vpcId: vpc.vpcId,
  amazonProvidedIpv6CidrBlock: true,
});

let subnetcount = 0;
let subnets = [...vpc.publicSubnets, ...vpc.privateSubnets];
for ( let subnet of subnets) {
  // Wait for the ipv6 cidr to complete
  subnet.node.addDependency(ipv6cidr);
  _associate_subnet_with_v6_cidr(subnetcount, subnet);
  subnetcount++;
}

const roles = _create_roles();

const cluster = new eks.Cluster(stack, 'Cluster', {
  ...getClusterVersionConfig(stack),
  vpc: vpc,
  clusterName: 'some-eks-cluster',
  defaultCapacity: 0,
  endpointAccess: eks.EndpointAccess.PUBLIC_AND_PRIVATE,
  ipFamily: eks.IpFamily.IP_V6,
  mastersRole: roles.masters,
  securityGroup: _create_eks_security_group(),
  vpcSubnets: [{ subnets: subnets }],
});

// add an extra nodegroup
cluster.addNodegroupCapacity('some-node-group', {
  instanceTypes: [new ec2.InstanceType('m5.large')],
  minSize: 1,
  nodeRole: roles.nodes,
});

cluster.kubectlSecurityGroup?.addEgressRule(
  ec2.Peer.anyIpv6(), ec2.Port.allTraffic(),
);

// deploy an nginx ingress in a namespace
const nginxNamespace = cluster.addManifest('nginx-namespace', {
  apiVersion: 'v1',
  kind: 'Namespace',
  metadata: {
    name: 'nginx',
  },
});

const nginxIngress = cluster.addHelmChart('nginx-ingress', {
  chart: 'nginx-ingress',
  repository: 'https://helm.nginx.com/stable',
  namespace: 'nginx',
  wait: true,
  createNamespace: false,
  timeout: Duration.minutes(5),
});

// make sure namespace is deployed before the chart
nginxIngress.node.addDependency(nginxNamespace);

function _associate_subnet_with_v6_cidr(count: number, subnet: ec2.ISubnet) {
  const cfnSubnet = subnet.node.defaultChild as ec2.CfnSubnet;
  // Carve the Amazon-provided IPv6 block into 256 /64 subnets
  // (128 - 64 = 64 subnet bits) and assign the count-th slice to this subnet.
  cfnSubnet.ipv6CidrBlock = Fn.select(count, Fn.cidr(Fn.select(0, vpc.vpcIpv6CidrBlocks), 256, (128 - 64).toString()));
  cfnSubnet.assignIpv6AddressOnCreation = true;
}

export function _create_eks_security_group(): ec2.SecurityGroup {
  let sg = new ec2.SecurityGroup(stack, 'eks-sg', {
    allowAllIpv6Outbound: true,
    allowAllOutbound: true,
    vpc,
  });
  sg.addIngressRule(
    ec2.Peer.ipv4('10.0.0.0/8'), ec2.Port.allTraffic(),
  );
  sg.addIngressRule(
    ec2.Peer.ipv6(Fn.select(0, vpc.vpcIpv6CidrBlocks)), ec2.Port.allTraffic(),
  );
  return sg;
}

export namespace Kubernetes {
  export interface RoleDescriptors {
    masters: iam.Role,
    nodes: iam.Role,
  }
}

function _create_roles(): Kubernetes.RoleDescriptors {
  const clusterAdminStatement = new iam.PolicyDocument({
    statements: [new iam.PolicyStatement({
      actions: [
        'eks:*',
        'iam:ListRoles',
      ],
      resources: ['*'],
    })],
  });

  const eksClusterAdminRole = new iam.Role(stack, 'AdminRole', {
    roleName: 'some-eks-master-admin',
    assumedBy: new iam.AccountRootPrincipal(),
    inlinePolicies: { clusterAdminStatement },
  });

  const assumeAnyRolePolicy = new iam.PolicyDocument({
    statements: [new iam.PolicyStatement({
      actions: [
        'sts:AssumeRole',
      ],
      resources: ['*'],
    })],
  });

  const ipv6Management = new iam.PolicyDocument({
    statements: [new iam.PolicyStatement({
      resources: ['arn:aws:ec2:*:*:network-interface/*'],
      actions: [
        'ec2:AssignIpv6Addresses',
        'ec2:UnassignIpv6Addresses',
      ],
    })],
  });

  const eksClusterNodeGroupRole = new iam.Role(stack, 'NodeGroupRole', {
    roleName: 'some-node-group-role',
    assumedBy: new iam.ServicePrincipal('ec2.amazonaws.com'),
    managedPolicies: [
      iam.ManagedPolicy.fromAwsManagedPolicyName('AmazonEKSWorkerNodePolicy'),
      iam.ManagedPolicy.fromAwsManagedPolicyName('AmazonEC2ContainerRegistryReadOnly'),
      iam.ManagedPolicy.fromAwsManagedPolicyName('AmazonEKS_CNI_Policy'),
      iam.ManagedPolicy.fromAwsManagedPolicyName('AmazonSSMManagedInstanceCore'),
      iam.ManagedPolicy.fromAwsManagedPolicyName('CloudWatchAgentServerPolicy'),
    ],
    inlinePolicies: {
      assumeAnyRolePolicy,
      ipv6Management,
    },
  });

  return { masters: eksClusterAdminRole, nodes: eksClusterNodeGroupRole };
}
```

## Issues

Edit: Fixed

Integration tests, specifically the new one I contributed, failed with an issue in describing a Fargate profile:

```
2023-06-01T16:24:30.127Z    6f9b8583-8440-4f13-a48f-28e09a261d40    INFO    {
    "describeFargateProfile": {
        "clusterName": "Cluster9EE0221C-f458e6dc5f544e9b9db928f6686c14d5",
        "fargateProfileName": "ClusterfargateprofiledefaultEF-1628f1c3e6ea41ebb3b0c224de5698b4"
    }
}
---------------------------
2023-06-01T16:24:30.138Z    6f9b8583-8440-4f13-a48f-28e09a261d40    INFO    {
    "describeFargateProfileError": {}
}
---------------------------
2023-06-01T16:24:30.139Z    6f9b8583-8440-4f13-a48f-28e09a261d40    ERROR    Invoke Error     {
    "errorType": "TypeError",
    "errorMessage": "getEksClient(...).describeFargateProfile is not a function",
    "stack": [
        "TypeError: getEksClient(...).describeFargateProfile is not a function",
        "    at Object.describeFargateProfile (/var/task/index.js:27:51)",
        "    at FargateProfileResourceHandler.queryStatus (/var/task/fargate.js:83:67)",
        "    at FargateProfileResourceHandler.isUpdateComplete (/var/task/fargate.js:49:35)",
        "    at FargateProfileResourceHandler.isCreateComplete (/var/task/fargate.js:46:21)",
        "    at FargateProfileResourceHandler.isComplete (/var/task/common.js:31:40)",
        "    at Runtime.isComplete [as handler] (/var/task/index.js:50:21)",
        "    at Runtime.handleOnceNonStreaming (/var/runtime/Runtime.js:74:25)"
    ]
}
```

I am uncertain whether this is an existing issue, one introduced by this change, or something related to my local build. Again, I had abundant issues building aws-cdk and the test suites, seemingly depending on Jupiter's position in the sky.

## Collaborators 
Most of the work in this change was performed by @wlami and @jagu-sayan (thank you!)

Fixes #18423

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*

github-actions bot commented Jun 8, 2023

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.
