[EKS] Managed worker nodes #139
Comments
As a quick note, can we make sure that this will interact/play nicely with cluster-autoscaler? If we can get managed, autoscaling worker nodes, this would be amazing. |
Along with draining nodes in an upgrade situation. |
Shouldn't this one be solved as part of implementing Fargate for EKS? #32 |
You could use Virtual Kubelet |
Fargate for EKS is a different thing @dnobre; with Fargate there are no worker nodes to manage, so this issue about managing worker nodes is not relevant to Fargate. @tabern being able to support cluster-autoscaler is the key thing here. If you provide your own autoscaling instead, it has to be aware of the cluster workload and all ASGs; you'd need a k8s service or DaemonSet to provide custom metrics to the ASGs, and some way, when there are many ASGs, to choose which one to scale up/down next, as cluster-autoscaler does. |
|
@whereisaaron interesting, that's exactly what I would consider Fargate for EKS to be. Since it was never released 2 years ago, its implementation is pretty hypothetical, but theoretically I would expect an endpoint for your kubeconfig that you could deploy against. Granted, the fact that "Fargate for EKS" was never released means we are all just spitballing here. |
With Fargate, whether ECS Fargate or EKS Fargate, there are no worker nodes. That's why you use a Fargate solution: so you do not have to manage worker nodes. So this issue has no overlap with a Fargate product. @cdenneen not sure I understand, but what you describe sounds correct, just like the EKS endpoint, except with no (real) worker nodes, just something like a Virtual Kubelet. |
Any updates on this issue? |
@groodt coming soon.... we'll be sure to update when there are updates to share! |
Will this feature add the capability to create worker node groups from the AWS console (UI)? |
@ejlp12 yes. |
@tabern will there be an option to add a userdata script or otherwise modify the instances? |
I am curious about logging aggregation as well for managed workers. Any details on how we can aggregate logs as part of this feature? |
@lilley2412 not at launch, but we plan to add this in the future. @pfremm yes. You'll be able to use EC2 Autoscaling for reporting group-level metrics. Since managed nodes are standard EC2 instances that run in your account, you will be able to implement any log forwarding/aggregation tooling that you are using today, such as FluentBit/S3 and Fluentd/CloudWatch. |
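For illustration of the log-forwarding approach mentioned above, here is a minimal sketch of a Fluent Bit DaemonSet that tails node logs. The namespace, image tag, labels, and service account are placeholder assumptions, and a real setup would also need an output plugin configured for S3 or CloudWatch:

```yaml
# Minimal Fluent Bit DaemonSet sketch: one log-forwarder pod per node,
# mounting the node's /var/log so container and kubelet logs can be tailed.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit            # placeholder name
  namespace: logging          # assumed namespace
spec:
  selector:
    matchLabels:
      app: fluent-bit
  template:
    metadata:
      labels:
        app: fluent-bit
    spec:
      serviceAccountName: fluent-bit     # assumed service account with the needed IAM/RBAC
      containers:
        - name: fluent-bit
          image: fluent/fluent-bit:1.3   # assumed image/tag; pin whatever you run today
          volumeMounts:
            - name: varlog
              mountPath: /var/log
              readOnly: true
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
```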
@tabern will this support windows worker nodes? |
Who manages security patches or addresses CVEs on these managed worker nodes? Will this still fall under the "Security in the Cloud" customer responsibility? |
Released GA 11/18 👍 |
Can we have a link to the docs? |
Hi! The documentation is deploying now. It should be available shortly, and I'll update with a link here when it is. |
We're excited to announce that Amazon EKS Managed Node Groups are now generally available! With Amazon EKS managed node groups you don’t need to separately provision or connect the EC2 instances that provide compute capacity to run your Kubernetes applications. You can create, update, or terminate nodes for your cluster with a single command. Nodes run using the latest EKS-optimized AMIs in your AWS account while node updates and terminations gracefully drain nodes to ensure your applications stay available. Today, EKS managed node groups are available for new Amazon EKS clusters running Kubernetes version 1.14 with platform version eks.3. You can also update clusters (1.13 or lower) to version 1.14 to take advantage of this feature. Support for existing version 1.14 clusters is coming soon. Learn more |
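As a reference for what creating a managed node group can look like declaratively, here is a minimal sketch in the eksctl config-file style (the cluster name, region, instance type, and sizes are placeholders, and it assumes an eksctl version that supports the managedNodeGroups section):

```yaml
# cluster.yaml: a new cluster with one managed node group (all values are placeholders).
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-cluster
  region: us-west-2
managedNodeGroups:
  - name: managed-ng-1
    instanceType: m5.large
    minSize: 2
    desiredCapacity: 3
    maxSize: 5
```

This would be applied with `eksctl create cluster -f cluster.yaml`.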
@tabern congrats on the release! |
Does something need to be done to enable this on existing clusters (latest EKS 1.14)? Also, as @nxf5025 mentions, it doesn't look like there is any ability to pass in userdata or kubelet flags? And will there be support for spot instances? |
Thanks all! We're pretty excited to introduce this new feature.

@robgott @pc-rshetty CloudFormation support for managed node groups is there today; it's just that the documentation is taking a bit longer to publish than we had originally expected. Specifically, EKS managed node groups introduce a new resource type "AWS::EKS::Nodegroup" and an update to the existing resource type "AWS::EKS::Cluster" to add ClusterSecurityGroupId in CloudFormation. The documentation updates for these changes will be published by 11/21.

@pc-rshetty Cluster Autoscaler should continue to work just like it does today. The biggest change from our end is that we tag every node for auto-discovery by Cluster Autoscaler. Overprovisioner should work; it seems like a Helm chart that basically implements the method described here?

@nxf5025 @MarcusNoble today you cannot pass this to managed node groups. However! We're planning to add this in the future as part of support for EC2 Launch Templates (#585). Yes, we also will be working on Spot support, tracked in #583. The other feature we're currently tracking on the roadmap is Windows support (#584), but feel free to add more if there are important features you think we should be looking at. |
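To make the new resource type concrete, below is a minimal sketch of an AWS::EKS::Nodegroup CloudFormation resource. The cluster name, role ARN, subnet ID, and sizes are placeholders, and the exact property set should be confirmed against the CloudFormation documentation linked later in this thread:

```yaml
Resources:
  ManagedNodeGroup:
    Type: AWS::EKS::Nodegroup
    Properties:
      ClusterName: my-cluster                               # placeholder: existing 1.14 cluster
      NodegroupName: managed-ng-1                           # placeholder
      NodeRole: arn:aws:iam::111122223333:role/eksNodeRole  # placeholder node instance role ARN
      Subnets:
        - subnet-0123456789abcdef0                          # placeholder subnet
      InstanceTypes:
        - m5.large
      AmiType: AL2_x86_64
      ScalingConfig:
        MinSize: 2
        DesiredSize: 3
        MaxSize: 5
```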
Are managed Ubuntu node groups also being worked on, or should that be added to the roadmap? That was mentioned in the blog post when comparing the EKS API with eksctl; it's a feature we need. |
In addition to spot instances, being able to utilise a mixed instances policy, as per kubernetes/autoscaler#1886, i.e. t3.large and t3a.large, or m5.large and m5d.large, etc. This is to increase the probability of a successful instance fulfilment. We are currently using this functionality to good effect and would need to have the same ability with managed worker nodes, along with the ability to specify userdata. In the UI, this would simply be represented by being able to select multiple instance types and preferably being able to sort them in order of preference. This is how launch template mixed instances policy and overrides currently work: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-autoscaling-autoscalinggroup-launchtemplateoverrides.html and https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-autoscaling-autoscalinggroup-launchtemplate.html |
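For context, this is roughly what the linked docs describe for self-managed node groups today: an Auto Scaling group with a mixed instances policy and launch template overrides listed in order of preference. The launch template ID, subnet, sizes, and allocation strategy below are placeholder assumptions:

```yaml
Resources:
  WorkerNodeGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      MinSize: "2"
      MaxSize: "6"
      VPCZoneIdentifier:
        - subnet-0123456789abcdef0                   # placeholder subnet
      MixedInstancesPolicy:
        LaunchTemplate:
          LaunchTemplateSpecification:
            LaunchTemplateId: lt-0123456789abcdef0   # placeholder launch template
            Version: "1"
          Overrides:                                 # instance types in order of preference
            - InstanceType: t3.large
            - InstanceType: t3a.large
        InstancesDistribution:
          SpotAllocationStrategy: capacity-optimized # assumed strategy
```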
Just one question: can this feature utilise spot instances? I could not find it in the documentation.
|
CloudFormation documentation for EKS Managed Node Groups is now published: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-eks-nodegroup.html |
@tabern Doesn't look like docs have been updated still. That link redirects me to: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/Welcome.html |
Interesting. The link worked when I clicked it 12 hours ago... |
It's working for me now! 👍 |
@tabern How are rolling updates supposed to work? Draining nodes actually works, but apparently leads to downtime. Let's say I have an existing node group and want to rotate the nodes. To do this (manually), I would replace the node group by creating a new one, waiting for it to become available, and then deleting the old one afterwards. When doing this, I can see that the nodes get drained before the instances get terminated. However, the running pods are more or less terminated simultaneously, which leads to downtime. In Terraform, the mechanism is basically the same, leading to the same result. Am I doing something wrong? |
@splieth look into pod disruption budgets; that's what you need to avoid all pods terminating at once. |
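A minimal example of that suggestion: a PodDisruptionBudget that keeps at least one replica available while nodes are drained. The names and labels are placeholders, and on Kubernetes 1.14 the API group is policy/v1beta1:

```yaml
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb          # placeholder name
spec:
  minAvailable: 1           # keep at least one pod running during voluntary disruptions
  selector:
    matchLabels:
      app: my-app           # placeholder: must match your Deployment's pod labels
```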
When will we see support for existing 1.14 clusters? My clusters are currently stuck on platform version |
@groodt I don't think there's much hope there. You could just create a new cluster and move your workloads there. |
I don't see why it couldn't support existing clusters. A little more involved maybe, but a cluster can have multiple ASGs associated with it, so the new managed nodes could be brought up alongside the existing self-managed ones, and then the self-managed ones removed once the new nodes are stable. |
This isn't a new Kubernetes version. Presumably it's some additional process running in the control plane that is aware of the ASGs, and that's it. I saw this in the original announcement:
So presumably they do plan to upgrade existing clusters, I'm curious on the timelines. If it's too long, sure I can create a new cluster and migrate workloads easily enough, but it's still annoying to do without downtime. |
My 1.14 clusters are still stuck in platform version |
FWIW, my clusters weren’t updating either, but when I updated all my workers to a newer AMI ahead of the control plane version, all my control planes updated within 48 hours. Coincidence? |
Been trying out managed worker nodes, and unless I am missing something, I have no ability to see kubelet-related logs unless I provision with an SSH key? |
@pfremm I didn't find another method apart from deploying the SSM agent as a DaemonSet and accessing the logs via SSM rather than SSH. But IMHO that's the better option, since the SSH key doesn't need to be shared. |
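A rough sketch of the DaemonSet shape being described; the image, namespace, and mounts are assumptions, and a working setup additionally needs IAM permissions (or a hybrid activation) so the agent can actually register with SSM:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: ssm-agent
  namespace: kube-system      # assumed namespace
spec:
  selector:
    matchLabels:
      app: ssm-agent
  template:
    metadata:
      labels:
        app: ssm-agent
    spec:
      hostNetwork: true                           # run in the node's network namespace
      containers:
        - name: ssm-agent
          image: amazon/amazon-ssm-agent:latest   # assumed image; pin a version in practice
          securityContext:
            privileged: true                      # assumed: the agent needs broad host access
          volumeMounts:
            - name: varlog
              mountPath: /var/log                 # expose node logs (e.g. kubelet) to the session
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
```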
@pfremm I suggest you set up Container Insights to ship logs and metrics into CloudWatch Logs; that worked great for me. All logs from pods, the kubelet, and kube-proxy are then shipped and viewable in CloudWatch. You can then ship them further into Elasticsearch as well, so that's also an option if you don't like CloudWatch queries. |
Is there any update on this? I couldn't find any reference in https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-eks-nodegroup.html. This is quite critical for setting up nodes for different purposes. |
Out of curiosity, does the current managed node group setup specify any kind of flags for |
Is there any support for setting taints on a nodegroup? |
I am not able to scale down the managed node group. I had deployed a 2-node cluster (managed node group) and was able to scale the number of nodes from 2 to 3, but scaling down doesn't work. I tried from the AWS management console and a CF template, but no luck. On scaling down, the node does go into the SchedulingDisabled state, but the workload is not evicted from that node and eventually the node becomes Ready again. For scaling up or down, I am only updating the scaling config of the managed node group. NAME STATUS ROLES AGE VERSION |
Managed Kubernetes worker nodes will allow you to provision, scale, and update groups of EC2 worker nodes through EKS.
This feature fulfills #57
EKS Managed Node Groups are now GA!