
Issue reconciling node routes with self hosted canal #1336

Closed

cknowles opened this issue May 27, 2018 · 7 comments

Comments

@cknowles
Contributor

cknowles commented May 27, 2018

I've been trying out self-hosted Canal, enabled by default as of version 9c735f9, on k8s v1.9.7. I'm now getting these logs from the controller manager, although the cluster itself still appears to be working.

kube-controller-manager-ip-10-0-22-230.eu-west-1.compute.internal kube-controller-manager E0527 10:17:37.615993 1 route_controller.go:116] Couldn't reconcile node routes: error listing routes: found multiple matching AWS route tables for AWS cluster: kubernetes

The only changes between this cluster and the previous one are switching self hosting on, plus the separate etcd stack that comes with the latest version. It's an entirely new cluster, and I've deleted the previous version of the cluster.

We do have another cluster in the same VPC, but this has never been a problem before.

I had a look through every route table and subnet, and all seem tagged appropriately. While looking for potentially related issues I found kubernetes/kubernetes#12449 (comment).
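For reference, my understanding is that the legacy in-tree AWS cloud provider discovers the cluster's route table via the KubernetesCluster tag, and this error is raised when more than one route table matches that tag value. A minimal check with the AWS CLI, assuming that tagging convention (the region is taken from the node name in the log above, and the tag value "kubernetes" is the cluster name from the error):

# List every route table tagged with the cluster name "kubernetes";
# the route controller errors out if this returns more than one table.
aws ec2 describe-route-tables \
  --region eu-west-1 \
  --filters "Name=tag:KubernetesCluster,Values=kubernetes" \
  --query "RouteTables[].RouteTableId"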

Anyone else having similar issues?

@cknowles
Contributor Author

Just trying out the --cluster-name flag for the controller manager; it seems we've never set it in kube-aws.
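For context, --cluster-name is what the AWS cloud provider uses to scope its resource lookups, and it defaults to "kubernetes" when unset, which matches the cluster name in the error above. A minimal sketch of setting it (the value "dev-cluster" is hypothetical):

# Excerpt of a kube-controller-manager invocation; without --cluster-name,
# every cluster in the account scopes lookups to the default "kubernetes".
kube-controller-manager \
  --cloud-provider=aws \
  --cluster-name=dev-cluster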

@cknowles
Contributor Author

@davidmccormick, I wondered if you had any ideas about this, or whether you're seeing anything similar in your clusters? Oddly, the cluster still works, but I'm concerned that running another cluster in the same VPC will break them. The --cluster-name flag above did not fix it. Our other cluster in the same VPC appears to be working fine and has a similar setup; the differences are k8s 1.7.x versus 1.9.x and enabling the self-hosted network. I'm not sure whether the --allocate-node-cidrs and --cluster-cidr flags of the controller manager could affect this; I'm still digging into the controller manager code.

@davidmccormick
Contributor

Hi, sorry for the slow reply! Yes, I see these messages on my clusters too, have seen them before, and thought them benign. What I think is happening is that the AWS cloud plugin is trying to update the routing tables, which is only required for the AWS native networking backend. We are not using that backend because we are using flannel (the AWS backend is limited by the number of routing table entries). Previously there was no way to turn this off; I don't know if that has now changed in the AWS plugin. I'm out of the office this week but happy to have more of a look next week.

@cknowles
Contributor Author

cknowles commented May 28, 2018

@davidmccormick no worries, thanks for the confirmation. I was also thinking it might be benign, given that the cluster still appears to work. I am wondering, though, why flannel on the 1.7.3 cluster in the same VPC works without these errors; kube-aws generates the subnets and route tables in both cases. I checked upstream and the controller manager code appears to be the same. So the best I can determine right now is that this was introduced when I switched on the self-hosted network; I just can't see why that would be.

@davidmccormick
Contributor

Without looking at the code, I would guess, as you did, that the change is that the controller manager is now being asked to allocate node CIDRs, which it wasn't doing under the legacy flannel install.

@cknowles
Contributor Author

It seems the controller manager flag --configure-cloud-routes and kubernetes/kubernetes#25602 are relevant. I just set this flag to false on the dev cluster in question; the message has disappeared and the cluster appears to be fully functional still. My understanding is not deep enough to say whether we should be setting that flag to false for the self-hosted network or not.
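For reference, --configure-cloud-routes defaults to true, and the route controller runs when it and --allocate-node-cidrs are both set, which would explain why the error only appeared once node CIDR allocation was switched on. A sketch of the flag combination described above (the CIDR value is hypothetical):

# Node CIDR allocation stays on for Canal/flannel, but the controller
# manager no longer tries to reconcile VPC route tables.
kube-controller-manager \
  --cloud-provider=aws \
  --allocate-node-cidrs=true \
  --cluster-cidr=10.2.0.0/16 \
  --configure-cloud-routes=false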

@davidmccormick
Contributor

davidmccormick commented May 28, 2018 via email

cknowles pushed a commit to cknowles/kube-aws that referenced this issue May 28, 2018
Fixes kubernetes-retired#1336 by adding `--configure-cloud-routes=false` when `--allocate-node-cidrs=true`

Added `--cluster-name` to ensure it's set correctly on controller manager.

Grouped mandatory and optional flags together.