Validate Agent and Pod are on the same Node #105
This would be great! I think validating, via the k8s API, that the requesting IP of the agent belongs to the node running the Pod it is asking for would make things much more secure. In my eyes, this is the major security benefit of the agent/server model. I guess checking that the Pod is running is a good start. But if a compromised agent means that I can get any credentials for anything that is running in the cluster, then it's not a lot better than giving the agents the ability to do …
This seems very closely related to, or maybe even the same as, one of my main concerns with kiam. I want a kiam server to not hand out role credentials to a kiam agent if the node that that kiam agent is running on doesn't have a pod that's annotated for that role. The more I turn this issue over in my head, the more I think it's the same as what I want, just worded slightly differently. I think I had in my head that the server would consider the role being asked for, and check the list of Pods on that agent's node to see if it should be allowed; but this issue is saying instead that the server would consider the Pod that is asking for the role through an agent, and verify that that agent and that Pod are on the same node. I think those are functionally the same? Does that sound the same to everyone else?

Will a kiam server not hand out role credentials unless a pod name is specified in the request? Or is just the name of the role sufficient for credentials to be returned? I'll be reviewing the source code over the next few days and expect I'll discover this myself soon.
It looks like, as long as you have a cert and can connect to the kiam GRPC service, the only two pieces of information you need are an IP and a role. I agree it would be cool if, when it looked up the pod for that IP, it also inspected the node somehow and correlated it with the incoming Peer.Addr in the context.
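For illustration, a minimal sketch (not kiam's actual code) of how a server could read that Peer.Addr out of the gRPC request context; the helper name `callerIP` is hypothetical:

```go
package sketch

import (
	"context"
	"fmt"
	"net"

	"google.golang.org/grpc/peer"
)

// callerIP extracts the remote IP of the agent that issued the gRPC request.
// This address is the raw material for any agent/pod node-correlation check.
func callerIP(ctx context.Context) (net.IP, error) {
	p, ok := peer.FromContext(ctx)
	if !ok {
		return nil, fmt.Errorf("no peer information in context")
	}
	addr, ok := p.Addr.(*net.TCPAddr)
	if !ok {
		return nil, fmt.Errorf("unexpected peer address type %T", p.Addr)
	}
	return addr.IP, nil
}
```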
This is important to my organization. We have some changes internally that implement it. However:
It turns out that if you run Calico networking, and you have Calico's IP-in-IP encapsulation turned on, then the GRPC peer information contains a Calico tunnel IP address, rather than the IP address of the kiam agent. Our Kubernetes cluster happens to be in one subnet, so we reduced Calico's IP-in-IP encapsulation setting to cross-subnet, which effectively disabled it, and that got our changes working. However, plainly this means that the approach of checking the GRPC peer information is not 100% reliable, and depends at least on your overlay network implementation. Given that, would you be willing to accept a PR which:
We would not commit to enhancing the feature any more than that.

### Other possible implementations

**kiam-agent verification plugin**

A possible option for moving forward would be to have a plugin framework for verifying the kiam-agent. I'm envisioning that the plugins would run as optional sidecar containers next to the server, and the server would webhook out to them via HTTP or gRPC. It would pass the GRPC peer information, and get back as the response a Node IP address corresponding to that information. If no plugin is specified, then the server assumes the GRPC peer IP should be used as-is. Calico's plugin implementation would look up the tunnel address in the Calico datastore and translate it into a Node address. If the verification is done based on GRPC peer information, then somehow translating this tunnel address would be the only way that one kiam deployment could serve a Kubernetes cluster with Calico networking that spans multiple subnets. (A rough sketch of this contract follows after this comment.)

**TLS-based implementation**

If each kiam agent had a unique TLS cert, then there might be a way to do this validation with TLS authorization instead of by looking at the GRPC peer information. In our deployment we currently have the agents all sharing one cert, and the servers all sharing another. You would have to ensure that nothing could get a cert spoofing an existing kiam-agent, though.
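A sketch of what the proposed verification-plugin contract might look like, assuming an HTTP sidecar as described above; `PeerVerifier`, `httpVerifier`, and the JSON field names are illustrative, not an existing kiam API:

```go
package sketch

import (
	"context"
	"encoding/json"
	"fmt"
	"net"
	"net/http"
	"strings"
)

// PeerVerifier maps the gRPC peer address of an agent to the address of the
// node that agent runs on. A Calico plugin would translate tunnel IPs here;
// a no-op implementation would return the peer IP unchanged.
type PeerVerifier interface {
	NodeAddress(ctx context.Context, peerIP net.IP) (net.IP, error)
}

// httpVerifier calls out to a sidecar plugin over HTTP, as the proposal describes.
type httpVerifier struct {
	endpoint string // e.g. "http://127.0.0.1:9000/resolve" (hypothetical)
}

func (v *httpVerifier) NodeAddress(ctx context.Context, peerIP net.IP) (net.IP, error) {
	body := strings.NewReader(fmt.Sprintf(`{"peerIP":%q}`, peerIP.String()))
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, v.endpoint, body)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("plugin returned status %d", resp.StatusCode)
	}
	var out struct {
		NodeIP string `json:"nodeIP"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, err
	}
	ip := net.ParseIP(out.NodeIP)
	if ip == nil {
		return nil, fmt.Errorf("plugin returned invalid node IP %q", out.NodeIP)
	}
	return ip, nil
}
```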
This is an improvement that reduces the ability to further exploit credentials when a node is compromised. I acknowledge that node compromise is already a significant event.

However, GRPC has Peer information available in the context, so this should be a relatively light lift since we already have the Pod information.

This will enable teams to add further defense in depth by controlling pod placement; with the k8s constraints this is not a declarative solution, but it is an improvement.
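A minimal sketch of the check itself, assuming the server has already resolved the Pod for the requesting IP via the Kubernetes API and derived the agent's node IP from the gRPC peer information; `sameNode` is a hypothetical helper, not existing kiam code:

```go
package sketch

import (
	"fmt"
	"net"

	v1 "k8s.io/api/core/v1"
)

// sameNode reports whether the pod requesting credentials is running on the
// node that the calling agent's address resolves to.
func sameNode(pod *v1.Pod, agentNodeIP net.IP) error {
	hostIP := net.ParseIP(pod.Status.HostIP)
	if hostIP == nil {
		return fmt.Errorf("pod %s/%s has no usable host IP", pod.Namespace, pod.Name)
	}
	if !hostIP.Equal(agentNodeIP) {
		return fmt.Errorf("pod %s/%s runs on %s, not on the agent's node %s",
			pod.Namespace, pod.Name, hostIP, agentNodeIP)
	}
	return nil
}
```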