Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alternatives approaches to supporting ExternalName services #3950

Closed
dprotaso opened this issue Aug 12, 2021 · 14 comments
Closed

Alternatives approaches to supporting ExternalName services #3950

dprotaso opened this issue Aug 12, 2021 · 14 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. knative lifecycle/needs-triage Indicates that an issue needs to be triaged by a project contributor. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@dprotaso
Copy link

dprotaso commented Aug 12, 2021

Problem

The enableExternalNameService config being defaulted to false is a breaking change for users

Now that there is a PR mitigating envoy admin console access - #3934 we should consider alternatives/modifications to that flag.

# ExternalName Services are disabled by default due to CVE-2021-XXXXX
# You can re-enable them by setting this setting to `true`.
# This is not recommended without understanding the security implications.
# Please see the advisory at https://github.com/projectcontour/contour/security/advisories/GHSA-5ph6-qq5x-7jwc for the details.
# enableExternalNameService: false

Potential Alternatives

1. Remove the flag

By removing the flag we would defer the responsibility of who can create ExternalName services to a separate system - ie. https://www.openpolicyagent.org.

This would fall on cluster operators to create policies granting privileges to system components (ie. Knative) & users

2. Allow list

An allow list would allow cluster operators to specify which ExternalName entries are to be included in Contour's routing tier.

I think this list would need to be host/port pairs - ie. Knative uses ExternalName to point to the envoy k8s service of a contour installation. We would only want routing to happen on port 80 or 443 and not any metrics/admin ports etc.

3. EndpointSlices with CNAME Address Type

see: #3950 (comment)

4. Other

There's probably other approaches but I can't think of anything - is there any discussion re: gateway APIs around this?

@dprotaso dprotaso added kind/bug Categorizes issue or PR as related to a bug. lifecycle/needs-triage Indicates that an issue needs to be triaged by a project contributor. labels Aug 12, 2021
@stevesloka
Copy link
Member

  1. I think the first approach is interesting that allowing "system" accounts to have permissions to use the feature, but disallowing users to utilize them. It would move the enforcement to the cluster and out of Contour.

  2. Does this list need to be updated dynamically? We're looking to move the config to a CRD, but that's not there just yet. If it's dynamic, we'll need a way to know that list got updated. Alternatively, could use a regex to setup a range of valid entries for the allowlist.

  3. I'm not sure that GatewayAPI solves this in any way. There are ways to extend gatewayapi that are implementation specific. That's what I was mentioning the internal meeting earlier. Contour knows about all services in the cluster, but doesn't allow for cross-namespace references. However, doing something in this path ends up being specific to Contour and not transferable across other providers.

@tsaarni
Copy link
Member

tsaarni commented Aug 13, 2021

  1. Remove the flag

I guess defaults can always be discussed (including usual "secure by default" vs "backwards compatible" arguments) but I think it would be important to keep secure option without requiring additional software to avoid use of insecure parts.

  1. Allow list

From security point-of-view this could be good: it would avoid users from being in full control of routing destinations - assuming the user cannot also control which address the DNS resolves the destination hostname to.

  1. Other

Could Envoy RBAC filter be used to block the admin API? Though we still are left with the impacts of more general Kubernetes CVE-2021-25740 vulnerability.

@youngnick
Copy link
Member

I think it's important to remember that the reason that we've added this flag and set it to false by default is not only about the Contour CVE (although that was pretty bad), it's also about the fact that you can use ExternalName services to render anything anyone does with External Auth or Rate limiting meaningless, and this is outside of the control of the Contour owner.

Let me explain.

If I have a Service secretsquirrel in the namespace secretsquirrel, that I expose with a HTTPProxy that enables External Auth, Rate Limiting, and any other access-control features we add, having ExternalName Services enabled means that, if any user of the cluster knows that the secretsquirrel Service exists, they can expose it on an unauthenticated, un-rate-limited Ingress by creating an ExternalName Service that points to secretsquirrel.secretsquirrel.svc.cluster.local.

This totally breaks Contour's aim to be a secure-by-default solution, so when we realised that ExternalName Services could be used this way, we had no choice to add this flag and default processing of ExternalName services to off. Contour must be as secure as we can make it by default, without needing to install any extra software. The flag also functions as an "I accept the risks" checkbox.

I know that this breaks Knative, and I'm sorry, but I can't knowingly ship an insecure Contour to the non-Knative users. I understand that Knative works in a different way and isn't really exposing the functionality to users, so the security context of using ExternalName is a bit different, but we gotta look after everyone.

In regards to the specific asks:

  • I'm not prepared to remove the flag totally, or to change the default value. It's just too risky for unaware users. We can expose it via command line flag and env var as well as config file, if that will make things easier, but I'm unconvinced about removing it. I think allowing ExternalName services for any Ingress is a huge security hole.
  • I'm one hundred percent supportive of having an allow list instead of just a boolean flag, since that's even safer than just allowing open slather on DNS names.
  • We haven't discussed ExternalName services explicitly in Gateway API yet, but, since the CVE-2021-25740: Endpoint & EndpointSlice permissions allow cross-Namespace forwarding kubernetes/kubernetes#103675 was found in the course of us designing cross-namespace things for Gateway API, I think we definitely will address it, although I don't really see any other option than saying "don't use ExternalNames".

@xaleeks
Copy link

xaleeks commented Aug 16, 2021

I agree with Nick that as long as ExternalName has this kind of behavior, we should ship with the capability turned off by default to protect users unaware of this vulnerability. It is actually recommended by CNCF if I remember correctly (I will try to dig up the exact phrasing). That should be the default state, realized through whatever means and then we should deliver a way to adjust this through a flag or allowlist for the Knative team.

The reason solution 1 might not be the best is because it seems to assume well established teams with clear separation of duties and knowledge of this issue with ExternalName.

Since we are moving config to a new CR soon, maybe we can expose the allowlist tuning in the CR.

@youngnick
Copy link
Member

Just to clarify, @xaleeks, you mean "since we are moving to a new CR for configuration soon", right? HTTPProxy is staying around for a long time.

@xaleeks
Copy link

xaleeks commented Aug 17, 2021

Yup, we are moving to using a CRD for representing the Contour config and the underlying Envoy so that this representation can be validated and reconciled by k8s. HTTPProxy will be around for a long time indeed.

Have we shared this design with the community yet? If not, we should socialize the design from Steve Sloka

@youngnick
Copy link
Member

I don't think we have shared that design yet, no.

@dprotaso
Copy link
Author

dprotaso commented Aug 24, 2021

Another alternative I discovered is that EndpointSlices support CNAME address types.

They can RBAC'd separately from K8s Services and there's guidance upstream about limiting write access.

If Knative, instead of using Service.ExternalName or managing IP endpoints, created an EndpointSlice with a CNAME record we wouldn't need work around 1 (remove the flag) or 2 (allow list)

Related: #2696

@youngnick
Copy link
Member

youngnick commented Aug 30, 2021

That sounds like a good solution that will end up more secure for everyone @dprotaso. I guess the action for Contour there is to get that EndpointSlice support built?

@dprotaso
Copy link
Author

I guess the action for Contour there is to get that EndpointSlice support built?

Yup. Has there been any progress?

@dprotaso
Copy link
Author

Knative is still on v1.18.x - is there anything new in v1.19 supporting this?

@youngnick
Copy link
Member

There hasn't been any further progress on this yet, sorry.

@github-actions
Copy link

The Contour project currently lacks enough contributors to adequately respond to all Issues.

This bot triages Issues according to the following rules:

  • After 60d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, the Issue is closed

You can:

  • Mark this Issue as fresh by commenting
  • Close this Issue
  • Offer to help out with triage

Please send feedback to the #contour channel in the Kubernetes Slack

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 12, 2023
@github-actions
Copy link

The Contour project currently lacks enough contributors to adequately respond to all Issues.

This bot triages Issues according to the following rules:

  • After 60d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, the Issue is closed

You can:

  • Mark this Issue as fresh by commenting
  • Close this Issue
  • Offer to help out with triage

Please send feedback to the #contour channel in the Kubernetes Slack

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Jun 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. knative lifecycle/needs-triage Indicates that an issue needs to be triaged by a project contributor. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.
Projects
None yet
Development

No branches or pull requests

5 participants