feat(distribution): add network policies #302

Merged
merged 48 commits into from
Nov 25, 2024

sbruzzese902
Contributor

@sbruzzese902 sbruzzese902 commented Nov 15, 2024

Feature: add Network Policies

This PR adds support for Kubernetes Network Policies, for the OnPremise kind only.

A parameter spec.distribution.common.networkPoliciesEnabled has been added to enable the creation of Network Policies in core modules:

  • auth
  • ingress
  • logging
  • monitoring
  • opa
  • tracing

By default the new field's value is false, so you have to explicitly set it to true in order to enable it.
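
As a rough illustration of the default described above, the following Python sketch reads the flag from a parsed configuration and treats a missing value as false. The helper name `network_policies_enabled` and the trimmed `example_config` are hypothetical and only mirror the path `spec.distribution.common.networkPoliciesEnabled` named in this PR; this is not how furyctl itself reads the field.

```python
from collections.abc import Mapping
from typing import Any


def network_policies_enabled(config: Mapping[str, Any]) -> bool:
    """Return spec.distribution.common.networkPoliciesEnabled, defaulting to False."""
    node: Any = config
    # Walk down the nested path; any missing or malformed level means "disabled".
    for key in ("spec", "distribution", "common"):
        if not isinstance(node, Mapping):
            return False
        node = node.get(key, {})
    if not isinstance(node, Mapping):
        return False
    # Only an explicit boolean true enables the policies.
    return node.get("networkPoliciesEnabled", False) is True


# Hypothetical, heavily trimmed configuration: only the path from this PR is shown.
example_config = {
    "spec": {"distribution": {"common": {"networkPoliciesEnabled": True}}}
}
```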

Naming convention

For each module, we create:

  • a deny-all policy in its target namespace
  • an all-egress-kube-dns policy in the same namespace, to allow DNS resolution through kube-dns
  • dedicated policies for each component, with the name pattern <selected-component>-<direction>-<target-component>
  • the only exception to the previous rule is the monitoring policies, because we used the official upstream ones. In this case, each component has its own policy for both ingress and egress traffic.
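
A minimal sketch of what the first two policies in the list above could look like, written as Python dictionaries that follow the standard `networking.k8s.io/v1` NetworkPolicy schema. The manifests actually shipped by this PR are not shown in this conversation, so the selectors are assumptions: the DNS rule assumes kube-dns runs in `kube-system` with the label `k8s-app: kube-dns`, and the function names are hypothetical.

```python
def deny_all_policy(namespace: str) -> dict:
    """Select every pod in the namespace and allow no ingress or egress."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "deny-all", "namespace": namespace},
        "spec": {
            "podSelector": {},  # empty selector matches all pods in the namespace
            # Both directions are restricted and no rules are listed,
            # so no traffic is allowed by this policy.
            "policyTypes": ["Ingress", "Egress"],
        },
    }


def dns_egress_policy(namespace: str) -> dict:
    """Allow every pod in the namespace to reach kube-dns on port 53."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "all-egress-kube-dns", "namespace": namespace},
        "spec": {
            "podSelector": {},
            "policyTypes": ["Egress"],
            "egress": [
                {
                    # Assumption: kube-dns pods live in kube-system and carry
                    # the k8s-app=kube-dns label.
                    "to": [
                        {
                            "namespaceSelector": {
                                "matchLabels": {
                                    "kubernetes.io/metadata.name": "kube-system"
                                }
                            },
                            "podSelector": {"matchLabels": {"k8s-app": "kube-dns"}},
                        }
                    ],
                    "ports": [
                        {"protocol": "UDP", "port": 53},
                        {"protocol": "TCP", "port": 53},
                    ],
                }
            ],
        },
    }
```

Because network policies are additive, the DNS policy opens port 53 on top of the deny-all baseline rather than replacing it.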

Examples

To allow egress traffic from pomerium to grafana, the policy name is pomerium-egress-grafana.
To allow ingress traffic from fluentbit to fluentd, the policy name is fluentd-ingress-fluentbit.
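
The name pattern can be captured in a few lines of Python. This is only an illustrative helper (the function `policy_name` is hypothetical and not part of the distribution): the selected component is the one the policy applies to, and the target is the peer on the other side of the connection.

```python
def policy_name(selected: str, direction: str, target: str) -> str:
    """Build a name following <selected-component>-<direction>-<target-component>."""
    if direction not in ("ingress", "egress"):
        raise ValueError(f"direction must be 'ingress' or 'egress', got {direction!r}")
    return f"{selected}-{direction}-{target}"


# The two examples from the PR description:
egress_example = policy_name("pomerium", "egress", "grafana")
ingress_example = policy_name("fluentd", "ingress", "fluentbit")
```

Note that for ingress the selected component is the receiver: the policy sits on fluentd and admits traffic coming from fluentbit.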

Testing

We have tested these scenarios with fury distribution 1.29.4 and furyctl 0.29.10:

  • Ingress: Single | Dual | None

  • Monitoring: Prometheus | Mimir

  • Networking: Calico | Cilium

  • Logging: Loki | Opensearch (Single and Triple)

  • Auth: SSO | Basic Auth | None

  • OPA: Gatekeeper | Kyverno

  • Tracing

  • DR

We have also tested the following migrations:

| Nr | Module     | From                      | To                  |
|----|------------|---------------------------|---------------------|
| 1  | -          | Policies enabled          | Policies disabled   |
| 2  | Logging    | Type: Opensearch          | Type: Loki          |
| 3  | Logging    | Type: Loki                | Type: Opensearch    |
| 4  | Logging    | Backend: Minio-logging    | Backend: ! Minio    |
| 5  | Logging    | Type: Opensearch          | None                |
| 6  | Logging    | Type: Loki                | None                |
| 7  | OPA        | Type: Gatekeeper          | None                |
| 8  | OPA        | Type: Kyverno             | None                |
| 9  | Tracing    | Type: Tempo               | None                |
| 10 | Tracing    | Backend: Minio-tracing    | Backend: ! Minio    |
| 11 | Monitoring | Type: Prometheus          | None                |
| 12 | Monitoring | Type: Mimir               | None                |
| 13 | Monitoring | Backend: Minio-monitoring | Backend: ! Minio    |
| 14 | Ingress    | Single                    | None                |
| 15 | Ingress    | Dual                      | None                |
| 16 | Auth       | Type: SSO                 | Type: Basic Auth    |
| 17 | Auth       | Type: Basic Auth          | Type: SSO           |
| 18 | Auth       | Type: SSO                 | None                |
| 19 | Auth       | Type: Basic Auth          | None                |

To do after the new modules' versions are released

  • test again with new fury-distribution and furyctl versions
  • draw network policies diagram
  • change podSelector and config in policies jobs-egress-opensearch and opensearch-ingress-jobs to match the new labels, introduced in logging v4.0.0

Simone Bruzzese and others added 30 commits October 31, 2024 15:13
@ralgozino ralgozino changed the title Feat/add network policies feat(distribution): add network policies Nov 19, 2024
@sbruzzese902 sbruzzese902 marked this pull request as ready for review November 20, 2024 12:06
Member

@nutellinoit nutellinoit left a comment

Overall this PR is golden ❤️

I have some questions about some components/rules (note: I did not try to understand all of the network policies):

  1. Can the ingress controller do egress everywhere? (I would like to have no blockers here to use ingress on other namespaces (or outside of the cluster))
  2. Can cert-manager do egress everywhere? (In case of DNS challenge, for example)
  3. Can external-dns do egress everywhere? (same as above)
  4. Can loki do egress everywhere? (for example we are using an external S3 storage)
  5. Can fluentd do egress everywhere? (if we install a different flow output in the cluster, for example)
  6. Can alert manager do egress everywhere? (It needs to connect to arbitrary endpoints to send alerts)
  7. Can prometheus do egress everywhere? (to scrape, but also to send metrics to a different location)
  8. Can mimir do egress everywhere? (to save data on S3 for example)
  9. Can tempo connect egress everywhere and also accept data (ingress) from everywhere?

Thank you

EDIT:

  10. Can pomerium connect everywhere? For example, if we want to have additional SSO ingresses protected by pomerium.

@sbruzzese902
Contributor Author

  1. Yes
  2. cert-manager can do egress towards ports 53 TCP/UDP, 80 TCP and 443 TCP globally (we followed this)
  3. It seems we forgot external-dns 😢, let us fix it!
  4. This needs to be added. We only created the loki => minio rule.
  5. At the moment fluentd can only perform egress towards minio and either opensearch or loki.
  6. Yes
  7. Yes
  8. mimir can perform egress globally, if its backend is not minio.
  9. tempo backend is the same as mimir, but it accepts traffic only from grafana.
  10. pomerium can only communicate with the known services from the distro (e.g. forecastle, grafana...).
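
To make answer 2 concrete, here is a hedged Python sketch of an egress rule that is limited by port but not by destination. In the NetworkPolicy API, an egress rule that omits `to` applies to every destination, so only the port list restricts traffic. The policy name, namespace and pod labels below are hypothetical placeholders, not the values used in this PR.

```python
def cert_manager_egress_policy(namespace: str, pod_labels: dict) -> dict:
    """Allow egress to any destination on 53 TCP/UDP, 80 TCP and 443 TCP only."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        # Hypothetical name; the real policy name is not given in this conversation.
        "metadata": {"name": "cert-manager-egress-all", "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": pod_labels},
            "policyTypes": ["Egress"],
            "egress": [
                {
                    # No "to" key: the rule matches all destinations,
                    # inside or outside the cluster.
                    "ports": [
                        {"protocol": "UDP", "port": 53},
                        {"protocol": "TCP", "port": 53},
                        {"protocol": "TCP", "port": 80},
                        {"protocol": "TCP", "port": 443},
                    ]
                }
            ],
        },
    }


# Hypothetical namespace and label, for illustration only.
example_policy = cert_manager_egress_policy("cert-manager", {"app": "cert-manager"})
```

The same shape, with the `ports` list removed as well, would give the fully open egress discussed for components such as the ingress controller.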

We can be more permissive on the traffic, so we avoid breaking stuff or forcing the creation of many network policies.

@nutellinoit
Member

We can be more permissive on the traffic, so we avoid breaking stuff or forcing the creation of many network policies.

Yes please, we need a balance between security and ease of use.

@sbruzzese902
Contributor Author

sbruzzese902 commented Nov 21, 2024

Everything was changed as we agreed, including the diagrams 😄 Thanks @stefanoghinelli 💙

@nutellinoit nutellinoit self-requested a review November 25, 2024 15:33
Member

@nutellinoit nutellinoit left a comment

LGTM! 🚀

@nutellinoit nutellinoit merged commit 1b06571 into feat/release-v1.30.0 Nov 25, 2024
1 check passed
@ralgozino ralgozino deleted the feat/add-network-policies branch November 25, 2024 16:06
3 participants