Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kubebuilder scaffolding with HTTPS endpoint for metrics endpoint with custom certs doesn't work #4533

Closed
abhishekdwivedi3060 opened this issue Jan 30, 2025 · 5 comments · Fixed by #4536 or #4558
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.

Comments

@abhishekdwivedi3060
Copy link
Contributor

What broke? What's expected?

Broken:
Metrics server with HTTPS and cert-manager certs doesn't work

  1. config/prometheus/monitor_tls_patch.yaml path completely overwrites the ServiceMonitor endpoints field, resulting in omission of existing fields like path, scheme etc
  2. Prometheus is not able to validate the metric server certificate because it is connecting all underlying service pods over POD IP whereas certificate SANs have metrics service names. So it’s not able to validate the host authenticity. Its throwing error:
x509: cannot validate certificate for 10.20.5.28 because it doesn't contain any IP SANs

PFA screenshot for the same:

Image

Expected:

  1. config/prometheus/monitor_tls_patch.yaml patch should only update the tlsConfig field with the required certs, leaving all existing fields intact.
  2. Prometheus should be able to validate the metrics server certs and scrape the metrics successfully.

Reproducing this issue

No response

KubeBuilder (CLI) Version

4.4.0

PROJECT version

3

Plugin versions

go.kubebuilder.io/v4

Other versions

No response

Extra Labels

No response

@abhishekdwivedi3060 abhishekdwivedi3060 added the kind/bug Categorizes issue or PR as related to a bug. label Jan 30, 2025
@abhishekdwivedi3060
Copy link
Contributor Author

Workaround/fix for now (until this is fixed) :

  1. Manually add the old omitted fields in the ServiceMonitor file.
  2. Add serverName in the tlsConfig so that Prometheus can validate the metrics server cert SANs entries

The resultant ServiceMonitor file will look like this:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/name: operator
    control-plane: controller-manager
  name: controller-manager-metrics-monitor
spec:
  endpoints:
    - path: /metrics
      interval: 15s
      port: https
      scheme: https
      bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
      tlsConfig:
        serverName: operator-controller-manager-metrics-service.default.svc  # dnsNames from the metrics-certs Certificate
        ca:
          secret:
            key: ca.crt
            name: metrics-server-cert
        cert:
          secret:
            key: tls.crt
            name: metrics-server-cert
        insecureSkipVerify: false
        keySecret:
          key: tls.key
          name: metrics-server-cert
  selector:
    matchLabels:
      control-plane: controller-manager

@camilamacedo86 camilamacedo86 added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Feb 1, 2025
@camilamacedo86
Copy link
Member

Hi @abhishekdwivedi3060

To do this change we need change the template here:

const serviceMonitorPatchTemplate = `# Patch for Prometheus ServiceMonitor to enable secure TLS configuration
# using certificates managed by cert-manager
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: controller-manager-metrics-monitor
namespace: system
spec:
endpoints:
- tlsConfig:
insecureSkipVerify: false
ca:
secret:
name: metrics-server-cert
key: ca.crt
cert:
secret:
name: metrics-server-cert
key: tls.crt
keySecret:
name: metrics-server-cert

Then, run make generate to ensure that all samples under testdata and docs will be updated.

Would you like to push a PR with this fix?

Again, thank you a lot for raising this one. 🥇

@abhishekdwivedi3060
Copy link
Contributor Author

Hi @camilamacedo86 , yes will raise a PR in some time.

@camilamacedo86
Copy link
Member

For some reason, the tests were not executed, and the change was merged without properly passing them. Unfortunately, we need to revert the change as it does not work well and is breaking the test data: link to PR.

@camilamacedo86
Copy link
Member

Hi @abhishekdwivedi3060,

It was my mistake—I overlooked it during the review. Also, we were unlucky because, for some reason, the tests were skipped when they shouldn't have been, and the issue wasn't caught.

I've now added your changes and credited you for the fix here: https://github.com/kubernetes-sigs/kubebuilder/pull/4558/files.

The missing pieces were:

Fixing the uncommented lines in the GitHub Action that runs the E2E tests for the samples.
Applying the same fix to Helm Charts as well.
Thanks again for catching this and contributing to the fix! 🙌

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
Projects
None yet
2 participants