
I should be able to serve metrics over https #396

Closed
shawn-hurley opened this issue Aug 10, 2018 · 20 comments
Labels
lifecycle/frozen: Indicates that an issue or PR should not be auto-closed due to staleness.
priority/important-longterm: Important over the long term, but may not be staffed and/or may need multiple releases to complete.
triage/needs-information: Indicates an issue needs more information in order to work on it.
Milestone
v1.0.0

Comments

@shawn-hurley
Member

Currently ExposeMetricsPort() does not allow us to serve metrics over HTTPS.

I propose that we add an options struct. This would let a user specify the certs (maybe even have them generated for the user?); if nothing is set, the function behaves as it does today.

This would be a breaking change to the function. We could instead introduce a new function, but my vote would be to break the API and have it accept an options struct.
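A rough sketch of what that could look like, assuming Go's net/http and the Prometheus client library; the names MetricsOptions and ServeMetrics are hypothetical (not the SDK's actual API), and the plain-HTTP fallback preserves today's behavior.

// Hypothetical sketch only; not the SDK's real ExposeMetricsPort signature.
package metrics

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// MetricsOptions configures how the metrics endpoint is exposed.
type MetricsOptions struct {
	Addr     string // listen address, e.g. ":8383"
	CertFile string // path to a TLS certificate; empty means plain HTTP
	KeyFile  string // path to the matching TLS private key
}

// ServeMetrics serves /metrics over HTTPS when a cert/key pair is provided,
// and falls back to plain HTTP otherwise (the current behavior).
func ServeMetrics(opts MetricsOptions) error {
	mux := http.NewServeMux()
	mux.Handle("/metrics", promhttp.Handler())
	if opts.CertFile != "" && opts.KeyFile != "" {
		return http.ListenAndServeTLS(opts.Addr, opts.CertFile, opts.KeyFile, mux)
	}
	return http.ListenAndServe(opts.Addr, mux)
}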

@lilic lilic added the metrics label Nov 16, 2018
@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot openshift-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 9, 2019
@openshift-bot

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot openshift-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jun 8, 2019
@openshift-bot

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@openshift-ci-robot

@openshift-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@hasbro17
Contributor

hasbro17 commented Jul 9, 2019

@lilic Do we need to reopen this? Don't think we have this yet.

@lilic
Member

lilic commented Jul 9, 2019

I am working on this right now.

/reopen

@openshift-ci-robot

@lilic: Reopened this issue.

In response to this:

I am working on this right now.

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@lilic
Member

lilic commented Jul 9, 2019

/remove-lifecycle rotten

@openshift-ci-robot openshift-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jul 9, 2019
@lilic
Member

lilic commented Jul 10, 2019

TLS metrics for operators deployed in OpenShift 4.x

The following is based on how kube-state-metrics secures its metrics endpoint with TLS in OpenShift. For Kubernetes we will add documentation describing how to secure the metrics endpoints, because there is no native solution for this in Kubernetes itself. kubebuilder suggests using kube-rbac-proxy and cert-manager for this; it does not enable them by default, so you have to opt in. We can provide docs and example manifests showing how to do this, and users can follow that.

TLS in OpenShift 4.x

We need to add the annotation to the Service object before the operator is deployed, as we reference the created Secret in the deployment itself, so all of these resources need to be created before the operator starts. The Service and ServiceMonitor objects will therefore be created from manifest files instead of programmatically as we do right now; this applies only to OpenShift for now. We need a new subcommand, operator-sdk generate openshift, which could also be used to generate other OpenShift-specific things later on. For now it would generate the deploy-openshift directory (a rough sketch of such a subcommand follows the directory listing below).

The content of the deploy-openshift directory:

crds/  operator.yaml  role.yaml  role_binding.yaml  service.yaml  service-monitor.yaml  service_account.yaml
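A minimal sketch, assuming spf13/cobra, of how the proposed operator-sdk generate openshift subcommand could scaffold that directory; the command wiring, flag name, and placeholder file contents are illustrative only, not an implemented SDK command.

// Hypothetical sketch of an `operator-sdk generate openshift` subcommand
// (assumes spf13/cobra; not the SDK's real command).
package generate

import (
	"fmt"
	"os"
	"path/filepath"

	"github.com/spf13/cobra"
)

// NewOpenShiftCmd scaffolds the deploy-openshift directory described above.
func NewOpenShiftCmd() *cobra.Command {
	var outputDir string
	cmd := &cobra.Command{
		Use:   "openshift",
		Short: "Generate OpenShift-specific manifests (Service, ServiceMonitor, kube-rbac-proxy sidecars)",
		RunE: func(cmd *cobra.Command, args []string) error {
			if err := os.MkdirAll(filepath.Join(outputDir, "crds"), 0o755); err != nil {
				return err
			}
			// A real implementation would render these from templates; here we
			// only lay out the files listed above as placeholders.
			files := []string{
				"operator.yaml", "service.yaml", "service-monitor.yaml",
				"role.yaml", "role_binding.yaml", "service_account.yaml",
			}
			for _, f := range files {
				path := filepath.Join(outputDir, f)
				if err := os.WriteFile(path, []byte("# TODO: rendered manifest\n"), 0o644); err != nil {
					return err
				}
				fmt.Println("wrote", path)
			}
			return nil
		},
	}
	cmd.Flags().StringVar(&outputDir, "output-dir", "deploy-openshift", "directory for the generated manifests")
	return cmd
}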

Content of operator.yaml - deployment manifest file:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: memcached-operator
  labels:
    name: memcached-operator
spec:
  replicas: 1
  selector:
    matchLabels:
      name: memcached-operator
  template:
    metadata:
      labels:
        name: memcached-operator
    spec:
      serviceAccountName: memcached-operator
      volumes:
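      # Serving-cert Secret created via the service.alpha.openshift.io/serving-cert-secret-name annotation on the Service (see service.yaml below)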
      - name: memcached-operator
        secret:
          secretName: memcached-operator
      - emptyDir: {}
        name: volume-directive-shadow
      containers:
        - name: memcached-operator
          image: myimage:latest
          command:
          - operator-metrics-tls
          imagePullPolicy: Always
          env:
            - name: WATCH_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: OPERATOR_NAME
              value: "memcached-operator"
        - args:
          - --logtostderr
          - -v=8
          - --secure-listen-address=:9393
          - --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
          - --upstream=http://127.0.0.1:8383/
          - --tls-cert-file=/etc/tls/private/tls.crt
          - --tls-private-key-file=/etc/tls/private/tls.key
          image: quay.io/coreos/kube-rbac-proxy:v0.4.1
          name: kube-rbac-proxy-main
          ports:
          - containerPort: 9393
            name: op-metrics
          resources:
            requests:
              cpu: 10m
              memory: 40Mi
          volumeMounts:
          - mountPath: /etc/tls/private
            name: memcached-operator
            readOnly: false
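        # Second kube-rbac-proxy sidecar for the custom-resource metrics port, proxying :9696 -> 127.0.0.1:8686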
        - args:
          - --logtostderr
          - -v=8
          - --secure-listen-address=:9696
          - --tls-cipher-suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
          - --upstream=http://127.0.0.1:8686/
          - --tls-cert-file=/etc/tls/private/tls.crt
          - --tls-private-key-file=/etc/tls/private/tls.key
          image: quay.io/coreos/kube-rbac-proxy:v0.4.1
          name: kube-rbac-proxy-self
          ports:
          - containerPort: 9696
            name: cr-metrics
          resources:
            requests:
              cpu: 10m
              memory: 40Mi
          volumeMounts:
          - mountPath: /etc/tls/private
            name: memcached-operator
            readOnly: false

Content of the service.yaml file:

apiVersion: v1
kind: Service
metadata:
  annotations:
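    # Asks the OpenShift service CA operator to generate a serving cert/key pair into the named Secret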
    service.alpha.openshift.io/serving-cert-secret-name: memcached-operator
  labels:
    name: memcached-operator
  name: memcached-operator
  namespace: default
spec:
  ports:
  - name: cr-metrics
    port: 9696
    targetPort: cr-metrics
  - name: op-metrics
    port: 9393
    targetPort: op-metrics
  selector:
    name: memcached-operator
  type: ClusterIP

Content of the service-monitor.yaml file:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: memcached-operator
  name: memcached-operator
  namespace: default
spec:
  endpoints:
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    interval: 2m
    port: cr-metrics
    scheme: https
    scrapeTimeout: 2m
    tlsConfig:
      caFile: /etc/prometheus/configmaps/serving-certs-ca-bundle/service-ca.crt
      serverName: memcached-operator.default.svc
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    honorLabels: true
    interval: 2m
    port: op-metrics
    scheme: https
    scrapeTimeout: 2m
    tlsConfig:
      caFile: /etc/prometheus/configmaps/serving-certs-ca-bundle/service-ca.crt
      serverName: memcached-operator.default.svc
  jobLabel: memcached-operator
  selector:
    matchLabels:
        name: memcached-operator

Those are the only changes to the files we need to make.

Open questions

  1. Does this work the same way in 3.x versions of OpenShift?
  2. We could also do this work when a new operator is created, if the user knows it will be deployed only on OpenShift, via a flag on the new subcommand. The flag would be something like operator-sdk new --openshift, and it would generate the same resources as above.

@lilic
Member

lilic commented Jul 10, 2019

cc @estroz @shawn-hurley @hasbro17 @AlexNPavel @joelanford Can you let me know if you have any objections? Otherwise I will start implementing the above. Thanks!

@ironcladlou

@lilic

We need to add the annotation to the Service object before the operator is deployed, as we reference the created Secret in the deployment itself, so all of these resources need to be created before the operator starts.

A deployment can reference a non-existent secret in a volume mount, and if/when the secret does exist, the replica containers will successfully create. Given that, the secret and deployment creation order should be fundamentally irrelevant (modulo any peripheral concerns, e.g. backoff time re-creating the container if there are initial failures due to the absence of the secret).

It wasn't clear to me from your statement whether we agree on that assumption. I'm not sure if it would affect your design one way or another, but it's something that stood out to me.

@lilic
Member

lilic commented Jul 11, 2019

@ironcladlou

A deployment can reference a non-existent secret in a volume mount, and if/when the secret does exist, the replica containers will successfully create.

Yes, it can reference it, but currently we create the Service and ServiceMonitor resources programmatically in the operator itself, so we have to change to the above design and create these resources alongside the operator's deployment (regardless of order, of course). As you said, the container will sit in ContainerCreating until the Secret is mounted, but the Service and ServiceMonitor would never be created. Agreed that was not clear from my comment above, thanks for commenting!
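For context, here is a minimal sketch of the programmatic approach being replaced, assuming controller-runtime's client; the Service name and port are illustrative.

// Minimal sketch of the current programmatic approach referred to above: the
// operator itself creates the metrics Service at startup using the
// controller-runtime client. Names and ports are illustrative only.
package metrics

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

func createMetricsService(ctx context.Context, c client.Client, namespace string) error {
	svc := &corev1.Service{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "memcached-operator",
			Namespace: namespace,
			Labels:    map[string]string{"name": "memcached-operator"},
		},
		Spec: corev1.ServiceSpec{
			Selector: map[string]string{"name": "memcached-operator"},
			Ports: []corev1.ServicePort{{
				Name:       "op-metrics",
				Port:       9393,
				TargetPort: intstr.FromString("op-metrics"),
			}},
		},
	}
	// If the Service is only created here, it never exists while the operator
	// pod cannot start (e.g. waiting on a missing Secret) -- hence the move to
	// static manifests in the proposal above.
	return c.Create(ctx, svc)
}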

@ironcladlou

@lilic I think your elaboration on the current behavior completed the picture for me. Thank you! Your proposal sounds good to me.

@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot openshift-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 9, 2019
@camilamacedo86 camilamacedo86 removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 9, 2019
@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot openshift-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 7, 2020
@openshift-bot

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci-robot openshift-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 7, 2020
@estroz estroz unassigned lilic Mar 2, 2020
@estroz estroz added the triage/needs-information Indicates an issue needs more information in order to work on it. label Mar 2, 2020
@estroz
Member

estroz commented Mar 2, 2020

/lifecycle frozen

@openshift-ci-robot openshift-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Mar 2, 2020
@estroz estroz added this to the v1.0.0 milestone Mar 2, 2020
@estroz estroz self-assigned this Mar 2, 2020
@estroz estroz added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Mar 2, 2020
@theishshah theishshah self-assigned this Mar 4, 2020
@estroz estroz removed their assignment Jun 15, 2020
@camilamacedo86
Contributor

camilamacedo86 commented Jul 10, 2020

@estroz
Member

estroz commented Jul 24, 2020

This will be done once the Ansible plugin is complete: #3488.

@camilamacedo86
Contributor

The Ansible plugin is merged 👍 Closing this one.
