[BUG] Kyverno gives error when installed with KEDA #2267

Closed
NoSkillGirl opened this issue Aug 16, 2021 · 5 comments
@NoSkillGirl
Contributor

Software version numbers

  • Kyverno version: 1.4.2

Describe the bug
When Kyverno is installed alongside KEDA, the Kyverno log shows errors for external.metrics.k8s.io/v1beta1.

E0816 08:04:11.013563       1 memcache.go:196] couldn't get resource list for external.metrics.k8s.io/v1beta1: Got empty response for: external.metrics.k8s.io/v1beta1
E0816 08:04:11.953874       1 crdSync.go:68]  "msg"="failed to update in-cluster api versions" "error"="unable to fetch apiResourceLists: unable to retrieve the complete list of server APIs: external.metrics.k8s.io/v1beta1: Got empty response for: external.metrics.k8s.io/v1beta1"  
E0816 08:04:12.184100       1 crdSync.go:107]  "msg"="sync failed, unable to update in-cluster api versions" "error"="unable to fetch apiResourceLists: unable to retrieve the complete list of server APIs: external.metrics.k8s.io/v1beta1: Got empty response for: external.metrics.k8s.io/v1beta1"  
E0816 08:04:12.316789       1 crdSync.go:107]  "msg"="sync failed, unable to update in-cluster api versions" "error"="unable to fetch apiResourceLists: unable to retrieve the complete list of server APIs: external.metrics.k8s.io/v1beta1: Got empty response for: external.metrics.k8s.io/v1beta1"  

To Reproduce

  1. Install KEDA with: kubectl apply -f https://github.com/kedacore/keda/releases/download/v2.4.0/keda-2.4.0.yaml
  2. Install Kyverno
  3. Check the kyverno log:
$ kubectl -n kyverno logs deployments/kyverno -f
I0816 08:04:10.900470       1 version.go:17]  "msg"="Kyverno"  "Version"="v1.4.2"
I0816 08:04:10.900531       1 version.go:18]  "msg"="Kyverno"  "BuildHash"="(HEAD/fb6e0f18ea89c9b60c604e5135f38040fafbc1e4"
I0816 08:04:10.900543       1 version.go:19]  "msg"="Kyverno"  "BuildTime"="2021-08-11_08:24:18PM"
I0816 08:04:10.900854       1 config.go:92] CreateClientConfig "msg"="Using in-cluster configuration"  
I0816 08:04:10.902645       1 main.go:122] setup "msg"="enabling metrics service"  "address"=":8000"
E0816 08:04:11.013563       1 memcache.go:196] couldn't get resource list for external.metrics.k8s.io/v1beta1: Got empty response for: external.metrics.k8s.io/v1beta1
I0816 08:04:11.013938       1 util.go:86]  "msg"="CRD found"  "gvr"="kyverno.io/v1, Resource=clusterpolicies"
I0816 08:04:11.014379       1 util.go:86]  "msg"="CRD found"  "gvr"="wgpolicyk8s.io/v1alpha1, Resource=clusterpolicyreports"
I0816 08:04:11.014571       1 util.go:86]  "msg"="CRD found"  "gvr"="wgpolicyk8s.io/v1alpha1, Resource=policyreports"
I0816 08:04:11.014730       1 util.go:86]  "msg"="CRD found"  "gvr"="kyverno.io/v1alpha1, Resource=clusterreportchangerequests"
I0816 08:04:11.014841       1 util.go:86]  "msg"="CRD found"  "gvr"="kyverno.io/v1alpha1, Resource=reportchangerequests"
I0816 08:04:11.421455       1 dynamicconfig.go:150] ConfigData "msg"="init configuration from commandline arguments for filterK8sResources"  
I0816 08:04:11.421964       1 dynamicconfig.go:332] ConfigData "msg"="Init resource filters"  "filters"=[{"Kind":"Event","Namespace":"*","Name":"*"},{"Kind":"*","Namespace":"kube-system","Name":"*"},{"Kind":"*","Namespace":"kube-public","Name":"*"},{"Kind":"*","Namespace":"kube-node-lease","Name":"*"},{"Kind":"Node","Namespace":"*","Name":"*"},{"Kind":"APIService","Namespace":"*","Name":"*"},{"Kind":"TokenReview","Namespace":"*","Name":"*"},{"Kind":"SubjectAccessReview","Namespace":"*","Name":"*"},{"Kind":"*","Namespace":"kyverno","Name":"*"},{"Kind":"Binding","Namespace":"*","Name":"*"},{"Kind":"ReplicaSet","Namespace":"*","Name":"*"},{"Kind":"ReportChangeRequest","Namespace":"*","Name":"*"},{"Kind":"ClusterReportChangeRequest","Namespace":"*","Name":"*"},{"Kind":"PolicyReport","Namespace":"*","Name":"*"},{"Kind":"ClusterPolicyReport","Namespace":"*","Name":"*"}]
I0816 08:04:11.421987       1 dynamicconfig.go:343] ConfigData "msg"="Init resource "  "excludeRoles"=""
I0816 08:04:11.427007       1 leaderelection.go:243] attempting to acquire leader lease kyverno/webhook-register...
I0816 08:04:11.463060       1 leaderelection.go:253] successfully acquired lease kyverno/webhook-register
I0816 08:04:11.463467       1 leaderelection.go:94] webhookRegister/LeaderElection "msg"="started leading" "id"="kyverno-7dc8969bfc-fptqt_558f8f46-609d-4992-ae4f-9ecefbd185b4" 
I0816 08:04:11.467178       1 certRenewer.go:85] CertRenewer/InitTLSPemPair "msg"="building key/certificate pair for TLS"  
I0816 08:04:11.587470       1 certRenewer.go:145] CertRenewer/CAcert "msg"="secret created"  "name"="kyverno-svc.kyverno.svc.kyverno-tls-ca" "namespace"="kyverno"
I0816 08:04:11.764106       1 certRenewer.go:198] CertRenewer/WriteTLSPair "msg"="secret created"  "name"="kyverno-svc.kyverno.svc.kyverno-tls-pair" "namespace"="kyverno"
E0816 08:04:11.953874       1 crdSync.go:68]  "msg"="failed to update in-cluster api versions" "error"="unable to fetch apiResourceLists: unable to retrieve the complete list of server APIs: external.metrics.k8s.io/v1beta1: Got empty response for: external.metrics.k8s.io/v1beta1"  
E0816 08:04:12.184100       1 crdSync.go:107]  "msg"="sync failed, unable to update in-cluster api versions" "error"="unable to fetch apiResourceLists: unable to retrieve the complete list of server APIs: external.metrics.k8s.io/v1beta1: Got empty response for: external.metrics.k8s.io/v1beta1"  
E0816 08:04:12.316789       1 crdSync.go:107]  "msg"="sync failed, unable to update in-cluster api versions" "error"="unable to fetch apiResourceLists: unable to retrieve the complete list of server APIs: external.metrics.k8s.io/v1beta1: Got empty response for: external.metrics.k8s.io/v1beta1"  
I0816 08:04:12.783840       1 registration.go:450] Register "msg"="webhook configuration deleted" "kind"="ValidatingWebhookConfiguration" "name"="kyverno-policy-validating-webhook-cfg" 
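
To pull only the discovery errors out of a log like the one above, a small filter can help. This is a minimal sketch, assuming the klog header layout seen in these lines (severity letter, MMDD date, time, thread ID, then `file:line]`). The function name `discovery_errors` and the match on the phrase "Got empty response for" are assumptions made for illustration and are not part of Kyverno.

```python
import re

# klog header as it appears in the log above, e.g.
# "E0816 08:04:11.013563       1 memcache.go:196] message"
KLOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<thread>\d+) (?P<source>[^\]]+)\] (?P<message>.*)$"
)

# First group/version named after "Got empty response for:" in a message.
EMPTY_RESPONSE = re.compile(
    r"Got empty response for: (?P<gv>[A-Za-z0-9.\-]+/[A-Za-z0-9]+)"
)


def discovery_errors(log_text):
    """Return (source, group_version) pairs for error lines that report an empty discovery response."""
    found = []
    for line in log_text.splitlines():
        header = KLOG_LINE.match(line)
        # Skip lines that are not klog lines or are not at error severity.
        if header is None or header.group("severity") != "E":
            continue
        empty = EMPTY_RESPONSE.search(header.group("message"))
        if empty:
            found.append((header.group("source"), empty.group("gv")))
    return found
```

Saving the output of the `kubectl logs` command to a file and passing its contents to `discovery_errors` lists which source files reported the empty group version.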

Expected behavior
Kyverno should not log any errors.

@NoSkillGirl NoSkillGirl added the bug Something isn't working label Aug 16, 2021
@NoSkillGirl
Contributor Author

There is already a relevant discussion of this in the KEDA project.

This is not a Kyverno error. It comes from client-go, which returns an error when no resources are available in external.metrics.k8s.io/v1beta1.

During discovery, client-go fetches all ServerGroups.
When KEDA is not installed, external.metrics.k8s.io/v1beta1 is not part of the ServerGroups, so it is never queried and no error occurs.

But when KEDA is installed, it creates an APIService:

$ kubectl get apiservice | grep keda-metrics
v1beta1.external.metrics.k8s.io        keda/keda-metrics-apiserver   True        20m

But it doesn't create any external.metrics.k8s.io resources

$ kubectl get --raw /apis/external.metrics.k8s.io/v1beta1 | jq .
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "external.metrics.k8s.io/v1beta1",
  "resources": []
}

Since there are no resources, client-go returns an error for this group version.
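
The failure mode described above can be modeled in a few lines. This is a simplified sketch in Python, not client-go's actual Go code: the function names `check_resource_list` and `fetch_all` are hypothetical, the error strings only mirror the ones in the log, and `fetch_all` stands in for a caller that treats any discovery error as fatal.

```python
import json


def check_resource_list(payload):
    """Return the resources of an APIResourceList, or raise if the list is empty."""
    group_version = payload["groupVersion"]
    resources = payload.get("resources") or []
    if not resources:
        raise ValueError(f"Got empty response for: {group_version}")
    return resources


def fetch_all(payloads):
    """Collect resources per group version; any empty list fails the whole call."""
    result, failures = {}, []
    for payload in payloads:
        try:
            result[payload["groupVersion"]] = check_resource_list(payload)
        except ValueError as err:
            failures.append(f"{payload['groupVersion']}: {err}")
    if failures:
        raise RuntimeError(
            "unable to retrieve the complete list of server APIs: " + "; ".join(failures)
        )
    return result


# The response returned for the KEDA metrics API in this issue.
keda_response = json.loads("""
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "external.metrics.k8s.io/v1beta1",
  "resources": []
}
""")
```

Calling `fetch_all([keda_response])` raises, while the same call succeeds once the `resources` array has at least one entry, which is consistent with the errors going away after a ScaledObject is registered.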

If someone is stuck on this issue and can't move forward, a comment in the KEDA discussion reports a workaround:

"Errors disappeared once we registered a dummy prometheus scaledObject."
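
For anyone wanting to try that workaround, the sketch below builds a placeholder ScaledObject manifest and prints it as JSON, which `kubectl apply -f -` accepts. Every concrete value is a hypothetical placeholder: the names `dummy-scaledobject` and `dummy-deployment`, the Prometheus address, and the query. The field layout follows the KEDA 2.x `keda.sh/v1alpha1` ScaledObject with a `prometheus` trigger as I understand it, so check it against the KEDA documentation for your version. It also assumes a Deployment with the target name already exists.

```python
import json


def dummy_scaled_object(name, namespace, target_deployment, prometheus_url):
    """Build a placeholder ScaledObject with a single prometheus trigger."""
    return {
        "apiVersion": "keda.sh/v1alpha1",
        "kind": "ScaledObject",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Assumed to reference an existing Deployment in the same namespace.
            "scaleTargetRef": {"name": target_deployment},
            # Pin the replica count so the placeholder never scales anything.
            "minReplicaCount": 1,
            "maxReplicaCount": 1,
            "triggers": [
                {
                    "type": "prometheus",
                    "metadata": {
                        "serverAddress": prometheus_url,
                        "metricName": "dummy_metric",  # hypothetical name
                        "query": "vector(1)",          # constant placeholder query
                        "threshold": "1",
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    manifest = dummy_scaled_object(
        name="dummy-scaledobject",
        namespace="default",
        target_deployment="dummy-deployment",
        prometheus_url="http://prometheus.example.svc:9090",  # hypothetical address
    )
    print(json.dumps(manifest, indent=2))
```

Piping the script's output to `kubectl apply -f -` would create the object. Whether this clears the errors on a given cluster depends on the KEDA version, so treat it as something to try rather than a confirmed fix.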

@dolefirenko

Confirmed, the issue is still present with kyverno v2.3.3 and keda v2.6.2. A dummy Prometheus scaled object fixes it as a workaround.
The proposal is: if there are no scaled objects, the /apis/external.metrics.k8s.io/v1beta1 resource list shouldn't be empty. Let it return one value, for example the current time.

@chipzoller
Contributor

Related to #3244

@vyankyGH
Contributor

vyankyGH commented Jul 1, 2022

closing with PR #4139

@jim-barber-he

I'm not sure if this is the same problem or not, but it does not seem to be fixed for me?

We have KEDA version 2.7.1 and Kyverno version 2.7.2 installed into our cluster.
I just tried to install a Kyverno policy to annotate a daemonset/deployment when a secret has been updated.
This is to replace a hacky CronJob that we had that was performing the task.

When using Helm to apply my changes it errors out with:

Error: UPGRADE FAILED: an error occurred while rolling back the release. original upgrade error: failed to create resource: admission webhook "validate-policy.kyverno.svc" denied the request: unable to retrieve the complete list of server APIs: external.metrics.k8s.io/v1beta1: Got empty response for: external.metrics.k8s.io/v1beta1: no ServiceAccount with the name "traefik-cron" found

Uninstalling KEDA from the cluster and trying the helm upgrade again worked with no issue.
The full output from the Helm upgrade, which removes the CronJob along with its supporting RBAC and service account and adds the Kyverno policy, looks as follows:

Comparing release=traefik-additions, chart=helm/charts/traefik-additions
Enabled three way merge via the envvar
traefik, traefik-cron, CronJob (batch) has been removed:
- apiVersion: batch/v1
- kind: CronJob
- metadata:
-   labels:
-     app.kubernetes.io/instance: traefik
-     app.kubernetes.io/managed-by: Helm
-     app.kubernetes.io/name: traefik
-     helm.sh/chart: traefik-additions-1.0.0
-   name: traefik-cron
-   namespace: traefik
- spec:
-   concurrencyPolicy: Replace
-   jobTemplate:
-     metadata:
-       labels:
-         app.kubernetes.io/instance: traefik
-         app.kubernetes.io/managed-by: Helm
-         app.kubernetes.io/name: traefik
-         helm.sh/chart: traefik-additions-1.0.0
-     spec:
-       backoffLimit: 0
-       template:
-         spec:
-           containers:
-           - env:
-             - name: PYTHONUNBUFFERED
-               value: "1"
-             - name: TLS_SECRET
-               value: tls-wildcard.test1.apps.he0.io
-             - name: TRAEFIK_DAEMONSET_NAMES
-               value: traefik
-             - name: TRAEFIK_NAMESPACE
-               value: traefik
-             image: 340978087534.dkr.ecr.ap-southeast-2.amazonaws.com/docker-python:master
-             imagePullPolicy: IfNotPresent
-             name: traefik-cron
-             resources: {}
-             volumeMounts:
-             - mountPath: /usr/local/bin/entrypoint
-               name: traefik-files
-               subPath: cron.py
-           restartPolicy: Never
-           serviceAccountName: traefik-cron
-           volumes:
-           - configMap:
-               defaultMode: 493
-               name: traefik-files
-             name: traefik-files
-   schedule: 00 05 * * 1-5
+ 
traefik, traefik-cron, Role (rbac.authorization.k8s.io) has been removed:
- apiVersion: rbac.authorization.k8s.io/v1
- kind: Role
- metadata:
-   labels:
-     app.kubernetes.io/instance: traefik
-     app.kubernetes.io/managed-by: Helm
-     app.kubernetes.io/name: traefik
-     helm.sh/chart: traefik-additions-1.0.0
-   name: traefik-cron
-   namespace: traefik
- rules:
- - apiGroups:
-   - ""
-   resourceNames:
-   - tls-wildcard.test1.apps.he0.io
-   resources:
-   - secrets
-   verbs:
-   - get
- - apiGroups:
-   - apps
-   resourceNames:
-   - traefik
-   resources:
-   - daemonsets
-   - deployments
-   verbs:
-   - patch
+ 
traefik, traefik-cron, RoleBinding (rbac.authorization.k8s.io) has been removed:
- apiVersion: rbac.authorization.k8s.io/v1
- kind: RoleBinding
- metadata:
-   labels:
-     app.kubernetes.io/instance: traefik
-     app.kubernetes.io/managed-by: Helm
-     app.kubernetes.io/name: traefik
-     helm.sh/chart: traefik-additions-1.0.0
-   name: traefik-cron
-   namespace: traefik
- roleRef:
-   apiGroup: rbac.authorization.k8s.io
-   kind: Role
-   name: traefik-cron
- subjects:
- - kind: ServiceAccount
-   name: traefik-cron
-   namespace: traefik
+ 
traefik, traefik-cron, ServiceAccount (v1) has been removed:
- apiVersion: v1
- kind: ServiceAccount
- metadata:
-   labels:
-     app.kubernetes.io/instance: traefik
-     app.kubernetes.io/managed-by: Helm
-     app.kubernetes.io/name: traefik
-     helm.sh/chart: traefik-additions-1.0.0
-   name: traefik-cron
-   namespace: traefik
+ 
traefik, traefik-files, ConfigMap (v1) has been removed:
- apiVersion: v1
- data:
-   cron.py: |
-     #!/usr/bin/env python3
-     """
-     This CronJob gets a checksum for the secret that holds the SSL certificate used by Traefik.
-     It then writes this checksum as an annotation on the traefik Deployments or DaemonSets.
-     That way if the certificates have changed, then the Traefik pods will be rolled to pick up the change.
-     """
-     import hashlib
-     import json
-     import logging
-     import os
-     import sys
- 
-     import kubernetes
- 
-     LOGGER = logging.getLogger("traefik")
-     LOGGER.setLevel(logging.INFO)
-     CH = logging.StreamHandler()
-     CH.setLevel(logging.DEBUG)
-     FORMATTER = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
-     CH.setFormatter(FORMATTER)
-     LOGGER.addHandler(CH)
- 
-     TLS_SECRET = os.getenv("TLS_SECRET")
- 
-     TRAEFIK_DAEMONSET_NAMES = os.getenv("TRAEFIK_DAEMONSET_NAMES")
-     TRAEFIK_DEPLOYMENT_NAMES = os.getenv("TRAEFIK_DEPLOYMENT_NAMES")
-     TRAEFIK_NAMESPACE = os.getenv("TRAEFIK_NAMESPACE")
- 
- 
-     class Kube:
-         """
-         Kubernetes Helper Class
-         """
- 
-         def __init__(self):
-             kubernetes.config.load_incluster_config()
-             self.apps_v1 = kubernetes.client.AppsV1Api()
-             self.core_v1 = kubernetes.client.CoreV1Api()
- 
-             self.daemonset = self.Daemonset(self.apps_v1)
-             self.deployment = self.Deployment(self.apps_v1)
-             self.secret = self.Secret(self.core_v1)
- 
-         class Daemonset:
-             """
-             Daemonset Helper Class
-             """
- 
-             def __init__(self, api):
-                 self.api = api
- 
-             def patch_annotations(self, name, annotations, namespace=TRAEFIK_NAMESPACE):
-                 """
-                 Patch a daemonset with annotations
-                 """
-                 patch = {"spec": {"template": {"metadata": {"annotations": annotations}}}}
-                 req = self.api.patch_namespaced_daemon_set(name=name, namespace=namespace, body=patch)
-                 LOGGER.info("Patched annotations on %s daemonset", name)
-                 return req
- 
-         class Deployment:
-             """
-             Deployment Helper Class
-             """
- 
-             def __init__(self, api):
-                 self.api = api
- 
-             def patch_annotations(self, name, annotations, namespace=TRAEFIK_NAMESPACE):
-                 """
-                 Patch a deployment with annotations
-                 """
-                 patch = {"spec": {"template": {"metadata": {"annotations": annotations}}}}
-                 req = self.api.patch_namespaced_deployment(name=name, namespace=namespace, body=patch)
-                 LOGGER.info("Patched annotations on %s deployment", name)
-                 return req
- 
-         class Secret:
-             """
-             Secret Helper Class
-             """
- 
-             def __init__(self, api):
-                 self.api = api
- 
-             def get(self, name, namespace=TRAEFIK_NAMESPACE):
-                 """
-                 Get a secret
-                 """
-                 return self.api.read_namespaced_secret(name=name, namespace=namespace)
- 
- 
-     def main():
-         """
-         main
-         """
-         # Check some constants have been set properly from environment variables.
-         if not TLS_SECRET:
-             print("Error. The TLS_SECRET environment variable was not set.")
-             sys.exit(1)
-         if not TRAEFIK_NAMESPACE:
-             print("Error. The TRAEFIK_NAMESPACE environment variable was not set.")
-             sys.exit(1)
- 
-         LOGGER.debug("Instantiating Kube Class")
-         kube = Kube()
- 
-         # Get the checksum of the secret holding the SSL certificate so that we can use it as an annotation.
-         LOGGER.debug("Reading %s secret", TLS_SECRET)
-         checksum = hashlib.sha256(json.dumps(kube.secret.get(TLS_SECRET).data, sort_keys=True).encode("utf-8")).hexdigest()
- 
-         if TRAEFIK_DAEMONSET_NAMES:
-             for daemonset in TRAEFIK_DAEMONSET_NAMES.split(","):
-                 kube.daemonset.patch_annotations(name=daemonset, annotations={"certificate_checksum": checksum})
-         if TRAEFIK_DEPLOYMENT_NAMES:
-             for deployment in TRAEFIK_DEPLOYMENT_NAMES.split(","):
-                 kube.deployment.patch_annotations(name=deployment, annotations={"certificate_checksum": checksum})
- 
- 
-     if __name__ == "__main__":
-         main()
- kind: ConfigMap
- metadata:
-   labels:
-     app.kubernetes.io/instance: traefik
-     app.kubernetes.io/managed-by: Helm
-     app.kubernetes.io/name: traefik
-     helm.sh/chart: traefik-additions-1.0.0
-   name: traefik-files
-   namespace: traefik
traefik, kyverno:update-traefik, Role (rbac.authorization.k8s.io) has been added:
- 
+ apiVersion: rbac.authorization.k8s.io/v1
+ kind: Role
+ metadata:
+   labels:
+     app.kubernetes.io/instance: traefik
+     app.kubernetes.io/managed-by: Helm
+     app.kubernetes.io/name: traefik
+     helm.sh/chart: traefik-additions-1.0.0
+   name: kyverno:update-traefik
+   namespace: traefik
+ rules:
+ - apiGroups:
+   - apps
+   resourceNames:
+   - traefik
+   resources:
+   - daemonsets
+   - deployments
+   verbs:
+   - update
traefik, kyverno:update-traefik, RoleBinding (rbac.authorization.k8s.io) has been added:
- 
+ apiVersion: rbac.authorization.k8s.io/v1
+ kind: RoleBinding
+ metadata:
+   labels:
+     app.kubernetes.io/instance: traefik
+     app.kubernetes.io/managed-by: Helm
+     app.kubernetes.io/name: traefik
+     helm.sh/chart: traefik-additions-1.0.0
+   name: kyverno:update-traefik
+   namespace: traefik
+ roleRef:
+   apiGroup: rbac.authorization.k8s.io
+   kind: Role
+   name: kyverno:update-traefik
+ subjects:
+ - kind: ServiceAccount
+   name: kyverno
+   namespace: kyverno
traefik, restart-traefik-on-tls-secret-change, Policy (kyverno.io) has been added:
- 
+ apiVersion: kyverno.io/v1
+ kind: Policy
+ metadata:
+   annotations:
+     policies.kyverno.io/description: If the Secret that holds the SSL certificate
+       is updated, then write an annotation to the traefik DaemonSet or Deployment
+       to roll the pods.
+     policies.kyverno.io/title: Restart Traefik On TLS Secret Change
+   name: restart-traefik-on-tls-secret-change
+   namespace: traefik
+ spec:
+   mutateExistingOnPolicyUpdate: false
+   rules:
+   - match:
+       any:
+       - resources:
+           kinds:
+           - Secret
+           names:
+           - tls-wildcard.test1.apps.he0.io
+     mutate:
+       patchStrategicMerge:
+         spec:
+           template:
+             metadata:
+               annotations:
+                 tls_secret_version: '{{request.object.metadata.resourceVersion}}'
+       targets:
+       - apiVersion: apps/v1
+         kind: DaemonSet
+         name: traefik
+         namespace: traefik
+     name: update-traefik-annotation-on-tls-secret-change
+     preconditions:
+       all:
+       - key: '{{request.operation}}'
+         operator: Equals
+         value: UPDATE

Comparing release=traefik, chart=traefik/traefik
Affected releases are:
  traefik-additions (./helm/charts/traefik-additions) UPDATED

Do you really want to apply?
  Helmfile will apply all your changes, as shown above.

 [y/n]: y

hook[presync] logs | customresourcedefinition.apiextensions.k8s.io/ingressroutes.traefik.containo.us configured
hook[presync] logs | customresourcedefinition.apiextensions.k8s.io/ingressroutetcps.traefik.containo.us configured
hook[presync] logs | customresourcedefinition.apiextensions.k8s.io/ingressrouteudps.traefik.containo.us configured
hook[presync] logs | customresourcedefinition.apiextensions.k8s.io/middlewares.traefik.containo.us configured
hook[presync] logs | customresourcedefinition.apiextensions.k8s.io/middlewaretcps.traefik.containo.us configured
hook[presync] logs | customresourcedefinition.apiextensions.k8s.io/serverstransports.traefik.containo.us configured
hook[presync] logs | customresourcedefinition.apiextensions.k8s.io/tlsoptions.traefik.containo.us configured
hook[presync] logs | customresourcedefinition.apiextensions.k8s.io/tlsstores.traefik.containo.us configured
hook[presync] logs | customresourcedefinition.apiextensions.k8s.io/traefikservices.traefik.containo.us configured
hook[presync] logs | 
Upgrading release=traefik-additions, chart=helm/charts/traefik-additions

FAILED RELEASES:
NAME
traefik-additions
in ./helmfile.yaml: failed processing release traefik-additions: command "/usr/local/bin/helm" exited with non-zero status:

PATH:
  /usr/local/bin/helm

ARGS:
  0: helm (4 bytes)
  1: upgrade (7 bytes)
  2: --install (9 bytes)
  3: --reset-values (14 bytes)
  4: traefik-additions (17 bytes)
  5: helm/charts/traefik-additions (29 bytes)
  6: --timeout (9 bytes)
  7: 1200s (5 bytes)
  8: --atomic (8 bytes)
  9: --cleanup-on-fail (17 bytes)
  10: --namespace (11 bytes)
  11: traefik (7 bytes)
  12: --values (8 bytes)
  13: /tmp/helmfile328956999/traefik-traefik-additions-values-6588698f5c (66 bytes)
  14: --history-max (13 bytes)
  15: 10 (2 bytes)

ERROR:
  exit status 1

EXIT STATUS
  1

STDERR:
  Error: UPGRADE FAILED: an error occurred while rolling back the release. original upgrade error: failed to create resource: admission webhook "validate-policy.kyverno.svc" denied the request: unable to retrieve the complete list of server APIs: external.metrics.k8s.io/v1beta1: Got empty response for: external.metrics.k8s.io/v1beta1: no ServiceAccount with the name "traefik-cron" found

COMBINED OUTPUT:
  Error: UPGRADE FAILED: an error occurred while rolling back the release. original upgrade error: failed to create resource: admission webhook "validate-policy.kyverno.svc" denied the request: unable to retrieve the complete list of server APIs: external.metrics.k8s.io/v1beta1: Got empty response for: external.metrics.k8s.io/v1beta1: no ServiceAccount with the name "traefik-cron" found

Is this the same issue or do I need to raise a new one?
