Predictable Release Names with the Helm Operator SDK #1094

Closed
giacomoguiulfo opened this issue Feb 12, 2019 · 16 comments · Fixed by #1818
Assignees
joelanford

Labels
kind/feature: Categorizes issue or PR as related to a new feature.
language/helm: Issue is related to a Helm operator project.

Comments

@giacomoguiulfo

Type of question

General context and help around the operator-sdk.

Question

What did you do?

Deployed a Helm Operator for a chart and then created a CustomResource for it.

What did you expect to see?

The chart resources to be created under the release name ${my-cr-name}.

What did you see instead? Under which circumstances?

The chart resources were created under a release name with the following format: ${my-cr-name}-${uid}.

Environment

  • operator-sdk version:

    v0.5.0

  • Kubernetes version information:

    Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.2", GitCommit:"bb9ffb1654d4a729bb4cec18ff088eacc153c239", GitTreeState:"clean", BuildDate:"2018-08-08T16:31:10Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"darwin/amd64"}

    Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.3", GitCommit:"a4529464e4629c21224b3d52edfe0ea91b072862", GitTreeState:"clean", BuildDate:"2018-09-09T17:53:03Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/amd64"}

  • Kubernetes cluster kind:

    MetalK8s

Additional context

Is there a way to specify the release name of the chart to be installed by the Helm operator, or a way to omit the uid as part of the release name?

I was hoping to have predictable names in my deployment for reference by other services.

@joelanford
Member

@giacomoguiulfo Currently it is not possible to define a custom release name, but that seems like a solid use case to me.

We've thought about supporting an annotation on the CR that would allow a user to define their own release name, but since Helm release names cannot be easily changed, we would need to consider how to handle changes to that annotation once a release has been deployed:

  1. Fail the upgrade until the annotation is removed (or changed to match the existing release name).
  2. Uninstall the existing release and reinstall with the new release name.

The safest choice would be option 1. Does that seem reasonable?
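
For illustration only, a CR carrying such an annotation might look like the sketch below. The annotation key helm.operator-sdk/release-name is hypothetical; no actual key had been settled on at this point:

apiVersion: example.com/v1alpha1
kind: Memcached
metadata:
  name: my-memcached
  annotations:
    # Hypothetical key; an actual annotation was never finalized in this thread.
    helm.operator-sdk/release-name: my-memcached
spec:
  # Chart values go here, as usual for a Helm operator CR.
  replicaCount: 3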

@joelanford joelanford added language/helm Issue is related to a Helm operator project kind/feature Categorizes issue or PR as related to a new feature. labels Feb 12, 2019
@joelanford joelanford self-assigned this Feb 12, 2019
@giacomoguiulfo
Author

Thanks for your reply @joelanford!

I agree, option 1 seems very reasonable to me.

If this is something you could work on and have time for, that would be amazing. Otherwise, let me know and I will be happy to contribute.

@joelanford
Member

joelanford commented Feb 12, 2019

@giacomoguiulfo Looking at this a little closer, it's not currently straightforward to implement a check for a release name change since we don't have access to the previous CR object during an update.

The deployed release name is currently available in the status, but relying on status to get the current state goes against Kubernetes design principles, which state:

Object status must be 100% reconstructable by observation. Any history kept must be just an optimization and not required for correct operation.

The correct way to solve this may be with a validating admission webhook that forbids changes to that annotation on updates. We don't currently have any examples of adding webhooks to an operator, but that's something we're looking into.

I think we'll have time to work on this, but it may not be finalized until we have a better idea about how we want to integrate webhooks in the SDK in general.

@joelanford
Member

@theishshah ^^ Thought you might be interested since you're looking into the admission controllers and webhooks.

@giacomoguiulfo
Author

@joelanford Couldn't the release name be the metadata.name of the CR?

@joelanford
Member

@giacomoguiulfo I remembered we had some discussion about the release name in operator-framework/helm-app-operator-kit#49 and operator-framework/helm-app-operator-kit#57 before we integrated the Helm support into the SDK, so check that out for background about how we ended up with the current release name.

Perhaps the scenario of two CRs of different resource kinds with the same name in the same scope is rare enough in practice that it wouldn't be a problem to just use the CR's metadata.name. One gotcha I know of is that changing the default release name in a future release will cause problems for users who upgrade from the current version of the helm-operator with active CRs and releases. I'd have to think about that and experiment to see if there are any other gotchas.

In the meantime, I noticed that the chart best practices doc has a section on resource naming that mentions a workaround for your use case if you control your helm chart templates.

However, there may be cases where it is known that there won’t be naming conflicts from a fixed name. In these cases a fixed name might make it easier for an application to find a resource such as a Service. If the option for fixed names is needed then one way to manage this might be to make the setting of the name explicit by using a service.name value from the values.yaml if provided:

apiVersion: v1
kind: Service
metadata:
  {{- if .Values.service.name }}
  name: {{ .Values.service.name }}
  {{- else }}
  name: {{ template "fullname" . }}
  {{- end }}
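
Since a Helm operator passes the CR's spec through as the chart's values, a template like the one above lets you pin the name directly from the CR. A minimal sketch, with field names mirroring the quoted template's service.name value:

# Equivalent to setting service.name in values.yaml.
apiVersion: example.com/v1alpha1
kind: Memcached
metadata:
  name: my-memcached
spec:
  service:
    name: my-fixed-service-name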

Lastly, there may be another way that we can support this use case and still allow different resource kinds to share the same release name. If each CR had its own storage backend that was guaranteed not to overlap with another CR's storage backend, there would be no chance of release name collisions, since each CR's releases would be stored in a separate backend.

We have an open PR that migrates the helm-operator to use Helm's ConfigMap storage backend (see #917), but it may be worth putting that on hold and considering a custom backend implementation instead (see #1100), since there are other benefits if we go that direction (e.g. storing state objects in the same namespace as CRs, adding owner references to the state objects).
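
For reference, Helm v2's ConfigMap backend stores each release version as a ConfigMap labeled with the release name in Tiller's namespace (typically kube-system), so release state for a given name can be inspected directly; my-release below is a placeholder:

# List the versioned release records stored by the ConfigMap backend.
kubectl get configmaps -n kube-system -l "OWNER=TILLER,NAME=my-release"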

@hasbro17 Thoughts?

@giacomoguiulfo
Author

@joelanford Thanks for your response and all the helpful information. The templating trick you mentioned worked very well for my purposes!

@jayunit100

Definitely need this as well.

@joelanford
Member

@jayunit100 could you explain your use case? Are you able to use the templating workaround I mentioned in my previous comment?

I'd just like to make sure we understand the use cases so we can get the design right when we're in the position to do this.

@snelis

snelis commented Apr 10, 2019

I would also love to be able to control the release name.
The release name with the UID is also used in dependencies, so even though I'm able to control the name with that trick in my own charts, for any dependencies I still get names like my-release-eniduuzz7apmkuy4s9kpn37od-memcached.
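
A partial workaround for subcharts, assuming the dependency chart exposes the conventional fullnameOverride value (many charts do, but check the subchart's values.yaml), is to pin the subchart's name from the parent values:

# Parent chart values.yaml: pin the memcached subchart's generated name,
# assuming the subchart supports the common fullnameOverride value.
memcached:
  fullnameOverride: my-release-memcached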

@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot openshift-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 9, 2019
@gnunn1

gnunn1 commented Jul 11, 2019

As another use case, GitLab's Helm chart creates a service account, gitlab-shared-secrets, to run a job that performs the actual installation of the GitLab components into the namespace. The chart expects this job to run as root, and it fails on OpenShift unless you give the service account the anyuid SCC.

When installing the chart with Tiller, this is easy to manage, since I can simply assign the anyuid SCC to the service account in advance, before I deploy the chart:

oc new-project gitlab

oc adm policy add-scc-to-user anyuid -z default
oc adm policy add-scc-to-user anyuid -z gitlab-shared-secrets

oc create secret tls apps-ocplab-com --cert=$TLS_PATH/fullchain.pem --key=$TLS_PATH/privkey.pem

helm upgrade --install gitlab gitlab/gitlab \
  --timeout 600 \
  --set global.hosts.domain=apps.ocplab.com \
  --set global.hosts.externalIP=10.10.10.10 \
  --set [email protected] \
  --set nginx-ingress.enabled=false \
  --set prometheus.install=false \
  --set certmanager.install=false \
  --set global.ingress.configureCertmanager=false \
  --set global.ingress.tls.secretName=apps-ocplab-com

However, with the helm-operator the name of this service account is mangled with the UID, so when I deploy I have to watch for the service account to be created and then grant the mangled service account the anyuid SCC. While some of this certainly rests on GitLab and its continued insistence on using root everywhere, it's not a user-friendly experience, and it would be alleviated with a predictable name.

I will also add that working with deployed artifacts with these mangled names is not pleasant. If there were an option for the user to specify a unique identifier to use instead of the UID, it would be much easier, since you could pick something memorable.
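
Until the name is predictable, the waiting step can at least be scripted. A rough sketch of the workaround described above; the grep pattern is an assumption about how the name is mangled:

# Poll until the mangled service account appears, then grant it the SCC.
until SA=$(oc get serviceaccounts -o name | grep 'shared-secrets'); do
  sleep 5
done
oc adm policy add-scc-to-user anyuid -z "${SA#serviceaccount/}"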

@pgerv12

pgerv12 commented Jul 19, 2019

I understand that Kubernetes allows different resources to have the same name, but if a Helm operator is watching more than one resource, I think it would be good to have an annotation, environment variable, flag, or similar that makes the operator act more like Helm and fail an installation if the name already exists, even if it belongs to a different resource kind. Then users could opt into predictable names, understanding the risk that all resources watched by the operator need to be uniquely named, or keep uniquely generated release names.

@joelanford
Member

The team had some discussion about this today, and the current thinking is that we need to use both mutating and validating admission webhooks to transition to a new Helm release name mechanism, allow arbitrary names set via an annotation, and maintain backwards compatibility with existing CRs.

The idea is to use an annotation to set the release name:

  1. On creation, the mutating admission webhook will default the release name annotation to the CR name (unless it's already set).
  2. On creation, the validating admission webhook will fail the validation if a release already exists in the same namespace based on the CR's release name annotation.
  3. On update, the validating admission webhook will fail the validation if the CR release name annotation has changed.
  4. For existing CRs that don't have the release name annotation set, the mutating admission webhook will set it to the current auto-generated name.
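
To make steps 1 and 4 concrete, here is a hypothetical before/after of the mutating webhook's defaulting; the annotation key is illustrative only:

# Submitted CR (no release name annotation):
apiVersion: example.com/v1alpha1
kind: Memcached
metadata:
  name: my-memcached
---
# Same CR after the mutating webhook defaults the hypothetical
# annotation to the CR name (step 1):
apiVersion: example.com/v1alpha1
kind: Memcached
metadata:
  name: my-memcached
  annotations:
    helm.operator-sdk/release-name: my-memcached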

@joelanford
Member

/remove-lifecycle stale

@openshift-ci-robot openshift-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 19, 2019
@joelanford
Member

@dmesser @giacomoguiulfo @jayunit100 @snelis @gnunn1 @pgerv12 I finally had some time to delve into this and I think I've figured out a way to implement a backwards-compatible change from the UID-based release name to one that uses the CR name only.

This also covers the duplicate release name problem by ensuring that an existing release's chart is the same one that is mapped to the GVK being reconciled.

Lastly, changing the default release name now sets us up nicely to move the duplicate checks to create and update time (rather than reconcile time) and to introduce customizations via annotations once we have support for the necessary validating admission webhooks.

Take a look, try it out, and let me know if you see anything I missed: #1818
