Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change Horizontal Pod Autoscaler to scale on OpenTelemetry Collector … #984

Merged
merged 5 commits into from
Jul 25, 2022
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 5 additions & 1 deletion Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -152,7 +152,7 @@ e2e:
$(KUTTL) test

.PHONY: prepare-e2e
prepare-e2e: kuttl set-test-image-vars set-image-controller container container-target-allocator start-kind load-image-all
prepare-e2e: kuttl set-test-image-vars set-image-controller container container-target-allocator start-kind install-metrics-server load-image-all
mkdir -p tests/_build/crds tests/_build/manifests
$(KUSTOMIZE) build config/default -o tests/_build/manifests/01-opentelemetry-operator.yaml
$(KUSTOMIZE) build config/crd -o tests/_build/crds/
Expand Down Expand Up @@ -184,6 +184,10 @@ container-target-allocator:
start-kind:
kind create cluster --config $(KIND_CONFIG)

.PHONY: install-metrics-server
install-metrics-server:
./hack/install-metrics-server.sh

.PHONY: load-image-all
load-image-all: load-image-operator load-image-target-allocator

Expand Down
6 changes: 5 additions & 1 deletion apis/v1alpha1/opentelemetrycollector_types.go
Original file line number Diff line number Diff line change
Expand Up @@ -33,10 +33,14 @@ type OpenTelemetryCollectorSpec struct {
// +optional
Args map[string]string `json:"args,omitempty"`

// Replicas is the number of pod instances for the underlying OpenTelemetry Collector
// Replicas is the number of pod instances for the underlying OpenTelemetry Collector. Set this if your are not using autoscaling
// +optional
Replicas *int32 `json:"replicas,omitempty"`

// MinReplicas sets a lower bound to the autoscaling feature. Set this if your are using autoscaling. It must be at least 1
// +optional
MinReplicas *int32 `json:"minReplicas,omitempty"`
kevinearls marked this conversation as resolved.
Show resolved Hide resolved

// MaxReplicas sets an upper bound to the autoscaling feature. If MaxReplicas is set autoscaling is enabled.
// +optional
MaxReplicas *int32 `json:"maxReplicas,omitempty"`
Expand Down
9 changes: 9 additions & 0 deletions apis/v1alpha1/opentelemetrycollector_webhook.go
Original file line number Diff line number Diff line change
Expand Up @@ -120,6 +120,15 @@ func (r *OpenTelemetryCollector) validateCRDSpec() error {
if r.Spec.Replicas != nil && *r.Spec.Replicas > *r.Spec.MaxReplicas {
return fmt.Errorf("the OpenTelemetry Spec autoscale configuration is incorrect, replicas must not be greater than maxReplicas")
}

if r.Spec.MinReplicas != nil && *r.Spec.MinReplicas > *r.Spec.MaxReplicas {
return fmt.Errorf("the OpenTelemetry Spec autoscale configuration is incorrect, minReplicas must not be greater than maxReplicas")
}

if r.Spec.MinReplicas != nil && *r.Spec.MinReplicas < int32(1) {
return fmt.Errorf("the OpenTelemetry Spec autoscale configuration is incorrect, minReplicas should be one or more")
}

}

return nil
Expand Down
5 changes: 5 additions & 0 deletions apis/v1alpha1/zz_generated.deepcopy.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Original file line number Diff line number Diff line change
Expand Up @@ -223,6 +223,11 @@ spec:
If MaxReplicas is set autoscaling is enabled.
format: int32
type: integer
minReplicas:
description: MinReplicas sets a lower bound to the autoscaling feature. Set
this if your are using autoscaling. It must be at least 1
format: int32
type: integer
mode:
description: Mode represents how the collector should be deployed
(deployment, daemonset, statefulset or sidecar)
Expand Down Expand Up @@ -475,7 +480,7 @@ spec:
x-kubernetes-list-type: atomic
replicas:
description: Replicas is the number of pod instances for the underlying
OpenTelemetry Collector
OpenTelemetry Collector. Set this if your are not using autoscaling
format: int32
type: integer
resources:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -221,6 +221,11 @@ spec:
If MaxReplicas is set autoscaling is enabled.
format: int32
type: integer
minReplicas:
description: MinReplicas sets a lower bound to the autoscaling feature. Set
kevinearls marked this conversation as resolved.
Show resolved Hide resolved
this if your are using autoscaling. It must be at least 1
format: int32
type: integer
mode:
description: Mode represents how the collector should be deployed
(deployment, daemonset, statefulset or sidecar)
Expand Down Expand Up @@ -473,7 +478,7 @@ spec:
x-kubernetes-list-type: atomic
replicas:
description: Replicas is the number of pod instances for the underlying
OpenTelemetry Collector
OpenTelemetry Collector. Set this if your are not using autoscaling
format: int32
type: integer
resources:
Expand Down
11 changes: 10 additions & 1 deletion docs/api.md
Original file line number Diff line number Diff line change
Expand Up @@ -1455,6 +1455,15 @@ OpenTelemetryCollectorSpec defines the desired state of OpenTelemetryCollector.
<i>Format</i>: int32<br/>
</td>
<td>false</td>
</tr><tr>
<td><b>minReplicas</b></td>
<td>integer</td>
<td>
MinReplicas sets a lower bound to the autoscaling feature. Set this if your are using autoscaling. It must be at least 1<br/>
<br/>
<i>Format</i>: int32<br/>
</td>
<td>false</td>
</tr><tr>
<td><b>mode</b></td>
<td>enum</td>
Expand Down Expand Up @@ -1496,7 +1505,7 @@ OpenTelemetryCollectorSpec defines the desired state of OpenTelemetryCollector.
<td><b>replicas</b></td>
<td>integer</td>
<td>
Replicas is the number of pod instances for the underlying OpenTelemetry Collector<br/>
Replicas is the number of pod instances for the underlying OpenTelemetry Collector. Set this if your are not using autoscaling<br/>
<br/>
<i>Format</i>: int32<br/>
</td>
Expand Down
7 changes: 7 additions & 0 deletions hack/install-metrics-server.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
#!/bin/bash

# Install metrics-server on kind clusters for autoscale tests. Note: This is not needed for minikube,
# you can just add --addons "metrics-server" to the start command.
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
kubectl patch deployment -n kube-system metrics-server --type "json" -p '[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": --kubelet-insecure-tls}]'
kubectl wait --for=condition=available deployment/metrics-server -n kube-system --timeout=5m
9 changes: 5 additions & 4 deletions pkg/collector/horizontalpodautoscaler.go
Original file line number Diff line number Diff line change
Expand Up @@ -35,17 +35,18 @@ func HorizontalPodAutoscaler(cfg config.Config, logger logr.Logger, otelcol v1al

return autoscalingv1.HorizontalPodAutoscaler{
ObjectMeta: metav1.ObjectMeta{
Name: naming.Collector(otelcol),
Name: naming.HorizontalPodAutoscaler(otelcol),
Namespace: otelcol.Namespace,
Labels: labels,
Annotations: annotations,
},
Spec: autoscalingv1.HorizontalPodAutoscalerSpec{
ScaleTargetRef: autoscalingv1.CrossVersionObjectReference{
APIVersion: "apps/v1",
Kind: "Deployment",
Name: naming.Collector(otelcol),
APIVersion: v1alpha1.GroupVersion.String(),
Kind: "OpenTelemetryCollector",
Name: naming.OpenTelemetryCollector(otelcol),
},

MinReplicas: otelcol.Spec.Replicas,
MaxReplicas: *otelcol.Spec.MaxReplicas,
TargetCPUUtilizationPercentage: &cpuTarget,
Expand Down
5 changes: 0 additions & 5 deletions pkg/collector/reconcile/deployment.go
Original file line number Diff line number Diff line change
Expand Up @@ -99,11 +99,6 @@ func expectedDeployments(ctx context.Context, params Params, expected []appsv1.D
updated.ObjectMeta.Labels[k] = v
}

if params.Instance.Spec.MaxReplicas != nil && desired.Labels["app.kubernetes.io/component"] == "opentelemetry-collector" {
currentReplicas := currentReplicasWithHPA(params.Instance.Spec, existing.Status.Replicas)
updated.Spec.Replicas = &currentReplicas
}

// Selector is an immutable field, if set, we cannot modify it otherwise we will face reconciliation error.
updated.Spec.Selector = existing.Spec.Selector.DeepCopy()

Expand Down
7 changes: 6 additions & 1 deletion pkg/collector/reconcile/horizontalpodautoscaler.go
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,7 @@ func HorizontalPodAutoscalers(ctx context.Context, params Params) error {
}

func expectedHorizontalPodAutoscalers(ctx context.Context, params Params, expected []autoscalingv1.HorizontalPodAutoscaler) error {
one := int32(1)
for _, obj := range expected {
desired := obj

Expand Down Expand Up @@ -81,9 +82,13 @@ func expectedHorizontalPodAutoscalers(ctx context.Context, params Params, expect
}

updated.OwnerReferences = desired.OwnerReferences
updated.Spec.MinReplicas = params.Instance.Spec.Replicas
if params.Instance.Spec.MaxReplicas != nil {
updated.Spec.MaxReplicas = *params.Instance.Spec.MaxReplicas
if params.Instance.Spec.MinReplicas != nil {
updated.Spec.MinReplicas = params.Instance.Spec.MinReplicas
} else {
updated.Spec.MinReplicas = &one
}
}

for k, v := range desired.Annotations {
Expand Down
74 changes: 74 additions & 0 deletions pkg/collector/upgrade/v0_56_0.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,74 @@
// Copyright The OpenTelemetry Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package upgrade

import (
"context"
"fmt"

autoscalingv1 "k8s.io/api/autoscaling/v1"
"sigs.k8s.io/controller-runtime/pkg/client"

"github.com/open-telemetry/opentelemetry-operator/apis/v1alpha1"
"github.com/open-telemetry/opentelemetry-operator/pkg/naming"
)

func upgrade0_56_0(u VersionUpgrade, otelcol *v1alpha1.OpenTelemetryCollector) (*v1alpha1.OpenTelemetryCollector, error) {
// return if this does not use an autoscaler
if otelcol.Spec.MaxReplicas == nil {
return otelcol, nil
}

// Add minReplicas
one := int32(1)
otelcol.Spec.MinReplicas = &one

// Find the existing HPA for this collector and upgrade it if necessary
listOptions := []client.ListOption{
client.InNamespace(otelcol.Namespace),
client.MatchingLabels(map[string]string{
"app.kubernetes.io/instance": fmt.Sprintf("%s.%s", otelcol.Namespace, otelcol.Name),
"app.kubernetes.io/managed-by": "opentelemetry-operator",
}),
}

hpaList := &autoscalingv1.HorizontalPodAutoscalerList{}
ctx := context.Background()
if err := u.Client.List(ctx, hpaList, listOptions...); err != nil {
return nil, err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should wrap the error to make it clear what operation failed

}

for i := range hpaList.Items {
existing := hpaList.Items[i]
// If there is an autoscaler based on Deployment, replace it with one based on OpenTelemetryCollector
if existing.Spec.ScaleTargetRef.Kind == "Deployment" {
updated := existing.DeepCopy()
updated.Spec.ScaleTargetRef = autoscalingv1.CrossVersionObjectReference{
Kind: "OpenTelemetryCollector",
Name: naming.OpenTelemetryCollectorName(otelcol.Name),
APIVersion: v1alpha1.GroupVersion.String(),
}
patch := client.MergeFrom(&existing)
err := u.Client.Patch(ctx, updated, patch)
if err != nil {
return nil, err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could continue on an error and then return a multierror at the end wrapped with a message that the upgrade of HPA was not successful.

Either way the upgrade will have to be restarted or done manually.

}
}
}

u.Log.Info("in upgrade0_56_0", "Otel Instance", otelcol.Name, "Upgrade version", u.Version.String())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC the other upgrade procedures use event API to record the "log" messages for the user.


return otelcol, nil
}
59 changes: 59 additions & 0 deletions pkg/collector/upgrade/v0_56_0_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,59 @@
// Copyright The OpenTelemetry Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package upgrade_test

import (
"context"
"testing"

"github.com/stretchr/testify/assert"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/client-go/tools/record"

"github.com/open-telemetry/opentelemetry-operator/apis/v1alpha1"
"github.com/open-telemetry/opentelemetry-operator/internal/version"
"github.com/open-telemetry/opentelemetry-operator/pkg/collector/upgrade"
)

func Test0_56_0Upgrade(t *testing.T) {
one := int32(1)
three := int32(3)

collectorInstance := v1alpha1.OpenTelemetryCollector{
TypeMeta: metav1.TypeMeta{
Kind: "OpenTelemetryCollector",
APIVersion: "v1alpha1",
},
ObjectMeta: metav1.ObjectMeta{
Name: "my-instance",
Namespace: "somewhere",
},
Spec: v1alpha1.OpenTelemetryCollectorSpec{
Replicas: &one,
MaxReplicas: &three,
},
}

collectorInstance.Status.Version = "0.55.0"
versionUpgrade := &upgrade.VersionUpgrade{
Log: logger,
Version: version.Get(),
Client: k8sClient,
Recorder: record.NewFakeRecorder(upgrade.RecordBufferSize),
}
upgradedInstance, err := versionUpgrade.ManagedInstance(context.Background(), collectorInstance)
assert.NoError(t, err)
assert.Equal(t, one, *upgradedInstance.Spec.MinReplicas)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test should as well assert that the HPA object was changed accordingly.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is possible as upgrade get the HPA by calling the k8sclient. It's not in the object that is passed in:

https://jenkins-cpaas-distributed-tracing.apps.cpaas-poc.r6c9.p1.openshiftapps.com/

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be possible the test can access the k8sClient and create the objects before running the upgrade procedure.

}
4 changes: 4 additions & 0 deletions pkg/collector/upgrade/versions.go
Original file line number Diff line number Diff line change
Expand Up @@ -73,6 +73,10 @@ var (
Version: *semver.MustParse("0.43.0"),
upgrade: upgrade0_43_0,
},
{
Version: *semver.MustParse("0.56.0"),
upgrade: upgrade0_56_0,
},
}

// Latest represents the latest version that we need to upgrade. This is not necessarily the latest known version.
Expand Down
15 changes: 15 additions & 0 deletions pkg/naming/main.go
Original file line number Diff line number Diff line change
Expand Up @@ -54,6 +54,21 @@ func Collector(otelcol v1alpha1.OpenTelemetryCollector) string {
return DNSName(Truncate("%s-collector", 63, otelcol.Name))
}

// HorizontalPodAutoscaler builds the autoscaler name based on the instance.
func HorizontalPodAutoscaler(otelcol v1alpha1.OpenTelemetryCollector) string {
return DNSName(Truncate("%s-collector", 63, otelcol.Name))
}

// HorizontalPodAutoscaler builds the collector (deployment/daemonset) name based on the instance.
func OpenTelemetryCollector(otelcol v1alpha1.OpenTelemetryCollector) string {
return DNSName(Truncate("%s", 63, otelcol.Name))
}

// HorizontalPodAutoscaler builds the collector (deployment/daemonset) name based on the instance.
func OpenTelemetryCollectorName(otelcolName string) string {
return DNSName(Truncate("%s", 63, otelcolName))
}

// TargetAllocator returns the TargetAllocator deployment resource name.
func TargetAllocator(otelcol v1alpha1.OpenTelemetryCollector) string {
return DNSName(Truncate("%s-targetallocator", 63, otelcol.Name))
Expand Down
16 changes: 16 additions & 0 deletions tests/e2e/autoscale/00-assert.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: simplest-collector
status:
readyReplicas: 1

---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
name: simplest-collector
spec:
minReplicas: 1
maxReplicas: 2

Loading