Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow version before 0.52 to upgrade #1126

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
22 commits
Select commit Hold shift + click to select a range
3bef8a8
when Spec.selector changed, since it is immutable, the old deployment…
pureklkl Sep 24, 2022
a3f8809
add e2e test.
pureklkl Sep 24, 2022
86b7ebb
address lint.
pureklkl Sep 24, 2022
842f720
add newline to end.
pureklkl Sep 24, 2022
b13f969
add newline to end.
pureklkl Sep 24, 2022
c280e66
Merge branch 'main' into allow-version-before-0.52-to-upgrade
pureklkl Sep 26, 2022
302ef45
add delete options to deployment and statefulset.
pureklkl Sep 28, 2022
6b520ce
remove unecessary propagation policy, the default background is worki…
pureklkl Oct 1, 2022
2a6c730
Add unit test.
pureklkl Oct 1, 2022
30fa35a
revert accident change.
pureklkl Oct 1, 2022
dbc4074
remove timeout wait, don't 2 changes in one reconcile cycle.
pureklkl Oct 3, 2022
690a843
Merge branch 'main' into allow-version-before-0.52-to-upgrade
pavolloffay Oct 3, 2022
3c2c6e9
fix lint
pureklkl Oct 3, 2022
ce1b940
fix test, change name to isolate it from other tests.
pureklkl Oct 3, 2022
db8da58
revert accident change.
pureklkl Oct 3, 2022
0b4d703
remove unsed constant.
pureklkl Oct 3, 2022
ce19c23
Merge branch 'main' into allow-version-before-0.52-to-upgrade
pureklkl Oct 5, 2022
06f8a4f
Merge branch 'main' into allow-version-before-0.52-to-upgrade
pureklkl Oct 7, 2022
620f486
Add comment for explaining the upgrading e2e test.
pureklkl Oct 10, 2022
50db127
add e2e upgrade to workflow.
pureklkl Oct 10, 2022
e2889f7
Address comment, adding detail comments for e2e upgrade test.
pureklkl Oct 11, 2022
a10fa97
rever accident change.
pureklkl Oct 11, 2022
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .github/workflows/e2e.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -35,4 +35,4 @@ jobs:
- name: "run tests"
env:
KUBE_VERSION: ${{ matrix.kube-version }}
run: make prepare-e2e e2e KUBE_VERSION=$KUBE_VERSION
run: make prepare-e2e e2e e2e-upgrade KUBE_VERSION=$KUBE_VERSION
5 changes: 5 additions & 0 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -158,6 +158,11 @@ generate: controller-gen api-docs
e2e:
$(KUTTL) test

# end-to-end-test for testing upgrading
.PHONY: e2e-upgrade
e2e-upgrade:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the upgrade test executed on the CI?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just added.

$(KUTTL) test --config kuttl-test-upgrade.yaml

.PHONY: prepare-e2e
prepare-e2e: kuttl set-test-image-vars set-image-controller container container-target-allocator start-kind install-metrics-server load-image-all
mkdir -p tests/_build/crds tests/_build/manifests
Expand Down
26 changes: 26 additions & 0 deletions kuttl-test-upgrade.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
# Make sure that the OT operator after upgrading itself, can upgrade the OT collectors without error.
# The test is based on the version v0.49.0, a breaking change was introduced from PR
# https://github.com/open-telemetry/opentelemetry-operator/pull/797, which added a version label "app.kubernetes.io/version",
# The version label would change between OT operator upgrade, and since at the time, the collector pod selector was the same
# as this labels, resulted in selector being modified during reconciliation which caused error due to the selector is immutable.
# Please be aware of that the collector labels are changeable in various ways, so this issue may happen in any operator < v0.52.0
# which changed the selector to be a static set of labels.
# The fix for this issue including:
# https://github.com/open-telemetry/opentelemetry-operator/issues/840, make the selector be a static set of labels;
# https://github.com/open-telemetry/opentelemetry-operator/issues/1117, delete the old collector to let the operator
# create a new one when the selector changed.
apiVersion: kuttl.dev/v1beta1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add here some docs that explain why this test is there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added comment.

kind: TestSuite
crdDir: ./tests/_build/crds/
artifactsDir: ./tests/_build/artifacts/
kindContainers:
- local/opentelemetry-operator:e2e
- ghcr.io/open-telemetry/opentelemetry-operator/opentelemetry-operator:v0.49.0
commands:
- command: make cert-manager
- command: kubectl apply -f https://github.com/open-telemetry/opentelemetry-operator/releases/download/v0.49.0/opentelemetry-operator.yaml
- command: kubectl rollout status -w deployment/opentelemetry-operator-controller-manager -n opentelemetry-operator-system
- command: sleep 60s
testDirs:
- ./tests/e2e-upgrade/
timeout: 150
14 changes: 11 additions & 3 deletions pkg/collector/reconcile/daemonset.go
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ import (
"fmt"

appsv1 "k8s.io/api/apps/v1"
apiequality "k8s.io/apimachinery/pkg/api/equality"
k8serrors "k8s.io/apimachinery/pkg/api/errors"
"k8s.io/apimachinery/pkg/types"
"sigs.k8s.io/controller-runtime/pkg/client"
Expand Down Expand Up @@ -70,6 +71,16 @@ func expectedDaemonSets(ctx context.Context, params Params, expected []appsv1.Da
return fmt.Errorf("failed to get: %w", err)
}

// Selector is an immutable field, if set, we cannot modify it otherwise we will face reconciliation error.
if !apiequality.Semantic.DeepEqual(desired.Spec.Selector, existing.Spec.Selector) {
params.Log.V(2).Info("Spec.Selector change detected, trying to delete, the new collector daemonset will be created in the next reconcile cycle", "daemonset.name", existing.Name, "daemonset.namespace", existing.Namespace)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When is the next reconcile loop?

Maybe we should be more explicit and force the creation and continue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since the operator is watching the delete event, the next loop will happen when the collector got deleted (you can try manually delete the deployment and operator created almost immediately). Normally we shall not do 2 updates at one time, which will generate extra events.


if err := params.Client.Delete(ctx, existing); err != nil {
return fmt.Errorf("failed to request deleting daemonset: %w", err)
}
continue
}

// it exists already, merge the two if the end result isn't identical to the existing one
updated := existing.DeepCopy()
if updated.Annotations == nil {
Expand All @@ -89,9 +100,6 @@ func expectedDaemonSets(ctx context.Context, params Params, expected []appsv1.Da
updated.ObjectMeta.Labels[k] = v
}

// Selector is an immutable field, if set, we cannot modify it otherwise we will face reconciliation error.
updated.Spec.Selector = existing.Spec.Selector.DeepCopy()

patch := client.MergeFrom(existing)
if err := params.Client.Patch(ctx, updated, patch); err != nil {
return fmt.Errorf("failed to apply changes: %w", err)
Expand Down
30 changes: 30 additions & 0 deletions pkg/collector/reconcile/daemonset_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -130,4 +130,34 @@ func TestExpectedDaemonsets(t *testing.T) {
assert.True(t, exists)

})

t.Run("change Spec.Selector should recreate daemonset", func(t *testing.T) {

oldDs := collector.DaemonSet(param.Config, logger, param.Instance)
oldDs.Spec.Selector.MatchLabels["app.kubernetes.io/version"] = "latest"
oldDs.Spec.Template.Labels["app.kubernetes.io/version"] = "latest"
oldDs.Name = "update-ds"

err := expectedDaemonSets(context.Background(), param, []v1.DaemonSet{oldDs})
assert.NoError(t, err)
exists, err := populateObjectIfExists(t, &v1.DaemonSet{}, types.NamespacedName{Namespace: "default", Name: oldDs.Name})
assert.NoError(t, err)
assert.True(t, exists)

newDs := collector.DaemonSet(param.Config, logger, param.Instance)
newDs.Name = oldDs.Name
err = expectedDaemonSets(context.Background(), param, []v1.DaemonSet{newDs})
assert.NoError(t, err)
exists, err = populateObjectIfExists(t, &v1.DaemonSet{}, types.NamespacedName{Namespace: "default", Name: oldDs.Name})
assert.NoError(t, err)
assert.False(t, exists)

err = expectedDaemonSets(context.Background(), param, []v1.DaemonSet{newDs})
assert.NoError(t, err)
actual := v1.DaemonSet{}
exists, err = populateObjectIfExists(t, &actual, types.NamespacedName{Namespace: "default", Name: oldDs.Name})
assert.NoError(t, err)
assert.True(t, exists)
assert.Equal(t, newDs.Spec.Selector.MatchLabels, actual.Spec.Selector.MatchLabels)
})
}
14 changes: 11 additions & 3 deletions pkg/collector/reconcile/deployment.go
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ import (
"fmt"

appsv1 "k8s.io/api/apps/v1"
apiequality "k8s.io/apimachinery/pkg/api/equality"
k8serrors "k8s.io/apimachinery/pkg/api/errors"
"k8s.io/apimachinery/pkg/types"
"sigs.k8s.io/controller-runtime/pkg/client"
Expand Down Expand Up @@ -76,6 +77,16 @@ func expectedDeployments(ctx context.Context, params Params, expected []appsv1.D
return fmt.Errorf("failed to get: %w", err)
}

// Selector is an immutable field, if set, we cannot modify it otherwise we will face reconciliation error.
if !apiequality.Semantic.DeepEqual(desired.Spec.Selector, existing.Spec.Selector) {
params.Log.V(2).Info("Spec.Selector change detected, trying to delete, the new collector deployment will be created in the next reconcile cycle ", "deployment.name", existing.Name, "deployment.namespace", existing.Namespace)

if err := params.Client.Delete(ctx, existing); err != nil {
return fmt.Errorf("failed to delete deployment: %w", err)
}
continue
}

// it exists already, merge the two if the end result isn't identical to the existing one
updated := existing.DeepCopy()
if updated.Annotations == nil {
Expand All @@ -95,9 +106,6 @@ func expectedDeployments(ctx context.Context, params Params, expected []appsv1.D
updated.ObjectMeta.Labels[k] = v
}

// Selector is an immutable field, if set, we cannot modify it otherwise we will face reconciliation error.
updated.Spec.Selector = existing.Spec.Selector.DeepCopy()

patch := client.MergeFrom(existing)

if err := params.Client.Patch(ctx, updated, patch); err != nil {
Expand Down
30 changes: 30 additions & 0 deletions pkg/collector/reconcile/deployment_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -342,6 +342,36 @@ func TestExpectedDeployments(t *testing.T) {
assert.True(t, exists)

})

t.Run("change Spec.Selector should recreate deployment", func(t *testing.T) {

oldDeploy := collector.Deployment(param.Config, logger, param.Instance)
oldDeploy.Spec.Selector.MatchLabels["app.kubernetes.io/version"] = "latest"
oldDeploy.Spec.Template.Labels["app.kubernetes.io/version"] = "latest"
oldDeploy.Name = "update-deploy"

err := expectedDeployments(context.Background(), param, []v1.Deployment{oldDeploy})
assert.NoError(t, err)
exists, err := populateObjectIfExists(t, &v1.Deployment{}, types.NamespacedName{Namespace: "default", Name: oldDeploy.Name})
assert.NoError(t, err)
assert.True(t, exists)

newDeploy := collector.Deployment(param.Config, logger, param.Instance)
newDeploy.Name = oldDeploy.Name
err = expectedDeployments(context.Background(), param, []v1.Deployment{newDeploy})
assert.NoError(t, err)
exists, err = populateObjectIfExists(t, &v1.Deployment{}, types.NamespacedName{Namespace: "default", Name: oldDeploy.Name})
assert.NoError(t, err)
assert.False(t, exists)

err = expectedDeployments(context.Background(), param, []v1.Deployment{newDeploy})
assert.NoError(t, err)
actual := v1.Deployment{}
exists, err = populateObjectIfExists(t, &actual, types.NamespacedName{Namespace: "default", Name: oldDeploy.Name})
assert.NoError(t, err)
assert.True(t, exists)
assert.Equal(t, newDeploy.Spec.Selector.MatchLabels, actual.Spec.Selector.MatchLabels)
})
}

func TestCurrentReplicasWithHPA(t *testing.T) {
Expand Down
14 changes: 11 additions & 3 deletions pkg/collector/reconcile/statefulset.go
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ import (
"fmt"

appsv1 "k8s.io/api/apps/v1"
apiequality "k8s.io/apimachinery/pkg/api/equality"
k8serrors "k8s.io/apimachinery/pkg/api/errors"
"k8s.io/apimachinery/pkg/types"
"sigs.k8s.io/controller-runtime/pkg/client"
Expand Down Expand Up @@ -71,6 +72,16 @@ func expectedStatefulSets(ctx context.Context, params Params, expected []appsv1.
return fmt.Errorf("failed to get: %w", err)
}

// Selector is an immutable field, if set, we cannot modify it otherwise we will face reconciliation error.
if !apiequality.Semantic.DeepEqual(desired.Spec.Selector, existing.Spec.Selector) {
params.Log.V(2).Info("Spec.Selector change detected, trying to delete, the new collector statfulset will be created in the next reconcile cycle", "statefulset.name", existing.Name, "statefulset.namespace", existing.Namespace)

if err := params.Client.Delete(ctx, existing); err != nil {
return fmt.Errorf("failed to delete statefulset: %w", err)
}
continue
}

// it exists already, merge the two if the end result isn't identical to the existing one
updated := existing.DeepCopy()
if updated.Annotations == nil {
Expand All @@ -90,9 +101,6 @@ func expectedStatefulSets(ctx context.Context, params Params, expected []appsv1.
updated.ObjectMeta.Labels[k] = v
}

// Selector is an immutable field, if set, we cannot modify it otherwise we will face reconciliation error.
updated.Spec.Selector = existing.Spec.Selector.DeepCopy()

patch := client.MergeFrom(existing)
if err := params.Client.Patch(ctx, updated, patch); err != nil {
return fmt.Errorf("failed to apply changes: %w", err)
Expand Down
30 changes: 30 additions & 0 deletions pkg/collector/reconcile/statefulset_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -132,4 +132,34 @@ func TestExpectedStatefulsets(t *testing.T) {
assert.True(t, exists)

})

t.Run("change Spec.Selector should recreate statefulset", func(t *testing.T) {

oldSs := collector.StatefulSet(param.Config, logger, param.Instance)
oldSs.Spec.Selector.MatchLabels["app.kubernetes.io/version"] = "latest"
oldSs.Spec.Template.Labels["app.kubernetes.io/version"] = "latest"
oldSs.Name = "update-ss"

err := expectedStatefulSets(context.Background(), param, []v1.StatefulSet{oldSs})
assert.NoError(t, err)
exists, err := populateObjectIfExists(t, &v1.StatefulSet{}, types.NamespacedName{Namespace: "default", Name: oldSs.Name})
assert.NoError(t, err)
assert.True(t, exists)

newSs := collector.StatefulSet(param.Config, logger, param.Instance)
newSs.Name = oldSs.Name
err = expectedStatefulSets(context.Background(), param, []v1.StatefulSet{newSs})
assert.NoError(t, err)
exists, err = populateObjectIfExists(t, &v1.StatefulSet{}, types.NamespacedName{Namespace: "default", Name: oldSs.Name})
assert.NoError(t, err)
assert.False(t, exists)

err = expectedStatefulSets(context.Background(), param, []v1.StatefulSet{newSs})
assert.NoError(t, err)
actual := v1.StatefulSet{}
exists, err = populateObjectIfExists(t, &actual, types.NamespacedName{Namespace: "default", Name: oldSs.Name})
assert.NoError(t, err)
assert.True(t, exists)
assert.Equal(t, newSs.Spec.Selector.MatchLabels, actual.Spec.Selector.MatchLabels)
})
}
69 changes: 69 additions & 0 deletions tests/e2e-upgrade/upgrade-test/00-assert.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,69 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: simplest-collector
annotations:
operatorVersion: "v0.49.0"
spec:
selector:
matchLabels:
app.kubernetes.io/version: latest
status:
readyReplicas: 1
---

apiVersion: v1
kind: Service
metadata:
name: simplest-collector-headless
spec:
ports:
- appProtocol: grpc
name: jaeger-grpc
port: 14250
protocol: TCP
targetPort: 14250
- appProtocol: grpc
name: otlp-grpc
port: 4317
protocol: TCP
targetPort: 4317
- appProtocol: http
name: otlp-http
port: 4318
protocol: TCP
targetPort: 4318
- appProtocol: http
name: otlp-http-legacy
port: 55681
protocol: TCP
targetPort: 4318

---

apiVersion: v1
kind: Service
metadata:
name: simplest-collector
spec:
ports:
- appProtocol: grpc
name: jaeger-grpc
port: 14250
protocol: TCP
targetPort: 14250
- appProtocol: grpc
name: otlp-grpc
port: 4317
protocol: TCP
targetPort: 4317
- appProtocol: http
name: otlp-http
port: 4318
protocol: TCP
targetPort: 4318
- appProtocol: http
name: otlp-http-legacy
port: 55681
protocol: TCP
targetPort: 4318
28 changes: 28 additions & 0 deletions tests/e2e-upgrade/upgrade-test/00-install.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
name: simplest
annotations:
operatorVersion: "v0.49.0"
spec:
replicas: 1
config: |
receivers:
jaeger:
protocols:
grpc:
otlp:
protocols:
grpc:
http:
processors:

exporters:
logging:

service:
pipelines:
traces:
receivers: [jaeger,otlp]
processors: []
exporters: [logging]
6 changes: 6 additions & 0 deletions tests/e2e-upgrade/upgrade-test/01-upgrade-operator.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
apiVersion: kuttl.dev/v1beta1
kind: TestStep
commands:
- command: kubectl apply -f ../../_build/manifests/01-opentelemetry-operator.yaml
- command: kubectl rollout status -w deployment/opentelemetry-operator-controller-manager -n opentelemetry-operator-system
- command: sleep 60s
8 changes: 8 additions & 0 deletions tests/e2e-upgrade/upgrade-test/02-assert.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
apiVersion: apps/v1
kind: Deployment
metadata:
name: simplest-collector
annotations:
operatorVersion: "latest"
status:
readyReplicas: 2
Loading