Skip to content

Commit

Permalink
Add OpenTelemetryCollector.Spec.UpgradeStrategy with allowable values…
Browse files Browse the repository at this point in the history
…: 'automatic', 'none'

fixes open-telemetry#598

Signed-off-by: Adrian Kostrubiak [email protected]

lint

Add read permissions to other users for instrumentation files (open-telemetry#622)

* Add read permissions to other users for instrumentation files

Signed-off-by: Pavol Loffay <[email protected]>

* revert

Signed-off-by: Pavol Loffay <[email protected]>

* Fix

Signed-off-by: Pavol Loffay <[email protected]>

PR feedback; adjust test suites to start up / spin down webhook server so that defaulting works as expected.

TODO for another time to extract much of the duplicated logic across the varying suite_test.go TestMain funcs
  • Loading branch information
adriankostrubiak-tomtom committed Dec 9, 2021
1 parent 5cf7e24 commit 11d7f9e
Show file tree
Hide file tree
Showing 13 changed files with 459 additions and 70 deletions.
19 changes: 19 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -57,6 +57,25 @@ The `config` node holds the `YAML` that should be passed down as-is to the under

At this point, the Operator does *not* validate the contents of the configuration file: if the configuration is invalid, the instance will still be created but the underlying OpenTelemetry Collector might crash.


### Upgrades

As noted above, the OpenTelemetry Collector format is continuing to evolve. However, a best-effort attempt is made to upgrade all managed `OpenTelemetryCollector` resources.

In certain scenarios, it may be desirable to prevent the operator from upgrading certain `OpenTelemetryCollector` resources. For example, when a resource is configured with a custom `.Spec.Image`, end users may wish to manage configuration themselves as opposed to having the operator upgrade it. This can be configured on a resource by resource basis with the exposed property `.Spec.UpgradeStrategy`.

By configuring a resource's `.Spec.UpgradeStrategy` to `none`, the operator will skip the given instance during the upgrade routine.

The default and only other acceptable value for `.Spec.UpgradeStrategy` is `automatic`.


### `opentelemetry-operator` vs `OpenTelemetryCollector` Versioning

By default, the operator ensures consistent versioning between itself and the managed `OpenTelemetryCollector` resources. That is, if the operator is based on version `0.40.0`, it will create resources with an underlying core `opentelemetry-collector` at version `0.40.0`.

When a custom `Spec.Image` is used, the operator will not manage this versioning and upgrading. In this scenario, it is best practice that the operator version should match the underlying core version. Given a `OpenTelemetryCollector` resource with a `Spec.Image` configured to a custom image based on underlying core `opentelemetry-collector` at version `0.40.0`, it is recommended that the operator is kept at version `0.40.0`.


### Deployment modes

The `CustomResource` for the `OpenTelemetryCollector` exposes a property named `.Spec.Mode`, which can be used to specify whether the collector should run as a `DaemonSet`, `Sidecar`, or `Deployment` (default). Look at [this sample](https://github.com/open-telemetry/opentelemetry-operator/blob/main/tests/e2e/daemonset-features/00-install.yaml) for reference.
Expand Down
4 changes: 4 additions & 0 deletions apis/v1alpha1/opentelemetrycollector_types.go
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,10 @@ type OpenTelemetryCollectorSpec struct {
// +required
Config string `json:"config,omitempty"`

// UpgradeStrategy represents how the operator will handle upgrades to the CR when a newer version of the operator is deployed
// +optional
UpgradeStrategy UpgradeStrategy `json:"upgradeStrategy"`

// Args is the set of arguments to pass to the OpenTelemetry Collector binary
// +optional
Args map[string]string `json:"args,omitempty"`
Expand Down
3 changes: 3 additions & 0 deletions apis/v1alpha1/opentelemetrycollector_webhook.go
Original file line number Diff line number Diff line change
Expand Up @@ -43,6 +43,9 @@ func (r *OpenTelemetryCollector) Default() {
if len(r.Spec.Mode) == 0 {
r.Spec.Mode = ModeDeployment
}
if len(r.Spec.UpgradeStrategy) == 0 {
r.Spec.UpgradeStrategy = UpgradeStrategyAutomatic
}

if r.Labels == nil {
r.Labels = map[string]string{}
Expand Down
29 changes: 29 additions & 0 deletions apis/v1alpha1/upgrade_strategy.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
// Copyright The OpenTelemetry Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package v1alpha1

type (
// UpgradeStrategy represents how the operator will handle upgrades to the CR when a newer version of the operator is deployed
// +kubebuilder:validation:Enum=automatic;none
UpgradeStrategy string
)

const (
// UpgradeStrategyAutomatic specifies that the operator will automatically apply upgrades to the CR.
UpgradeStrategyAutomatic UpgradeStrategy = "automatic"

// UpgradeStrategyNone specifies that the operator will not apply any upgrades to the CR.
UpgradeStrategyNone UpgradeStrategy = "none"
)
Original file line number Diff line number Diff line change
Expand Up @@ -689,6 +689,13 @@ spec:
type: string
type: object
type: array
upgradeStrategy:
description: UpgradeStrategy represents how the operator will handle
upgrades to the CR when a newer version of the operator is deployed
enum:
- automatic
- none
type: string
volumeClaimTemplates:
description: VolumeClaimTemplates will provide stable storage using
PersistentVolumes. Only available when the mode=statefulset.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -690,6 +690,13 @@ spec:
type: string
type: object
type: array
upgradeStrategy:
description: UpgradeStrategy represents how the operator will handle
upgrades to the CR when a newer version of the operator is deployed
enum:
- automatic
- none
type: string
volumeClaimTemplates:
description: VolumeClaimTemplates will provide stable storage using
PersistentVolumes. Only available when the mode=statefulset.
Expand Down
84 changes: 80 additions & 4 deletions controllers/suite_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -15,29 +15,46 @@
package controllers_test

import (
"context"
"crypto/tls"
"fmt"
"net"
"os"
"path/filepath"
"sync"
"testing"
"time"

"k8s.io/apimachinery/pkg/runtime"
"k8s.io/apimachinery/pkg/util/wait"
"k8s.io/client-go/kubernetes/scheme"
"k8s.io/client-go/util/retry"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/envtest"

"github.com/open-telemetry/opentelemetry-operator/apis/v1alpha1"
// +kubebuilder:scaffold:imports
)

var k8sClient client.Client
var testEnv *envtest.Environment
var testScheme *runtime.Scheme = scheme.Scheme
var (
k8sClient client.Client
testEnv *envtest.Environment
testScheme *runtime.Scheme = scheme.Scheme
ctx context.Context
cancel context.CancelFunc
)

func TestMain(m *testing.M) {
ctx, cancel = context.WithCancel(context.TODO())
defer cancel()

testEnv = &envtest.Environment{
CRDDirectoryPaths: []string{filepath.Join("..", "config", "crd", "bases")},
WebhookInstallOptions: envtest.WebhookInstallOptions{
Paths: []string{filepath.Join("..", "config", "webhook")},
},
}

cfg, err := testEnv.Start()
if err != nil {
fmt.Printf("failed to start testEnv: %v", err)
Expand All @@ -56,6 +73,65 @@ func TestMain(m *testing.M) {
os.Exit(1)
}

// start webhook server using Manager
webhookInstallOptions := &testEnv.WebhookInstallOptions
mgr, err := ctrl.NewManager(cfg, ctrl.Options{
Scheme: testScheme,
Host: webhookInstallOptions.LocalServingHost,
Port: webhookInstallOptions.LocalServingPort,
CertDir: webhookInstallOptions.LocalServingCertDir,
LeaderElection: false,
MetricsBindAddress: "0",
})
if err != nil {
fmt.Printf("failed to start webhook server: %v", err)
os.Exit(1)
}

if err := (&v1alpha1.OpenTelemetryCollector{}).SetupWebhookWithManager(mgr); err != nil {
fmt.Printf("failed to SetupWebhookWithManager: %v", err)
os.Exit(1)
}

ctx, cancel = context.WithCancel(context.TODO())
defer cancel()
go func() {
if err = mgr.Start(ctx); err != nil {
fmt.Printf("failed to start manager: %v", err)
os.Exit(1)
}
}()

// wait for the webhook server to get ready
wg := &sync.WaitGroup{}
wg.Add(1)
dialer := &net.Dialer{Timeout: time.Second}
addrPort := fmt.Sprintf("%s:%d", webhookInstallOptions.LocalServingHost, webhookInstallOptions.LocalServingPort)
go func(wg *sync.WaitGroup) {
defer wg.Done()
if err = retry.OnError(wait.Backoff{
Steps: 20,
Duration: 10 * time.Millisecond,
Factor: 1.5,
Jitter: 0.1,
Cap: time.Second * 30,
}, func(error) bool {
return true
}, func() error {
// #nosec G402
conn, err := tls.DialWithDialer(dialer, "tcp", addrPort, &tls.Config{InsecureSkipVerify: true})
if err != nil {
return err
}
_ = conn.Close()
return nil
}); err != nil {
fmt.Printf("failed to wait for webhook server to be ready: %v", err)
os.Exit(1)
}
}(wg)
wg.Wait()

code := m.Run()

err = testEnv.Stop()
Expand Down
9 changes: 9 additions & 0 deletions docs/api.md
Original file line number Diff line number Diff line change
Expand Up @@ -508,6 +508,15 @@ OpenTelemetryCollectorSpec defines the desired state of OpenTelemetryCollector.
Toleration to schedule OpenTelemetry Collector pods. This is only relevant to daemonsets, statefulsets and deployments<br/>
</td>
<td>false</td>
</tr><tr>
<td><b>upgradeStrategy</b></td>
<td>enum</td>
<td>
UpgradeStrategy represents how the operator will handle upgrades to the CR when a newer version of the operator is deployed<br/>
<br/>
<i>Enum</i>: automatic, none<br/>
</td>
<td>false</td>
</tr><tr>
<td><b><a href="#opentelemetrycollectorspecvolumeclaimtemplatesindex">volumeClaimTemplates</a></b></td>
<td>[]object</td>
Expand Down
84 changes: 80 additions & 4 deletions internal/webhookhandler/webhookhandler_suite_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -15,29 +15,46 @@
package webhookhandler_test

import (
"context"
"crypto/tls"
"fmt"
"net"
"os"
"path/filepath"
"sync"
"testing"
"time"

"k8s.io/apimachinery/pkg/runtime"
"k8s.io/apimachinery/pkg/util/wait"
"k8s.io/client-go/kubernetes/scheme"
"k8s.io/client-go/util/retry"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/envtest"

"github.com/open-telemetry/opentelemetry-operator/apis/v1alpha1"
// +kubebuilder:scaffold:imports
)

var k8sClient client.Client
var testEnv *envtest.Environment
var testScheme *runtime.Scheme = scheme.Scheme
var (
k8sClient client.Client
testEnv *envtest.Environment
testScheme *runtime.Scheme = scheme.Scheme
ctx context.Context
cancel context.CancelFunc
)

func TestMain(m *testing.M) {
ctx, cancel = context.WithCancel(context.TODO())
defer cancel()

testEnv = &envtest.Environment{
CRDDirectoryPaths: []string{filepath.Join("..", "..", "config", "crd", "bases")},
WebhookInstallOptions: envtest.WebhookInstallOptions{
Paths: []string{filepath.Join("..", "..", "config", "webhook")},
},
}

cfg, err := testEnv.Start()
if err != nil {
fmt.Printf("failed to start testEnv: %v", err)
Expand All @@ -56,6 +73,65 @@ func TestMain(m *testing.M) {
os.Exit(1)
}

// start webhook server using Manager
webhookInstallOptions := &testEnv.WebhookInstallOptions
mgr, err := ctrl.NewManager(cfg, ctrl.Options{
Scheme: testScheme,
Host: webhookInstallOptions.LocalServingHost,
Port: webhookInstallOptions.LocalServingPort,
CertDir: webhookInstallOptions.LocalServingCertDir,
LeaderElection: false,
MetricsBindAddress: "0",
})
if err != nil {
fmt.Printf("failed to start webhook server: %v", err)
os.Exit(1)
}

if err := (&v1alpha1.OpenTelemetryCollector{}).SetupWebhookWithManager(mgr); err != nil {
fmt.Printf("failed to SetupWebhookWithManager: %v", err)
os.Exit(1)
}

ctx, cancel = context.WithCancel(context.TODO())
defer cancel()
go func() {
if err = mgr.Start(ctx); err != nil {
fmt.Printf("failed to start manager: %v", err)
os.Exit(1)
}
}()

// wait for the webhook server to get ready
wg := &sync.WaitGroup{}
wg.Add(1)
dialer := &net.Dialer{Timeout: time.Second}
addrPort := fmt.Sprintf("%s:%d", webhookInstallOptions.LocalServingHost, webhookInstallOptions.LocalServingPort)
go func(wg *sync.WaitGroup) {
defer wg.Done()
if err = retry.OnError(wait.Backoff{
Steps: 20,
Duration: 10 * time.Millisecond,
Factor: 1.5,
Jitter: 0.1,
Cap: time.Second * 30,
}, func(error) bool {
return true
}, func() error {
// #nosec G402
conn, err := tls.DialWithDialer(dialer, "tcp", addrPort, &tls.Config{InsecureSkipVerify: true})
if err != nil {
return err
}
_ = conn.Close()
return nil
}); err != nil {
fmt.Printf("failed to wait for webhook server to be ready: %v", err)
os.Exit(1)
}
}(wg)
wg.Wait()

code := m.Run()

err = testEnv.Stop()
Expand Down
Loading

0 comments on commit 11d7f9e

Please sign in to comment.