Kubernetes Collection 1.0.0 - Breaking Changes

Helm Users
Non-Helm Users
- Changes
- How to upgrade
  - 1. Tear down existing collection resources
  - 2. Deploy New Resources
Kubernetes App dashboard update

Based on feedback from our users, we are introducing several changes to the Sumo Logic Kubernetes Collection solution. This document details the changes for both Helm and non-Helm users, along with the exact steps for migration.

Helm Users

Changes

Falco installation is disabled by default. To enable Falco, set the enabled flag for Falco to true in values.yaml as shown below:

falco:
  ## Set the enabled flag to false to disable falco.
  enabled: true

The Helm chart for Falco has been bumped to v1.1.6, which includes a fix that disables the bitcoin/crypto miner rule by default.
Changes in Configuration Parameters
- Several configs in the values.yaml file have been moved and renamed to improve usability. Most notably, a new fluentd section now holds all of the Fluentd-specific configs. Configs for our dependency charts (prometheus-operator, fluent-bit, metrics-server, falco) have not changed.

Old Config	New Config
sumologic.eventCollectionEnabled	fluentd.events.enabled
sumologic.events.sourceCategory	fluentd.events.sourceCategory
sumologic.logFormat	fluentd.logs.output.logFormat
sumologic.flushInterval	fluentd.buffer.flushInterval
sumologic.numThreads	fluentd.buffer.numThreads
sumologic.chunkLimitSize	fluentd.buffer.chunkLimitSize
sumologic.queueChunkLimitSize	fluentd.buffer.queueChunkLimitSize
sumologic.totalLimitSize	fluentd.buffer.totalLimitSize
sumologic.sourceName	fluentd.logs.containers.sourceName
sumologic.sourceCategory	fluentd.logs.containers.sourceCategory
sumologic.sourceCategoryPrefix	fluentd.logs.containers.sourceCategoryPrefix
sumologic.sourceCategoryReplaceDash	fluentd.logs.containers.sourceCategoryReplaceDash
sumologic.addTimestamp	fluentd.logs.output.addTimestamp
sumologic.timestampKey	fluentd.logs.output.timestampKey
sumologic.verifySsl	fluentd.verifySsl
sumologic.excludeContainerRegex	fluentd.logs.containers.excludeContainerRegex
sumologic.excludeHostRegex	fluentd.logs.containers.excludeHostRegex
sumologic.excludeNamespaceRegex	fluentd.logs.containers.excludeNamespaceRegex
sumologic.excludePodRegex	fluentd.logs.containers.excludePodRegex
sumologic.fluentdLogLevel	fluentd.logLevel
sumologic.watchResourceEventsOverrides	fluentd.events.watchResourceEventsOverrides
sumologic.fluentd.buffer	fluentd.buffer.type
sumologic.fluentd.autoscaling.*	fluentd.logs.autoscaling.* , fluentd.metrics.autoscaling.*
sumologic.k8sMetadataFilter.watch	fluentd.logs.containers.k8sMetadataFilter.watch
sumologic.k8sMetadataFilter.verifySsl	fluentd.logs.containers.k8sMetadataFilter.verifySsl
sumologic.k8sMetadataFilter.cacheSize	fluentd.metadata.cacheSize
sumologic.k8sMetadataFilter.cacheTtl	fluentd.metadata.cacheTtl
sumologic.k8sMetadataFilter.cacheRefresh	fluentd.metadata.cacheRefresh
deployment.*	fluentd.logs.statefulset.* , fluentd.metrics.statefulset.*
eventsDeployment.*	fluentd.eventsStatefulset.*

sumologic.kubernetesMeta and sumologic.kubernetesMetaReduce have been removed. The default log format (fluentd.logs.output.logFormat) is fields, which removes the relevant metadata from the JSON body of the logs, making these configs no longer necessary.
sumologic.addStream and sumologic.addTime (default values were true) have been removed; the default behavior will remain the same. To preserve the behavior of addStream = false or addTime = false, you can add the following config to the values.yaml file:

fluentd:
  logs:
    containers:
      extraFilterPluginConf: |-
        <filter **>
          @type record_modifier
          remove_keys stream,time
        </filter>
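As a rough pre-flight check before upgrading, the sketch below scans a values file for a few key names that only exist in the pre-1.0.0 layout. The function name find_legacy_keys and the hand-picked key list are illustrative assumptions, not part of the chart or its tooling. The list covers only a subset of the renamed and removed configs described above, and the match is a simple per-line pattern rather than a real YAML parse.

```shell
# Print every legacy (pre-1.0.0) key name that still appears in a values file.
# NOTE: find_legacy_keys and the key list are hypothetical; extend the list
# with any other old config names you rely on.
find_legacy_keys() {
    local values_file="$1"
    local key
    for key in eventCollectionEnabled fluentdLogLevel kubernetesMeta \
               kubernetesMetaReduce addStream addTime eventsDeployment; do
        # Match "<indent>key:" so kubernetesMeta does not match kubernetesMetaReduce.
        if grep -Eq "^[[:space:]]*${key}:" "$values_file"; then
            echo "$key"
        fi
    done
}

# Example (not executed here):
#   find_legacy_keys values.yaml
```

Each key name found is printed on its own line. Empty output only means that none of the listed names were found, not that the file is fully migrated.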

Until now, Helm users have not been able to modify their Fluentd configuration outside of the specific parameters that we exposed in the values.yaml file. Now, we expose the ability to modify the Fluentd configuration as needed.

Some use cases include:

- custom log pipelines,
- adding Fluentd filter plugins (e.g. the Fluentd throttle plugin), or
- adding Fluentd output plugins (e.g. forwarding to both Sumo and S3)

You can look for example configurations here

The Fluentd deployments have been changed to statefulsets to support the use of persistent volumes. This will allow better buffering behavior. They also now include “fluentd” in their names. This is not a breaking change for Helm users.

The unified Fluentd statefulset has been split into two separate Fluentd statefulsets, one for logs and one for metrics.

How to upgrade

Note: The steps below use Helm 2. Helm 3 is not supported.

1. Upgrade to helm chart version `v0.17.4`

Run the below command to fetch the latest helm chart:

helm repo update

If you are not already on v0.17.4 of the Helm chart, upgrade to that version first by running the command below:

helm upgrade collection sumologic/sumologic --reuse-values --version=0.17.4

2. Run upgrade script

For Helm users, the only breaking changes are the renamed config parameters. For users who use a values.yaml file, we provide a script that users can run to convert their existing values.yaml file into one that is compatible with the major release.

Get the existing values for the Helm chart and store them in current_values.yaml with the command below:

helm get values <RELEASE-NAME> > current_values.yaml

Download the upgrade script with curl as follows:

curl -LJO https://raw.githubusercontent.com/SumoLogic/sumologic-kubernetes-collection/release-v1.0/deploy/helm/sumologic/upgrade-1.0.0.sh

Run the upgrade script on the current_values.yaml file with the command below. If the downloaded script is not executable, run chmod +x upgrade-1.0.0.sh first.

./upgrade-1.0.0.sh current_values.yaml

Once the script has produced new_values.yaml, run the upgrade:

helm upgrade collection sumologic/sumologic --version=1.0.0 -f new_values.yaml

Troubleshooting Upgrade

If you receive the error below, your OS is likely picking up an older version of bash even though you may have upgraded it. Make sure you are running bash >= 4.4 by running bash --version. If the bash version is correct, rerun the upgrade script with bash upgrade-1.0.0.sh current_values.yaml and then rerun helm upgrade collection sumologic/sumologic --version=1.0.0 -f new_values.yaml to resolve the error.

Error: UPGRADE FAILED: error validating "": error validating data: [ValidationError(StatefulSet.spec.template.spec.containers[0].resources.limits.cpu fluentd): invalid type for io.k8s.apimachinery.pkg.api.resource.Quantity: got "map", expected "string", ValidationError(StatefulSet.spec.template.spec.containers[0].resources.limits.memory fluentd): invalid type for io.k8s.apimachinery.pkg.api.resource.Quantity: got "map", expected "string", ValidationError(StatefulSet.spec.template.spec.containers[0].resources.requests.cpu fluentd): invalid type for io.k8s.apimachinery.pkg.api.resource.Quantity: got "map", expected "string", ValidationError(StatefulSet.spec.template.spec.containers[0].resources.requests.memory fluentd): invalid type for io.k8s.apimachinery.pkg.api.resource.Quantity: got "map", expected "string"]
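To confirm which bash version is actually being picked up, the requirement can be checked with a small helper. This is a minimal sketch: the function name version_at_least is hypothetical, and it assumes the version string starts with major.minor, as the BASH_VERSION variable does (for example 5.1.16(1)-release).

```shell
# Return success (0) if version string $1 is at least major.minor $2.
# NOTE: version_at_least is a hypothetical helper for illustration only.
version_at_least() {
    local have="$1" want="$2"
    local have_major="${have%%.*}"        # text before the first dot
    local have_rest="${have#*.}"          # text after the first dot
    local have_minor="${have_rest%%.*}"   # text between first and second dot
    local want_major="${want%%.*}"
    local want_minor="${want#*.}"

    # Different major versions decide the result on their own.
    if [ "$have_major" -ne "$want_major" ]; then
        [ "$have_major" -gt "$want_major" ]
        return
    fi
    # Same major version: compare the minor versions.
    [ "$have_minor" -ge "$want_minor" ]
}

# Example (not executed here), run from the shell that will execute the script:
#   version_at_least "$BASH_VERSION" 4.4 || echo "bash is older than 4.4" >&2
```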

Rollback

If something goes wrong, or you want to go back to the previous version, you can roll back the changes using Helm:

helm history collection
helm rollback collection <REVISION-NUMBER>

Non-Helm Users

Breaking Changes

The use of environment variables to set configs has been removed to avoid the extra layer of indirection and confusion. Instead, configs will be set directly within the Fluentd pipeline.
kubernetesMeta and kubernetesMetaReduce have been removed from logs.kubernetes.sumologic.filter.conf of the Fluentd pipeline, for the same reason given above for Helm users: the default fields log format makes them unnecessary.
Similarly, addStream and addTime (default values were true) have been removed from logs.kubernetes.sumologic.filter.conf of the Fluentd pipeline; the default behavior will remain the same. To preserve the behavior of addStream = false or addTime = false, you can add:

<filter containers.**>
  @type record_modifier
  remove_keys stream,time
</filter>

above the output plugin section here

The Fluentd deployments have been changed to statefulsets to support the use of persistent volumes. This will allow better buffering behavior. They also now include “fluentd” in their names. This is a breaking change for non-Helm users as the deployments will not be cleaned up upon upgrade, leading to duplicate events (logs and metrics will not experience data duplication).
The unified Fluentd statefulset has been split into two separate Fluentd statefulsets, one for logs and one for metrics.
We now support the collection of renamed metrics (for Kubernetes version 1.17+).

How to upgrade for Non-helm Users

1. Tear down existing Fluentd, Prometheus, Fluent Bit and Falco resources

You will need the YAML files you created when you first installed collection. Run the following commands to remove Falco, Fluent Bit, Prometheus Operator and Fluentd. You do not need to delete the Namespace and Secret you originally created, as they will still be used.

kubectl delete -f falco.yaml
kubectl delete -f fluent-bit.yaml
kubectl delete -f prometheus.yaml
kubectl delete -f fluentd-sumologic.yaml

2. Deploy Fluentd, Fluent Bit and Prometheus again with the version 1.0.0 yaml

Follow the below steps to deploy new resources.

2.1 Deploy Fluentd

Non-Helm users who have made changes to configs in the environment variable sections of the fluentd-sumologic.yaml file will need to move those config changes directly into the Fluentd pipeline.
Run the below command to get the fluentd-sumologic.yaml manifest for version v1.0.0 and then make the changes identified in the above step.

curl https://raw.githubusercontent.com/SumoLogic/sumologic-kubernetes-collection/release-v1.0/deploy/kubernetes/fluentd-sumologic.yaml.tmpl | \
sed 's/\$NAMESPACE'"/<NAMESPACE>/g" | \
sed 's/cluster kubernetes/cluster <CLUSTER_NAME>/g' > fluentd-sumologic.yaml
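To make the two substitutions in the command above explicit, here is a minimal sketch that wraps them in a function. The name render_manifest and its arguments are hypothetical. It assumes, as the sed expressions above do, that the template marks the namespace with a literal $NAMESPACE and the cluster name with the text cluster kubernetes, and that neither value contains a / or & character.

```shell
# Read a manifest template on stdin and write the rendered manifest to stdout.
# NOTE: render_manifest is a hypothetical helper for illustration only.
#   $1 = namespace to substitute for the literal text $NAMESPACE
#   $2 = cluster name to substitute for the default "kubernetes"
render_manifest() {
    local namespace="$1" cluster="$2"
    sed -e 's/\$NAMESPACE/'"${namespace}"'/g' \
        -e 's/cluster kubernetes/cluster '"${cluster}"'/g'
}

# Example (not executed here; template.yaml stands for the downloaded template):
#   render_manifest sumologic my-cluster < template.yaml > fluentd-sumologic.yaml
```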

Non-Helm users running a Kubernetes version of 1.13 or older will need to remove the following filter plugin section from their Fluentd pipeline. This is required to prevent data duplication.

<filter prometheus.metrics**> # NOTE: Remove this filter if you are running Kubernetes 1.13 or below.
  @type grep
  <exclude>
    key @metric
    pattern /^apiserver_request_count|^apiserver_request_latencies_summary|^kubelet_runtime_operations_latency_microseconds|^kubelet_docker_operations_latency_microseconds|^kubelet_docker_operations_errors$/
  </exclude>
</filter>
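If you would rather not edit the manifest by hand, the removal can be sketched with awk. The function name strip_renamed_metrics_filter is hypothetical, and the sketch assumes the filter's opening line carries the "Remove this filter" note exactly as shown above, which is what distinguishes it from any other prometheus.metrics filters in the file. Review the result before applying it.

```shell
# Print a manifest with the annotated prometheus.metrics grep filter removed.
# NOTE: strip_renamed_metrics_filter is a hypothetical helper; it relies on the
# "Remove this filter" comment on the filter's opening line.
strip_renamed_metrics_filter() {
    awk '
        # Start skipping at the annotated opening tag.
        /<filter prometheus\.metrics\*\*>.*Remove this filter/ { skipping = 1 }
        # Print every line that is outside the skipped block.
        !skipping { print }
        # Stop skipping after the closing tag of the skipped block.
        skipping && /<\/filter>/ { skipping = 0 }
    ' "$1"
}

# Example (not executed here; the output file name is arbitrary):
#   strip_renamed_metrics_filter fluentd-sumologic.yaml > fluentd-sumologic.trimmed.yaml
```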

2.2 Deploy Prometheus

Follow steps mentioned here to deploy Prometheus.

2.3 Deploy Fluent Bit

Follow steps mentioned here to deploy Fluent Bit.

Kubernetes App dashboard update

After a successful migration, reinstall your Kubernetes App so that you are running the latest version.
