feat: deploy SUC #1770

Merged
merged 8 commits into from
Feb 13, 2025
16 changes: 14 additions & 2 deletions README.md
@@ -199,10 +199,22 @@ task talos:apply-node IP=? MODE=?
# e.g. task talos:apply-node IP=10.10.10.10 MODE=auto
```

### ⬆️ Updating Talos and Kubernetes versions
### ⬆️ Upgrading Talos and Kubernetes versions

#### Method 1: System Upgrade Controller (SUC)

> [!IMPORTANT]
> To upgrade, make sure `TALOS_VERSION` and `KUBERNETES_VERSION` in `kubernetes/apps/kube-system/system-upgrade/ks.yaml` are set to the versions you wish to upgrade to. Once your cluster receives this configuration, the upgrade process will kick off in the `kube-system` namespace. Renovate tracks these versions, so once its pull request is merged, SUC will attempt to upgrade Kubernetes / Talos and reboot the nodes.

Talos and Kubernetes upgrades should be handled via the [rancher/system-upgrade-controller](https://github.com/rancher/system-upgrade-controller) which is deployed in the `kube-system` namespace.
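
As a rough illustration, the two variables named above sit under `postBuild.substitute` in the `system-upgrade-plans` Kustomization. The excerpt below is trimmed from the `ks.yaml` added in this PR, and the version values are simply the ones pinned at the time, not a recommendation:

```yaml
# kubernetes/apps/kube-system/system-upgrade/ks.yaml (excerpt)
spec:
  postBuild:
    substitute:
      # renovate: datasource=docker depName=ghcr.io/siderolabs/installer
      TALOS_VERSION: v1.9.3
      # renovate: datasource=docker depName=ghcr.io/siderolabs/kubelet
      KUBERNETES_VERSION: v1.32.2
```

Flux substitutes these values into the Plan manifests wherever `${TALOS_VERSION}` and `${KUBERNETES_VERSION}` appear, so changing a value here and merging is what starts the corresponding upgrade.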

#### Method 2: Taskfile

> [!WARNING]
> Upgrading via this method can interfere with the System Upgrade Controller. SUC could potentially downgrade Talos or Kubernetes versions if care is not taken.
---
> [!IMPORTANT]
> Ensure the `talosVersion` and `kubernetesVersion` in `talconfig.yaml` are up-to-date with the version you wish to upgrade to.
> In order to upgrade make sure `talosVersion` and `kubernetesVersion` in `talconfig.yaml` are set to the versions you wish to upgrade to.
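
One way to avoid the conflict described in the warning is to keep `talconfig.yaml` in step with the versions SUC is given. A minimal sketch, assuming `talosVersion` and `kubernetesVersion` are top-level keys in that file and reusing the versions pinned for SUC in this PR purely as example values:

```yaml
# talconfig.yaml (excerpt; values are illustrative)
# Keep these equal to TALOS_VERSION / KUBERNETES_VERSION in
# kubernetes/apps/kube-system/system-upgrade/ks.yaml so SUC does not
# try to move the cluster back to a different version.
talosVersion: v1.9.3
kubernetesVersion: v1.32.2
```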

```sh
# Upgrade node to a newer Talos version
@@ -65,7 +65,6 @@ spec:
runAsNonRoot: true
runAsUser: 65534
runAsGroup: 65534
seccompProfile: { type: RuntimeDefault }
service:
app:
controller: echo-server
@@ -11,3 +11,4 @@ resources:
- ./metrics-server/ks.yaml
- ./reloader/ks.yaml
- ./spegel/ks.yaml
- ./system-upgrade/ks.yaml
@@ -0,0 +1,64 @@
---
# yaml-language-server: $schema=https://raw.githubusercontent.com/bjw-s/helm-charts/main/charts/other/app-template/schemas/helmrelease-helm-v2.schema.json
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: &app system-upgrade
spec:
  interval: 30m
  chart:
    spec:
      chart: app-template
      version: 3.7.1
      sourceRef:
        kind: HelmRepository
        name: bjw-s
        namespace: flux-system
  install:
    remediation:
      retries: 3
  upgrade:
    cleanupOnFail: true
    remediation:
      strategy: rollback
      retries: 3
  values:
    controllers:
      system-upgrade:
        strategy: RollingUpdate
        containers:
          app:
            image:
              repository: docker.io/rancher/system-upgrade-controller
              tag: v0.15.0-rc2@sha256:d6faa9cb5123ae14cfbf0e9e22fa5302e1369649a6f1d117874c30a2a8df732b
            env:
              SYSTEM_UPGRADE_CONTROLLER_NAME: *app
              SYSTEM_UPGRADE_CONTROLLER_NAMESPACE:
                valueFrom:
                  fieldRef:
                    fieldPath: metadata.namespace
              SYSTEM_UPGRADE_JOB_BACKOFF_LIMIT: "99"
              SYSTEM_UPGRADE_JOB_PRIVILEGED: false
            securityContext:
              allowPrivilegeEscalation: false
              readOnlyRootFilesystem: true
              capabilities: { drop: ["ALL"] }
    defaultPodOptions:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: node-role.kubernetes.io/control-plane
                    operator: Exists
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
        runAsGroup: 65534
      tolerations:
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule
    serviceAccount:
      name: *app
      create: true
@@ -0,0 +1,7 @@
---
# yaml-language-server: $schema=https://json.schemastore.org/kustomization
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ./helmrelease.yaml
  - ./rbac.yaml
@@ -0,0 +1,20 @@
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system-upgrade
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: system-upgrade
    namespace: kube-system
---
apiVersion: talos.dev/v1alpha1
kind: ServiceAccount
metadata:
  name: system-upgrade
spec:
  roles: ["os:admin"]
@@ -0,0 +1,51 @@
---
# yaml-language-server: $schema=https://raw.githubusercontent.com/fluxcd-community/flux2-schemas/main/kustomization-kustomize-v1.json
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: &app system-upgrade
  namespace: &namespace kube-system
spec:
  commonMetadata:
    labels:
      app.kubernetes.io/name: *app
  interval: 30m
  path: ./kubernetes/apps/kube-system/system-upgrade/app
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
    namespace: flux-system
  targetNamespace: *namespace
  timeout: 5m
  wait: true
---
# yaml-language-server: $schema=https://raw.githubusercontent.com/fluxcd-community/flux2-schemas/main/kustomization-kustomize-v1.json
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: &app system-upgrade-plans
  namespace: &namespace kube-system
spec:
  commonMetadata:
    labels:
      app.kubernetes.io/name: *app
  dependsOn:
    - name: system-upgrade
      namespace: kube-system
  interval: 30m
  path: ./kubernetes/apps/kube-system/system-upgrade/plans
  postBuild:
    substitute:
      # renovate: datasource=docker depName=ghcr.io/siderolabs/installer
      TALOS_VERSION: v1.9.3
      # renovate: datasource=docker depName=ghcr.io/siderolabs/kubelet
      KUBERNETES_VERSION: v1.32.2
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
    namespace: flux-system
  targetNamespace: *namespace
  timeout: 5m
  wait: false
@@ -0,0 +1,25 @@
---
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: kubernetes
spec:
  version: ${KUBERNETES_VERSION}
  concurrency: 1
  postCompleteDelay: 30s
  exclusive: true
  serviceAccountName: system-upgrade
  secrets:
    - name: system-upgrade
      path: /var/run/secrets/talos.dev
      ignoreUpdates: true
  nodeSelector:
    matchExpressions:
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
  upgrade:
    image: ghcr.io/siderolabs/talosctl:${TALOS_VERSION}
    args:
      - --nodes=$(SYSTEM_UPGRADE_NODE_NAME)
      - upgrade-k8s
      - --to=$(SYSTEM_UPGRADE_PLAN_LATEST_VERSION)
@@ -0,0 +1,7 @@
---
# yaml-language-server: $schema=https://json.schemastore.org/kustomization
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ./kubernetes.yaml
  - ./talos.yaml
@@ -0,0 +1,25 @@
---
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: talos
spec:
  version: ${TALOS_VERSION}
  concurrency: 1
  postCompleteDelay: 2m
  exclusive: true
  serviceAccountName: system-upgrade
  secrets:
    - name: system-upgrade
      path: /var/run/secrets/talos.dev
      ignoreUpdates: true
  nodeSelector:
    matchExpressions:
      - key: kubernetes.io/os
        operator: In
        values: ["linux"]
  upgrade:
    image: ghcr.io/jfroy/tnu:0.4.0
    args:
      - --node=$(SYSTEM_UPGRADE_NODE_NAME)
      - --tag=$(SYSTEM_UPGRADE_PLAN_LATEST_VERSION)
@@ -77,7 +77,6 @@ spec:
runAsNonRoot: true
runAsUser: 65534
runAsGroup: 65534
seccompProfile: { type: RuntimeDefault }
service:
app:
controller: cloudflared
@@ -0,0 +1,6 @@
machine:
  features:
    kubernetesTalosAPIAccess:
      enabled: true
      allowedRoles: ["os:admin"]
      allowedKubernetesNamespaces: ["kube-system"]