[release-1.25] Backport user-provided CA cert and kubeadm bootstrap token support #6929

Merged
4 changes: 3 additions & 1 deletion cmd/cert/main.go
@@ -17,7 +17,9 @@ func main() {
 	app.Commands = []cli.Command{
 		cmds.NewCertCommand(
 			cmds.NewCertSubcommands(
-				cert.Run),
+				cert.Rotate,
+				cert.RotateCA,
+			),
 		),
 	}
 
11 changes: 10 additions & 1 deletion cmd/k3s/main.go
@@ -37,6 +37,7 @@ func main() {
 		return
 	}
 
+	tokenCommand := internalCLIAction(version.Program+"-"+cmds.TokenCommand, dataDir, os.Args)
 	etcdsnapshotCommand := internalCLIAction(version.Program+"-"+cmds.EtcdSnapshotCommand, dataDir, os.Args)
 	secretsencryptCommand := internalCLIAction(version.Program+"-"+cmds.SecretsEncryptCommand, dataDir, os.Args)
 	certCommand := internalCLIAction(version.Program+"-"+cmds.CertCommand, dataDir, os.Args)
@@ -51,6 +52,12 @@ func main() {
 		cmds.NewCRICTL(externalCLIAction("crictl", dataDir)),
 		cmds.NewCtrCommand(externalCLIAction("ctr", dataDir)),
 		cmds.NewCheckConfigCommand(externalCLIAction("check-config", dataDir)),
+		cmds.NewTokenCommands(
+			tokenCommand,
+			tokenCommand,
+			tokenCommand,
+			tokenCommand,
+		),
 		cmds.NewEtcdSnapshotCommand(etcdsnapshotCommand,
 			cmds.NewEtcdSnapshotSubcommands(
 				etcdsnapshotCommand,
@@ -69,7 +76,9 @@ func main() {
 		),
 		cmds.NewCertCommand(
 			cmds.NewCertSubcommands(
-				certCommand),
+				certCommand,
+				certCommand,
+			),
 		),
 		cmds.NewCompletionCommand(internalCLIAction(version.Program+"-completion", dataDir, os.Args)),
 	}
11 changes: 10 additions & 1 deletion cmd/server/main.go
@@ -17,6 +17,7 @@ import (
 	"github.com/k3s-io/k3s/pkg/cli/kubectl"
 	"github.com/k3s-io/k3s/pkg/cli/secretsencrypt"
 	"github.com/k3s-io/k3s/pkg/cli/server"
+	"github.com/k3s-io/k3s/pkg/cli/token"
 	"github.com/k3s-io/k3s/pkg/configfilearg"
 	"github.com/k3s-io/k3s/pkg/containerd"
 	ctr2 "github.com/k3s-io/k3s/pkg/ctr"
@@ -48,6 +49,12 @@ func main() {
 		cmds.NewKubectlCommand(kubectl.Run),
 		cmds.NewCRICTL(crictl.Run),
 		cmds.NewCtrCommand(ctr.Run),
+		cmds.NewTokenCommands(
+			token.Create,
+			token.Delete,
+			token.Generate,
+			token.List,
+		),
 		cmds.NewEtcdSnapshotCommand(etcdsnapshot.Save,
 			cmds.NewEtcdSnapshotSubcommands(
 				etcdsnapshot.Delete,
@@ -66,7 +73,9 @@ func main() {
 		),
 		cmds.NewCertCommand(
 			cmds.NewCertSubcommands(
-				cert.Run),
+				cert.Rotate,
+				cert.RotateCA,
+			),
 		),
 		cmds.NewCompletionCommand(completion.Run),
 	}
29 changes: 29 additions & 0 deletions cmd/token/main.go
@@ -0,0 +1,29 @@
package main

import (
	"context"
	"errors"
	"os"

	"github.com/k3s-io/k3s/pkg/cli/cmds"
	"github.com/k3s-io/k3s/pkg/cli/token"
	"github.com/k3s-io/k3s/pkg/configfilearg"
	"github.com/sirupsen/logrus"
	"github.com/urfave/cli"
)

func main() {
	app := cmds.NewApp()
	app.Commands = []cli.Command{
		cmds.NewTokenCommands(
			token.Create,
			token.Delete,
			token.Generate,
			token.List,
		),
	}

	if err := app.Run(configfilearg.MustParse(os.Args)); err != nil && !errors.Is(err, context.Canceled) {
		logrus.Fatal(err)
	}
}
126 changes: 126 additions & 0 deletions contrib/util/certs.sh
@@ -0,0 +1,126 @@
#!/usr/bin/env bash

# Example K3s CA certificate generation script.
#
# This script will generate files sufficient to bootstrap K3s cluster certificate
# authorities. By default, the script will create the required files under
# /var/lib/rancher/k3s/server/tls, where they will be found and used by K3s during initial
# cluster startup. Note that these files MUST be present before K3s is started the first
# time; certificate data SHOULD NOT be changed once the cluster has been initialized.
#
# The output path may be overridden with the DATA_DIR environment variable.
#
# This script will also auto-generate certificates and keys for both root and intermediate
# certificate authorities if none are found.
# If you have only an existing root CA, provide:
# root-ca.pem
# root-ca.key.
# If you have an existing root and intermediate CA, provide:
# root-ca.pem
# intermediate-ca.pem
# intermediate-ca.key.

set -e

CONFIG="
[v3_ca]
subjectKeyIdentifier = hash
authorityKeyIdentifier = keyid:always,issuer:always
basicConstraints=CA:true"
TIMESTAMP=$(date +%s)
PRODUCT="${PRODUCT:-k3s}"
DATA_DIR="${DATA_DIR:-/var/lib/rancher/${PRODUCT}}"

if type -t openssl-3 &>/dev/null; then
  OPENSSL=openssl-3
else
  OPENSSL=openssl
fi

echo "Using $(which ${OPENSSL}): $(${OPENSSL} version)"

if ! ${OPENSSL} ecparam -help &>/dev/null; then
  echo "openssl not found or missing Elliptic Curve (ecparam) support."
  exit 1
fi

if ! ${OPENSSL} req -help 2>&1 | grep -q CAkey; then
  echo "openssl req missing -CAkey support; please use OpenSSL 3.0.0 or newer"
  exit 1
fi

mkdir -p "${DATA_DIR}/server/tls/etcd"
cd "${DATA_DIR}/server/tls"

# Don't overwrite the service account issuer key; we pass the key into both the controller-manager
# and the apiserver instead of passing a cert list into the apiserver, so there's no facility for
# rotation and things will get very angry if all the SA keys are invalidated.
if [[ -e service.key ]]; then
  echo "Generating additional Kubernetes service account issuer RSA key"
  OLD_SERVICE_KEY="$(cat service.key)"
else
  echo "Generating Kubernetes service account issuer RSA key"
fi
${OPENSSL} genrsa -traditional -out service.key 2048
echo "${OLD_SERVICE_KEY}" >> service.key

# Use existing root CA if present
if [[ -e root-ca.pem ]]; then
  echo "Using existing root certificate"
else
  echo "Generating root certificate authority RSA key and certificate"
  ${OPENSSL} genrsa -out root-ca.key 4096
  ${OPENSSL} req -x509 -new -nodes -key root-ca.key -sha256 -days 7300 -out root-ca.pem -subj "/CN=${PRODUCT}-root-ca@${TIMESTAMP}" -config <(echo "${CONFIG}") -extensions v3_ca
fi
cat root-ca.pem > root-ca.crt

# Use existing intermediate CA if present
if [[ -e intermediate-ca.pem ]]; then
  echo "Using existing intermediate certificate"
else
  if [[ ! -e root-ca.key ]]; then
    echo "Cannot generate intermediate certificate without root certificate private key"
    exit 1
  fi

  echo "Generating intermediate certificate authority RSA key and certificate"
  ${OPENSSL} genrsa -out intermediate-ca.key 4096
  ${OPENSSL} req -x509 -new -nodes -CAkey root-ca.key -CA root-ca.crt -key intermediate-ca.key -sha256 -days 7300 -out intermediate-ca.pem -subj "/CN=${PRODUCT}-intermediate-ca@${TIMESTAMP}" -config <(echo "${CONFIG}, pathlen:1") -extensions v3_ca
fi
cat intermediate-ca.pem root-ca.pem > intermediate-ca.crt

if [[ ! -e intermediate-ca.key ]]; then
  echo "Cannot generate leaf certificates without intermediate certificate private key"
  exit 1
fi

# Generate new leaf CAs for all the control-plane and etcd components
echo "Generating Kubernetes client leaf certificate authority EC key and certificate"
${OPENSSL} ecparam -name prime256v1 -genkey -out client-ca.key
${OPENSSL} req -x509 -new -nodes -CAkey intermediate-ca.key -CA intermediate-ca.crt -key client-ca.key -sha256 -days 3650 -out client-ca.pem -subj "/CN=${PRODUCT}-client-ca@${TIMESTAMP}" -config <(echo "${CONFIG}, pathlen:0") -extensions v3_ca
cat client-ca.pem intermediate-ca.pem root-ca.pem > client-ca.crt

echo "Generating Kubernetes server leaf certificate authority EC key and certificate"
${OPENSSL} ecparam -name prime256v1 -genkey -out server-ca.key
${OPENSSL} req -x509 -new -nodes -CAkey intermediate-ca.key -CA intermediate-ca.crt -key server-ca.key -sha256 -days 3650 -out server-ca.pem -subj "/CN=${PRODUCT}-server-ca@${TIMESTAMP}" -config <(echo "${CONFIG}, pathlen:0") -extensions v3_ca
cat server-ca.pem intermediate-ca.pem root-ca.pem > server-ca.crt

echo "Generating Kubernetes request-header leaf certificate authority EC key and certificate"
${OPENSSL} ecparam -name prime256v1 -genkey -out request-header-ca.key
${OPENSSL} req -x509 -new -nodes -CAkey intermediate-ca.key -CA intermediate-ca.crt -key request-header-ca.key -sha256 -days 3650 -out request-header-ca.pem -subj "/CN=${PRODUCT}-request-header-ca@${TIMESTAMP}" -config <(echo "${CONFIG}, pathlen:0") -extensions v3_ca
cat request-header-ca.pem intermediate-ca.pem root-ca.pem > request-header-ca.crt

echo "Generating etcd peer leaf certificate authority EC key and certificate"
${OPENSSL} ecparam -name prime256v1 -genkey -out etcd/peer-ca.key
${OPENSSL} req -x509 -new -nodes -CAkey intermediate-ca.key -CA intermediate-ca.crt -key etcd/peer-ca.key -sha256 -days 3650 -out etcd/peer-ca.pem -subj "/CN=etcd-peer-ca@${TIMESTAMP}" -config <(echo "${CONFIG}, pathlen:0") -extensions v3_ca
cat etcd/peer-ca.pem intermediate-ca.pem root-ca.pem > etcd/peer-ca.crt

echo "Generating etcd server leaf certificate authority EC key and certificate"
${OPENSSL} ecparam -name prime256v1 -genkey -out etcd/server-ca.key
${OPENSSL} req -x509 -new -nodes -CAkey intermediate-ca.key -CA intermediate-ca.crt -key etcd/server-ca.key -sha256 -days 3650 -out etcd/server-ca.pem -subj "/CN=etcd-server-ca@${TIMESTAMP}" -config <(echo "${CONFIG}, pathlen:0") -extensions v3_ca
cat etcd/server-ca.pem intermediate-ca.pem root-ca.pem > etcd/server-ca.crt

echo
echo "CA certificate generation complete. Required files are now present in: ${DATA_DIR}/server/tls"
echo "For security purposes, you should make a secure copy of the following files and remove them from cluster members:"
ls ${DATA_DIR}/server/tls/root-ca.* ${DATA_DIR}/server/tls/intermediate-ca.* | xargs -n1 echo -e "\t"
64 changes: 64 additions & 0 deletions docs/adrs/agent-join-token.md
@@ -0,0 +1,64 @@
# Support `kubeadm`-style Bootstrap Token Secrets

Date: 2022-12-20

## Status

Accepted

## Context

### K3s Token Types and Use

K3s currently supports two tokens that can be used to join nodes to the cluster:
* `--token`: This is the default token, and a random value is generated during initial cluster startup if not
specified by the user. This token is also used as the passphrase input to the PBKDF2 function used to generate
the encryption key for cluster bootstrap data. For this reason, all server nodes must use the same token value
once the cluster has been started, and the token value cannot be changed.
* `--agent-token`: By default, this is set to the same as the `--token` value. If set, this token can be used
to join new agents to the cluster, but not servers. This token value can be changed after the cluster has
been started, but doing so requires coordinating reconfiguration and restart of all of the servers in the
cluster.

Internally, these tokens are used as the password for HTTP Basic authentication to the K3s supervisor when the
agent bootstraps its configuration and certificates. Servers use a username of `server`, while agents
(including servers' local agents) use `node`. Once nodes join the cluster they also populate a node password
secret that prevents other nodes from using the same node name, but this is unrelated to the token.
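
As a rough illustration of the scheme described above, the following Go sketch builds a request that presents a join token as the HTTP Basic password, with the role (`server` or `node`) as the username. The helper name `newBootstrapRequest`, the URL, and the token value are placeholders invented for this example; they are not taken from the K3s source and no real supervisor route is shown.

```go
package main

import (
	"fmt"
	"net/http"
)

// newBootstrapRequest is a hypothetical helper. It builds a GET request that
// carries the role as the Basic auth username and the join token as the
// password. The URL is supplied by the caller and is only a placeholder here.
func newBootstrapRequest(url, username, token string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.SetBasicAuth(username, token)
	return req, nil
}

func main() {
	// Agents (including a server's local agent) use the "node" username.
	req, err := newBootstrapRequest("https://server.example:6443/placeholder", "node", "example-token")
	if err != nil {
		panic(err)
	}
	user, pass, ok := req.BasicAuth()
	fmt.Println(user, pass, ok)
}
```

Because the token travels as a password on every bootstrap request, anyone who obtains it can authenticate in the same way, which is the exposure the next section discusses.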

### Security Considerations

Users have requested the ability to generate single-use or limited-duration tokens that can be used to join
nodes to the cluster, but can be deleted or automatically expire in order to reduce the impact should the
token be compromised. Currently, compromise of the server token would require a complete rebuild of the
cluster in order to use a new token. Compromise of the agent token would require a coordinated restart of all
nodes in the cluster.

### Existing Work

`kubeadm` includes a `kubeadm token create` command that creates secrets of type
`bootstrap.kubernetes.io/token`, which is a core upstream type that is not restricted for use by kubeadm.

There are helpers for interacting with bootstrap token secrets in the `k8s.io/cluster-bootstrap` package, and
upstream Kubernetes includes two controllers (`tokencleaner` and `bootstrapsigner`) to support use of cluster
bootstrap secrets. The latter controller is not relevant for our use case, as it serves the same function as
existing K3s supervisor routes - making initial cluster CA certificates and a client kubeconfig available for
bootstrapping nodes. The [bootstrap-tokens](https://kubernetes.io/docs/reference/access-authn-authz/bootstrap-tokens/)
documentation can be referenced for more information.

## Decision

* K3s will allow joining agents to the cluster using bootstrap token secrets.
* K3s will NOT allow joining servers to the cluster using bootstrap token secrets.
* K3s will include a `k3s token` subcommand that allows for token create/list/delete operations, similar to
the functionality offered by `kubeadm`.
* K3s will enable the `tokencleaner` controller, in order to ensure that bootstrap token secrets are cleaned
up when their TTL expires.
* K3s agent bootstrap functionality will allow an agent to connect to the cluster using existing [Node
Authorization](https://kubernetes.io/docs/reference/access-authn-authz/node/) to authenticate to the
cluster during startup, even after its join token has been invalidated.
* K3s agent bootstrap functionality will NOT allow an agent to connect to the cluster if it does not have a valid
token, and its Node object has been deleted from the cluster.
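
The decisions above can be read as a small truth table. The Go sketch below restates them as a predicate; it is only a model of the bullets, not the K3s implementation, and the type `joinAttempt`, its fields, and `mayConnect` are names invented for this example.

```go
package main

import "fmt"

// joinAttempt is a hypothetical summary of what a connecting node presents.
type joinAttempt struct {
	server              bool // true when joining as a server, false for an agent
	staticTokenValid    bool // a --token or --agent-token value accepted for this role
	bootstrapTokenValid bool // an unexpired bootstrap token secret
	nodeAuthAvailable   bool // Node object still exists and node credentials are usable
}

// mayConnect models the ADR decisions: servers cannot use bootstrap token
// secrets, while agents may use a static token, a bootstrap token secret, or
// existing Node Authorization.
func mayConnect(a joinAttempt) bool {
	if a.server {
		return a.staticTokenValid
	}
	return a.staticTokenValid || a.bootstrapTokenValid || a.nodeAuthAvailable
}

func main() {
	// Agent whose bootstrap token has expired but whose Node object remains.
	fmt.Println(mayConnect(joinAttempt{nodeAuthAvailable: true}))
	// Server presenting only a bootstrap token secret.
	fmt.Println(mayConnect(joinAttempt{server: true, bootstrapTokenValid: true}))
	// Agent with no valid token and a deleted Node object.
	fmt.Println(mayConnect(joinAttempt{}))
}
```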

## Consequences

This will require additional documentation, CLI subcommands, and QA work to validate use of bootstrap token secret auth.
89 changes: 89 additions & 0 deletions docs/adrs/ca-cert-rotation.md
@@ -0,0 +1,89 @@
# Support CA Certificate Renewal / Rotation, Signing by External Root

Date: 2022-12-19

## Status

Accepted

## Context

On the first startup of a new cluster, K3s currently autogenerates a number of self-signed cluster CAs and keys:
* Cluster Server CA + Key (used to sign server certificates)
* Cluster Client CA + Key (used to sign client certificates)
* Request Header CA + Key (used to sign certificates for apiserver aggregation)
* etcd Peer CA + Key (used to sign certificates for authentication between etcd peer servers)
* etcd Client CA + Key (used to sign certificates for etcd clients, i.e. the apiserver)
* ServiceAccount Token Signing Key (used to sign ServiceAccount JSON Web Tokens)

These CAs are all self-signed, without any cross-signing or common root or intermediates, and are valid for 10
years. When any of these CA certificates expires, every certificate it has issued becomes invalid, causing a
significant outage to the cluster.

### Server CA Pinning

The Cluster Server CA is used in node bootstrapping. The full `K10` format token includes a SHA256 sum of the
Cluster Server CA file's on-disk PEM representation. Nodes that join the cluster using a full token perform a
set of checks when starting up:
1. Download the cluster server CA bundle from `/v1-k3s/cacert` on the server they are joining.
2. Validate that the hash of the bytes in the CA bundle matches the hash string following the `K10` prefix in the
token.
3. Validate that the certificate presented by the server they are joining can be validated using the roots and
intermediates present in the CA bundle.

Realistically, this hash should have instead been derived from the DER encoding of the root certificate in
that bundle, as PEM format allows for variable padding, line lengths, and so on. Only DER format is guaranteed
to be stable, and hashing only the root of the chain would have allowed for rotating or renewing intermediate
CAs without breaking trust between cluster nodes.

### Bootstrap Data Immutability

There is not currently any way to write new certificates to the datastore. The certificates and keys are
written to disk once on initial startup, and from there written to the cluster datastore. From that point on,
the files in the datastore are considered authoritative; replacing the files on disk will result in either
replacement, or error, depending on whether or not the files on disk are newer than those in the datastore.

The `secrets-encrypt` subcommand does currently mutate the bootstrap data, but it only touches the secrets
encryption configuration, not the CA certs or keys.

### Summary

For both of the above reasons (hash pinning, and lack of rewriteability) it is not currently possible to
renew or replace the cluster CA certs or keys.

### Additional Considerations

#### Aggressive Certificate Rotation

Some users (particularly government or financial customers) attempt to implement the guidance from [NIST SP 800-57
Part 1 Rev. 5](https://csrc.nist.gov/publications/detail/sp/800-57-part-1/rev-5/final). This document would
see users signing cluster CAs with a set of organizational root and intermediate certificates, and rotating
both the intermediate and cluster CA certificates and keys on at least a yearly basis.

#### ServiceAccount Signing Key Rolling Replacement

While the ServiceAccount signing key is not signed by any CA, rotation of the key must be done carefully so
as to avoid causing an outage. The apiserver and controller-manager must be updated to use a new key, while
still trusting the old key for a period of time. The old key can then be removed at a later date, once all
clients using tokens signed by the old key have received new tokens.

## Decision

* K3s will allow for use of CA certificates signed by an arbitrary set of external root/intermediate CAs.
* K3s will allow for non-disruptive[^1] renewal or replacement of the CA certificates and keys, if the cluster was
originally started using user-provided certificates signed by an external CA.
* K3s will allow for disruptive[^2] renewal or replacement of cluster CA certificates and keys, if the cluster was
originally started with autogenerated self-signed CAs.
* K3s will provide example tooling to allow users to generate cluster CA certificates and keys prior to initial
cluster startup, and provide tooling and process documentation to update the bootstrap data and prepare agents
to trust the new certificates (if necessary).

[^1]: Non-disruptive renewal requires no change to node configuration. The service only needs to be restarted.
[^2]: Disruptive renewal requires changes to the K3s CLI flags, configuration file, or environment variables
prior to restarting the service. Additionally, the cluster may experience a temporary outage while the
configuration change is being applied to all nodes, because cluster nodes temporarily do not share a common
root of trust.

## Consequences

This will require additional documentation, CLI subcommands, and QA work to validate the process steps.