Detect SA token rotation
Fixes linkerd/linkerd2#12573

## Problem

When deployed, the linkerd-cni pod gets its service account token mounted automatically by k8s:
```yaml
  - name: kube-api-access-729gv
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
```
According to this, the token is set to expire after an hour.
When the linkerd-cni pod starts, it deploys the file `ZZZ-linkerd-cni-kubeconfig` into the **host** file system.
That config contains the token sourced from `/var/run/secrets/kubernetes.io/serviceaccount` (mounted by the pod).
When the token gets rotated after an hour, that token file is updated but `ZZZ-linkerd-cni-kubeconfig` is not updated.
The `linkerd-cni` binary uses that token to connect to the kube-api, so an outdated token should prevent it from functioning properly, which would manifest as new pods in the data plane not being able to acquire a proper network config.
However, that failure isn't usually observed, except for the cases pointed out in linkerd/linkerd2#12573. The reason is that the token's actual lifetime is one year, due to kube-api's `--service-account-extend-token-expiration` [flag](https://kubernetes.io/docs/reference/command-line-tools-reference/kube-apiserver/#options), which is usually set to `true` to avoid breaking the many instances not yet adapted to use tokens with short expirations:

> Turns on projected service account expiration extension during token generation, which helps safe transition from legacy token to bound service account token feature. If this flag is enabled, admission injected tokens would be extended up to 1 year to prevent unexpected failure during transition, ignoring value of service-account-max-token-expiration.
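
To see which lifetime a given token actually carries, its claims can be decoded locally. The following is a minimal sketch, assuming the token is a standard JWT with numeric top-level `iat` and `exp` claims; `jwt_lifetime` is a hypothetical helper that is not part of `install-cni.sh`, and the `sed`-based claim extraction is deliberately naive.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of install-cni.sh): prints exp - iat, in
# seconds, for a JWT passed as the first argument.
jwt_lifetime() {
  local token=$1 payload json exp iat
  # The claims are the second dot-separated segment, base64url-encoded.
  payload=$(printf '%s' "$token" | cut -d. -f2 | tr '_-' '/+')
  # base64url omits the '=' padding; restore it before decoding.
  while [ $(( ${#payload} % 4 )) -ne 0 ]; do payload="${payload}="; done
  json=$(printf '%s' "$payload" | base64 -d)
  # Naive extraction, assuming numeric top-level "exp" and "iat" claims.
  exp=$(printf '%s' "$json" | sed -n 's/.*"exp"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p')
  iat=$(printf '%s' "$json" | sed -n 's/.*"iat"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p')
  echo $(( exp - iat ))
}
```

Inside the linkerd-cni pod this could be called as `jwt_lifetime "$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)"`. Going by the flag description above, a cluster with the extension enabled would be expected to report a value close to one year rather than 3607.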

## Repro

### AKS

The issue currently affects AKS clusters using OIDC keys. To reproduce, create a new cluster in AKS, making sure "Enable OIDC" and "Workload Identity" are ticked in the UI.

Then install the linkerd-cni plugin, labelling the linkerd-cni DaemonSet so that its ServiceAccount token is provided via OIDC:
```
linkerd install-cni --set-string "podLabels.azure\.workload\.identity/use"="true" | kubectl apply -f -
```

Then install linkerd with cni enabled, along with an injected instance of emojivoto.

The secret token is rotated after an hour, but the old one remains valid for 24h. Manually rotating the key as detailed in the [docs](https://learn.microsoft.com/en-us/azure/aks/use-oidc-issuer#rotate-the-oidc-key) should invalidate the old key.

After that, bouncing any emojivoto pod fails, and the following event is raised:

```
Warning  FailedCreatePodSandBox  15s   kubelet            Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "8121291446642b272cea9ee5f083958a37bab0dd7060c4d9c06bb05fecf911d2": plugin type="linkerd-cni" name="linkerd-cni" failed (add): Unauthorized
```

## Fix

This change adds a new function `monitor_service_account_token()` that monitors the rollout of the token file, which is a symlink whose target changes as a new token is deployed. When it detects a new token file, this function calls the new `create_kubeconfig()` function.
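
The symlink behaviour can be reproduced without a cluster. The sketch below assumes the layout commonly used for projected volumes: a hidden versioned directory holds the real file, a `..data` symlink is swapped to point at the newest directory, and `token` resolves through `..data`. The names `rollout_token` and `is_token_rollout` are hypothetical, the temporary directory stands in for the serviceaccount mount, and the check mirrors the condition used in `monitor_service_account_token()`.

```shell
#!/usr/bin/env bash
set -eu

# Temporary stand-in for /var/run/secrets/kubernetes.io/serviceaccount.
sa_dir=$(mktemp -d)

# Hypothetical helper: publish a token the way the assumed layout does.
rollout_token() {
  local version=$1 content=$2
  mkdir "${sa_dir}/..${version}"
  printf '%s' "$content" > "${sa_dir}/..${version}/token"
  # Create the new symlink under a temporary name, then rename it over
  # "..data" so that readers never observe a missing link (GNU mv -T).
  ln -s "..${version}" "${sa_dir}/..data_tmp"
  mv -T "${sa_dir}/..data_tmp" "${sa_dir}/..data"
}

# Same test as in monitor_service_account_token(): react to a "token" file,
# or to a directory that contains one.
is_token_rollout() {
  local target
  target=$(realpath "$1")
  [[ (-f "$target" && "${target##*/}" == "token") || (-d "$target" && -e "$target/token") ]]
}

rollout_token v1 first-token
ln -s ..data/token "${sa_dir}/token"
before=$(realpath "${sa_dir}/token")

rollout_token v2 second-token
after=$(realpath "${sa_dir}/token")

echo "token resolved to ${before} before the rollout"
echo "token resolves to ${after} after the rollout"
```

The `token` path itself never changes; only its resolved target does. This is why the watcher resolves each created entry with `realpath` before deciding whether to regenerate the kubeconfig.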

This change also removes the existing logic around the DELETE event, which is a leftover from previous changes and is now a no-op.

Also, as detailed in linkerd/linkerd2#13407, the ServiceAccount token has been removed from the cni config template because it's not used, simplifying things as we can regenerate the kubeconfig file without having to touch the cni config file.

Finally, the file `linkerd-cni.conf.default` has been removed as it is not used.

## Test

Same as with the repro above, but use the cni-plugin image that contains the fix:

```
linkerd install-cni --set-string "podLabels.azure\.workload\.identity/use"="true" --set image.name="ghcr.io/alpeb/cni-plugin" --set image.version="v1.5.3" | kubectl apply -f -
```

After an hour, when the token gets rotated, you should see the event in the linkerd-cni pod logs.
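
Besides reading the logs, the result can be checked by comparing the token embedded in the generated kubeconfig with the one currently mounted in the pod. This is a rough sketch, assuming the kubeconfig stores the credential on a single unquoted `token:` line (the users section is not visible in the diff below); `kubeconfig_token_matches` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the first "token:" entry in the
# kubeconfig equals the content of the given token file.
kubeconfig_token_matches() {
  local kubeconfig=$1 token_file=$2 embedded
  # Assumes an unquoted, single-line "token: <value>" entry.
  embedded=$(sed -n 's/^[[:space:]]*token:[[:space:]]*//p' "$kubeconfig" | head -n 1)
  [ -n "$embedded" ] && [ "$embedded" = "$(cat "$token_file")" ]
}
```

From inside the linkerd-cni container, the two arguments would be `"${CONTAINER_MOUNT_PREFIX}${DEST_CNI_NET_DIR}/${KUBECONFIG_FILE_NAME}"` and `"${SERVICEACCOUNT_PATH}/token"`, using the variables defined in `install-cni.sh`. Before the fix the check would be expected to fail once the token has rotated; with the fix it should succeed again shortly after each rotation.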
alpeb committed Nov 29, 2024
1 parent 30889be commit 765591c
Showing 3 changed files with 75 additions and 101 deletions.
1 change: 0 additions & 1 deletion Dockerfile-cni-plugin
@@ -48,7 +48,6 @@ COPY --from=go /go/bin/linkerd-cni /opt/cni/bin/
COPY --from=cni-repair-controller /build/linkerd-cni-repair-controller /usr/lib/linkerd/
COPY LICENSE .
COPY cni-plugin/deployment/scripts/install-cni.sh .
COPY cni-plugin/deployment/linkerd-cni.conf.default .
COPY cni-plugin/deployment/scripts/filter.jq .
ENV PATH=/linkerd:/opt/cni/bin:$PATH
CMD ["install-cni.sh"]
24 changes: 0 additions & 24 deletions cni-plugin/deployment/linkerd-cni.conf.default

This file was deleted.

151 changes: 75 additions & 76 deletions cni-plugin/deployment/scripts/install-cni.sh
@@ -56,6 +56,7 @@ HOST_CNI_NET="${CONTAINER_MOUNT_PREFIX}${DEST_CNI_NET_DIR}"
# Location of legacy "interface mode" file, to be automatically deleted
DEFAULT_CNI_CONF_PATH="${HOST_CNI_NET}/01-linkerd-cni.conf"
KUBECONFIG_FILE_NAME=${KUBECONFIG_FILE_NAME:-ZZZ-linkerd-cni-kubeconfig}
SERVICEACCOUNT_PATH=/var/run/secrets/kubernetes.io/serviceaccount

############################
### Function definitions ###
@@ -119,56 +120,32 @@ install_cni_bin() {
log "Wrote linkerd CNI binaries to ${dir}"
}

create_cni_conf() {
# Create temp configuration and kubeconfig files
#
TMP_CONF='/tmp/linkerd-cni.conf.default'
# If specified, overwrite the network configuration file.
CNI_NETWORK_CONFIG_FILE="${CNI_NETWORK_CONFIG_FILE:-}"
CNI_NETWORK_CONFIG="${CNI_NETWORK_CONFIG:-}"
create_kubeconfig() {
KUBE_CA_FILE=${KUBE_CA_FILE:-${SERVICEACCOUNT_PATH}/ca.crt}
SKIP_TLS_VERIFY=${SKIP_TLS_VERIFY:-false}
SERVICEACCOUNT_TOKEN=$(cat ${SERVICEACCOUNT_PATH}/token)

# If the CNI Network Config has been overwritten, then use template from file
if [ -e "${CNI_NETWORK_CONFIG_FILE}" ]; then
log "Using CNI config template from ${CNI_NETWORK_CONFIG_FILE}."
cp "${CNI_NETWORK_CONFIG_FILE}" "${TMP_CONF}"
elif [ "${CNI_NETWORK_CONFIG}" ]; then
log 'Using CNI config template from CNI_NETWORK_CONFIG environment variable.'
cat >"${TMP_CONF}" <<EOF
${CNI_NETWORK_CONFIG}
EOF
# Check if we're not running as a k8s pod.
if [[ ! -f "${SERVICEACCOUNT_PATH}/token" ]]; then
return
fi

SERVICE_ACCOUNT_PATH=/var/run/secrets/kubernetes.io/serviceaccount
KUBE_CA_FILE=${KUBE_CA_FILE:-${SERVICE_ACCOUNT_PATH}/ca.crt}
SKIP_TLS_VERIFY=${SKIP_TLS_VERIFY:-false}
# Pull out service account token.
SERVICEACCOUNT_TOKEN=$(cat ${SERVICE_ACCOUNT_PATH}/token)

# Check if we're running as a k8s pod.
# The check will assert whether token exists and is a regular file
if [ -f "${SERVICE_ACCOUNT_PATH}/token" ]; then
# We're running as a k8d pod - expect some variables.
# If the variables are null, exit
if [ -z "${KUBERNETES_SERVICE_HOST}" ]; then
log 'KUBERNETES_SERVICE_HOST not set'; exit 1;
fi
if [ -z "${KUBERNETES_SERVICE_PORT}" ]; then
log 'KUBERNETES_SERVICE_PORT not set'; exit 1;
fi
if [ -z "${KUBERNETES_SERVICE_HOST}" ]; then
log 'KUBERNETES_SERVICE_HOST not set'; exit 1;
fi
if [ -z "${KUBERNETES_SERVICE_PORT}" ]; then
log 'KUBERNETES_SERVICE_PORT not set'; exit 1;
fi

if [ "${SKIP_TLS_VERIFY}" = 'true' ]; then
TLS_CFG='insecure-skip-tls-verify: true'
elif [ -f "${KUBE_CA_FILE}" ]; then
TLS_CFG="certificate-authority-data: $(base64 "${KUBE_CA_FILE}" | tr -d '\n')"
fi
if [ "${SKIP_TLS_VERIFY}" = 'true' ]; then
TLS_CFG='insecure-skip-tls-verify: true'
elif [ -f "${KUBE_CA_FILE}" ]; then
TLS_CFG="certificate-authority-data: $(base64 "${KUBE_CA_FILE}" | tr -d '\n')"
fi

# Write a kubeconfig file for the CNI plugin. Do this
# to skip TLS verification for now. We should eventually support
# writing more complete kubeconfig files. This is only used
# if the provided CNI network config references it.
touch "${CONTAINER_MOUNT_PREFIX}${DEST_CNI_NET_DIR}/${KUBECONFIG_FILE_NAME}"
chmod "${KUBECONFIG_MODE:-600}" "${CONTAINER_MOUNT_PREFIX}${DEST_CNI_NET_DIR}/${KUBECONFIG_FILE_NAME}"
cat > "${CONTAINER_MOUNT_PREFIX}${DEST_CNI_NET_DIR}/${KUBECONFIG_FILE_NAME}" <<EOF
touch "${CONTAINER_MOUNT_PREFIX}${DEST_CNI_NET_DIR}/${KUBECONFIG_FILE_NAME}"
chmod "${KUBECONFIG_MODE:-600}" "${CONTAINER_MOUNT_PREFIX}${DEST_CNI_NET_DIR}/${KUBECONFIG_FILE_NAME}"
cat > "${CONTAINER_MOUNT_PREFIX}${DEST_CNI_NET_DIR}/${KUBECONFIG_FILE_NAME}" <<EOF
# Kubeconfig file for linkerd CNI plugin.
apiVersion: v1
kind: Config
@@ -188,31 +165,36 @@ contexts:
user: linkerd-cni
current-context: linkerd-cni-context
EOF
}

fi
create_cni_conf() {
# Create temp configuration and kubeconfig files
#
TMP_CONF='/tmp/linkerd-cni.conf.default'
# If specified, overwrite the network configuration file.
CNI_NETWORK_CONFIG_FILE="${CNI_NETWORK_CONFIG_FILE:-}"
CNI_NETWORK_CONFIG="${CNI_NETWORK_CONFIG:-}"

# Insert any of the supported "auto" parameters.
grep '__KUBERNETES_SERVICE_HOST__' ${TMP_CONF} && sed -i s/__KUBERNETES_SERVICE_HOST__/"${KUBERNETES_SERVICE_HOST}"/g ${TMP_CONF}
grep '__KUBERNETES_SERVICE_PORT__' ${TMP_CONF} && sed -i s/__KUBERNETES_SERVICE_PORT__/"${KUBERNETES_SERVICE_PORT}"/g ${TMP_CONF}
# Check in container
sed -i s/__KUBERNETES_NODE_NAME__/"${KUBERNETES_NODE_NAME:-$(hostname)}"/g ${TMP_CONF}
sed -i s/__KUBECONFIG_FILENAME__/"${KUBECONFIG_FILE_NAME}"/g ${TMP_CONF}
sed -i s/__CNI_MTU__/"${CNI_MTU:-1500}"/g ${TMP_CONF}
# If the CNI Network Config has been overwritten, then use template from file
if [ -e "${CNI_NETWORK_CONFIG_FILE}" ]; then
log "Using CNI config template from ${CNI_NETWORK_CONFIG_FILE}."
cp "${CNI_NETWORK_CONFIG_FILE}" "${TMP_CONF}"
elif [ "${CNI_NETWORK_CONFIG}" ]; then
log 'Using CNI config template from CNI_NETWORK_CONFIG environment variable.'
cat >"${TMP_CONF}" <<EOF
${CNI_NETWORK_CONFIG}
EOF
fi

# Use alternative command character "~", since these include a "/".
sed -i s~__KUBECONFIG_FILEPATH__~"${DEST_CNI_NET_DIR}/${KUBECONFIG_FILE_NAME}"~g ${TMP_CONF}

# Log the config file before inserting service account token.
# This way auth token is not visible in the logs.
log "CNI config: $(cat ${TMP_CONF})"

sed -i s/__SERVICEACCOUNT_TOKEN__/"${SERVICEACCOUNT_TOKEN:-}"/g ${TMP_CONF}
}

install_cni_conf() {
local cni_conf_path=$1

create_cni_conf

local tmp_data=''
local conf_data=''
if [ -e "${cni_conf_path}" ]; then
@@ -257,14 +239,7 @@ sync() {

local config_file_count
local new_sha
if [ "$ev" = 'DELETE' ]; then
# When the event type is 'DELETE', we check to see if there are any `*conf` or `*conflist`
# files on the host's filesystem.
config_file_count=$(find "${HOST_CNI_NET}" -maxdepth 1 -type f \( -iname '*conflist' -o -iname '*conf' \) | sort | wc -l)
if [ "$config_file_count" -eq 0 ]; then
log "No active CNI configuration file found after $ev event"
fi
elif [ "$ev" = 'CREATE' ] || [ "$ev" = 'MOVED_TO' ] || [ "$ev" = 'MODIFY' ]; then
if [ "$ev" = 'CREATE' ] || [ "$ev" = 'MOVED_TO' ] || [ "$ev" = 'MODIFY' ]; then
# When the event type is 'CREATE', 'MOVED_TO' or 'MODIFY', we check the
# previously observed SHA (updated with each file watch) and compare it
# against the new file's SHA. If they differ, it means something has
@@ -273,7 +248,9 @@ sync() {
if [ "$new_sha" != "$prev_sha" ]; then
# Create but don't rm old one since we don't know if this will be configured
# to run as _the_ cni plugin.
log "New file [$filename] detected; re-installing"
log "New/changed file [$filename] detected; re-installing"
create_kubeconfig
create_cni_conf
install_cni_conf "$filepath"
else
# If the SHA hasn't changed or we get an unrecognised event, ignore it.
@@ -285,22 +262,40 @@ sync() {
fi
}
# Monitor will start a watch on host's CNI config directory
monitor() {
inotifywait -m "${HOST_CNI_NET}" -e create,delete,moved_to,modify |
# monitor_cni_config starts a watch on the host's CNI config directory
monitor_cni_config() {
inotifywait -m "${HOST_CNI_NET}" -e create,moved_to,modify |
while read -r directory action filename; do
if [[ "$filename" =~ .*.(conflist|conf)$ ]]; then
log "Detected change in $directory: $action $filename"
sync "$filename" "$action" "$cni_conf_sha"
# When file exists (i.e we didn't deal with a DELETE ev)
# then calculate its sha to be used the next turn.
if [[ -e "$directory/$filename" && "$action" != 'DELETE' ]]; then
# calculate file SHA to use in the next iteration
if [[ -e "$directory/$filename" ]]; then
cni_conf_sha="$(sha256sum "$directory/$filename" | while read -r s _; do echo "$s"; done)"
fi
fi
done
}
# Kubernetes rolls out serviceaccount tokens by creating new directories
# containing a new token file and re-creating the
# /var/run/secrets/kubernetes.io/serviceaccount/token symlink pointing to it.
# This function listens to creation events under the serviceaccount directory,
# only reacting to direct creation of a "token" file, or creation of
# directories containing a "token" file.
monitor_service_account_token() {
inotifywait -m "${SERVICEACCOUNT_PATH}" -e create |
while read -r directory _ filename; do
target=$(realpath "$directory/$filename")
if [[ (-f "$target" && "${target##*/}" == "token") || (-d "$target" && -e "$target/token") ]]; then
log "Detected creation of file in $directory: $filename; recreating kubeconfig file"
create_kubeconfig
else
log "Detected creation of file in $directory: $filename; ignoring"
fi
done
}
log() {
printf '[%s] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$1"
}
@@ -327,6 +322,8 @@ else
find "${HOST_CNI_NET}" -maxdepth 1 -type f \( -iname '*conflist' -o -iname '*conf' \) -print0 |
while read -r -d $'\0' file; do
log "Installing CNI configuration for $file"
create_kubeconfig
create_cni_conf
install_cni_conf "$file"
done
fi
@@ -349,5 +346,7 @@ fi
# builtin, the reception of a signal for which a trap has been set will cause
# the wait builtin to return immediately with an exit status greater than 128,
# immediately after which the trap is executed."
monitor &
wait $!
monitor_cni_config &
monitor_service_account_token &
# uses -n so that we exit when the first background job exits (when there's an error)
wait -n
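
The effect of `wait -n` with two background watchers can be seen in isolation. The sketch below requires bash 4.3 or later; the two background jobs are stand-ins for `monitor_cni_config` and `monitor_service_account_token`, not the real functions.

```shell
#!/usr/bin/env bash
# Stand-in for a healthy watcher that keeps running.
sleep 2 &
healthy_pid=$!

# Stand-in for a watcher that fails shortly after starting.
( sleep 0.2; exit 3 ) &

# wait -n returns as soon as any one background job finishes, and its
# exit status is that job's status.
status=0
wait -n || status=$?
echo "first job to finish exited with status ${status}"

# Clean up the job that is still running.
kill "$healthy_pid" 2>/dev/null || true
```

With a plain `wait $!`, the script would only notice the exit of the most recently started job, so a failure in the other watcher would go undetected.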
