What happened:
A specific gardener e2e kind test fails frequently: Shoot Tests Hibernated Shoot [It] Create, Migrate and Delete [Shoot, control-plane-migration, hibernated]
The creation, migration, and hibernation steps succeed. To delete the migrated shoot, which is hibernated at that point, the etcd cluster must first be woken up. At this stage the etcd cluster does not become ready.
In one such occurrence we see the following logs in etcd-events-2 (backup-restore container); for the complete logs, see etcd-events-2-backup-restore.log:
```
2025-02-17T12:45:52.969873914Z stderr F 2025-02-17 12:45:52.968607 E | rafthttp: request sent was ignored (cluster ID mismatch: peer[6fdaf30df04c0245]=4ffa550a92b87675, local=39b1e34c77b1db7a)
2025-02-17T12:45:52.970531124Z stderr F 2025-02-17 12:45:52.970317 E | rafthttp: request sent was ignored (cluster ID mismatch: peer[6fdaf30df04c0245]=4ffa550a92b87675, local=39b1e34c77b1db7a)
2025-02-17T12:45:53.055124837Z stderr F 2025-02-17 12:45:53.054945 E | rafthttp: request sent was ignored (cluster ID mismatch: peer[6fdaf30df04c0245]=4ffa550a92b87675, local=39b1e34c77b1db7a)
2025-02-17T12:45:53.062374513Z stderr F 2025-02-17 12:45:53.062106 E | rafthttp: request sent was ignored (cluster ID mismatch: peer[6fdaf30df04c0245]=4ffa550a92b87675, local=39b1e34c77b1db7a)
2025-02-17T12:45:53.153435731Z stderr F 2025-02-17 12:45:53.153314 E | rafthttp: request sent was ignored (cluster ID mismatch: peer[6fdaf30df04c0245]=4ffa550a92b87675, local=39b1e34c77b1db7a)
2025-02-17T12:45:53.160917167Z stderr F 2025-02-17 12:45:53.160807 E | rafthttp: request sent was ignored (cluster ID mismatch: peer[6fdaf30df04c0245]=4ffa550a92b87675, local=39b1e34c77b1db7a)
2025-02-17T12:45:53.251792044Z stderr F 2025-02-17 12:45:53.251680 E | rafthttp: request sent was ignored (cluster ID mismatch: peer[6fdaf30df04c0245]=4ffa550a92b87675, local=39b1e34c77b1db7a)
2025-02-17T12:45:53.264667024Z stderr F 2025-02-17 12:45:53.264552 E | rafthttp: request sent was ignored (cluster ID mismatch: peer[6fdaf30df04c0245]=4ffa550a92b87675, local=39b1e34c77b1db7a)
```
You would typically see a cluster ID mismatch in the three scenarios that are documented here.
Prior to starting the embedded etcd process, initialization is triggered by etcd-wrapper. Once initialization succeeds, etcd-wrapper requests the etcd config, which etcd-backup-restore computes here. One of the key parameters in this config is initial-cluster-state, which is determined here to distinguish whether this member bootstraps/joins a new cluster or joins an existing cluster.
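For context, the config served to etcd-wrapper is a regular etcd configuration file. A heavily abbreviated, hypothetical example (all names and URLs are made up; only fields relevant to this issue are shown) might look like this:
```yaml
# Abbreviated, hypothetical etcd config as served to etcd-wrapper.
name: etcd-events-2
data-dir: /var/etcd/data/new.etcd
initial-cluster: etcd-events-0=https://etcd-events-0.etcd-events-peer:2380,etcd-events-1=https://etcd-events-1.etcd-events-peer:2380,etcd-events-2=https://etcd-events-2.etcd-events-peer:2380
# The field this issue is about: "new" bootstraps a fresh cluster (with a
# fresh cluster ID); "existing" joins the cluster the other members formed.
initial-cluster-state: existing
```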
If the member-list API call fails for any reason (see IsLearnerPresent), then this function correctly returns an error, but the error is swallowed by the calling function (see here), which then assumes initial-cluster-state=new. This is intentional for the 0->3 replicas bootstrap case: while bootstrapping a new cluster, etcd Member API calls can never succeed, so even on error the config must be served with initial-cluster-state=new to let the bootstrap succeed.
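To make the swallowing concrete, here is a minimal Go sketch of this decision logic, assuming a plain etcd clientv3 connection. isLearnerPresent and determineClusterState are illustrative stand-ins, not the actual etcd-backup-restore functions:
```go
package bootstrap

import (
	"context"

	clientv3 "go.etcd.io/etcd/client/v3"
)

// isLearnerPresent is an illustrative stand-in for IsLearnerPresent: it lists
// the cluster members and reports whether any of them is a learner.
func isLearnerPresent(ctx context.Context, cli *clientv3.Client) (bool, error) {
	resp, err := cli.MemberList(ctx)
	if err != nil {
		// Fails e.g. on transient quorum loss (VPA eviction), or during a
		// genuine 0->3 bootstrap where no cluster exists yet.
		return false, err
	}
	for _, m := range resp.Members {
		if m.IsLearner {
			return true, nil
		}
	}
	return false, nil
}

// determineClusterState sketches the problematic fallback: any error from the
// member-list call collapses into initial-cluster-state=new.
func determineClusterState(ctx context.Context, cli *clientv3.Client) string {
	learnerPresent, err := isLearnerPresent(ctx, cli)
	if err != nil {
		// The error is swallowed here. That is correct for the 0->3 bootstrap
		// (the Member API cannot succeed before the cluster exists), but wrong
		// for a learner that was just added during single-member restoration.
		return "new"
	}
	if learnerPresent {
		return "existing"
	}
	return "new"
}
```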
However, this code flow has a negative consequence as well. Consider the following sequence:
1. The data directory of one of the etcd members gets corrupted while bringing up the cluster from 0->3.
2. etcd-backup-restore validates the data directory and finds it corrupt. It triggers a single-member restoration (see this for more information).
3. As part of the single-member restoration, it adds this member as a learner and then triggers the initialization. Once initialization is successful, it serves an etcd config.
4. If the etcd Member API call made while computing the initial-cluster-state fails (due to a transient quorum loss, possible e.g. through VPA eviction), the code assumes initial-cluster-state=new. For a learner this is not the correct initial-cluster-state.
5. With initial-cluster-state=new, this member bootstraps a fresh cluster and thereby gets a cluster ID that will never match the one known by the other 2 members. Once it dials the other 2 members, they reject its requests with the cluster ID mismatch response seen in the logs above. A sketch of a safer computation follows this list.
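Continuing the hypothetical sketch above, one direction for a fix would be to never let a member that was just added as a learner fall back to new. The addedAsLearner flag below is an assumed marker (e.g. persisted by etcd-backup-restore during single-member restoration), not an existing field:
```go
// determineClusterStateSafely: a member added as a learner must never fall
// back to "new"; only the genuine bootstrap path may interpret an error from
// the Member API as "new". This is only directional, not the actual fix.
func determineClusterStateSafely(ctx context.Context, cli *clientv3.Client, addedAsLearner bool) string {
	if addedAsLearner {
		// A learner always joins an existing cluster, regardless of whether
		// the Member API is reachable at this moment.
		return "existing"
	}
	if _, err := cli.MemberList(ctx); err != nil {
		// Member API unreachable and no learner marker: assume bootstrap.
		return "new"
	}
	// Members are reachable: this member joins the existing cluster.
	return "existing"
}
```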
What you expected to happen: initial-cluster-state should always be computed correctly.
To reproduce this issue locally, please follow these steps:
1. Start an etcd cluster with 3 members.
2. Remove one of the etcd members from the cluster using the API call: etcdctl member remove <memberID>
3. Add a learner member to the same etcd cluster.
4. Start the learner, but with initial-cluster-state set to new instead of existing (see the command sketch below).
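The commands for steps 2-4 could look like the following sketch (endpoints, member names, ports, and URLs are placeholders):
```sh
# Step 2: remove one member from the healthy 3-member cluster
etcdctl --endpoints=http://127.0.0.1:2379 member remove <memberID>

# Step 3: add it back as a learner
etcdctl --endpoints=http://127.0.0.1:2379 member add member-2 \
  --learner --peer-urls=http://127.0.0.1:22380

# Step 4: start the learner with the WRONG initial-cluster-state to trigger
# the mismatch ("existing" would be the correct value here)
etcd --name member-2 \
  --listen-peer-urls http://127.0.0.1:22380 \
  --initial-advertise-peer-urls http://127.0.0.1:22380 \
  --listen-client-urls http://127.0.0.1:22379 \
  --advertise-client-urls http://127.0.0.1:22379 \
  --initial-cluster "member-0=http://127.0.0.1:2380,member-1=http://127.0.0.1:12380,member-2=http://127.0.0.1:22380" \
  --initial-cluster-state new
```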
After starting the member this way, you will see logs like these:
{"level":"warn","ts":"2025-02-21T14:52:29.42254+0530","caller":"rafthttp/stream.go:653","msg":"request sent was ignored by remote peer due to cluster ID mismatch","remote-peer-id":"3e9662d4914e445d","remote-peer-cluster-id":"7fa825e3d560ad6f","local-member-id":"91bc3c398fb3c146","local-member-cluster-id":"6e9fdbc6edbe620","error":"cluster ID mismatch"}
{"level":"warn","ts":"2025-02-21T14:52:29.495654+0530","caller":"rafthttp/stream.go:653","msg":"request sent was ignored by remote peer due to cluster ID mismatch","remote-peer-id":"3e9662d4914e445d","remote-peer-cluster-id":"7fa825e3d560ad6f","local-member-id":"91bc3c398fb3c146","local-member-cluster-id":"6e9fdbc6edbe620","error":"cluster ID mismatch"}
{"level":"warn","ts":"2025-02-21T14:52:29.495668+0530","caller":"rafthttp/stream.go:653","msg":"request sent was ignored by remote peer due to cluster ID mismatch","remote-peer-id":"3e9662d4914e445d","remote-peer-cluster-id":"7fa825e3d560ad6f","local-member-id":"91bc3c398fb3c146","local-member-cluster-id":"6e9fdbc6edbe620","error":"cluster ID mismatch"}
How to categorize this issue?
/area control-plane
/kind bug