2nd etcd node not communicating with 1st etcd node: cluster ID mismatch error #13453

prakashmirji · 2021-10-29T13:11:36Z

Etcd version: 3.5.0
Platform: SLES 15 SP2
Deployed as systemd

configuration:

systemctl cat etcd
# /etc/systemd/system/etcd.service
[Unit]
Description=etcd
Documentation=https://github.com/coreos
[Service]
Restart=on-failure
RestartSec=5
LimitNOFILE=40000
TimeoutStartSec=0
EnvironmentFile=/opt/ezkube/bootstrap/systemd/10-etcd.env
ExecStart=/usr/bin/etcd \
  --advertise-client-urls=https://${INTERNAL_IP}:${ETCD_PORT} \
  --cert-file=/etc/kubernetes/pki/etcd/server.crt \
  --client-cert-auth=true \
  --data-dir=/var/lib/etcd \
  --initial-advertise-peer-urls=https://${INTERNAL_IP}:${ETCD_PEER_PORT} \
  --initial-cluster=${INITIAL_CLUSTER} \
  --key-file=/etc/kubernetes/pki/etcd/server.key \
  --listen-client-urls=https://127.0.0.1:${ETCD_PORT},https://${INTERNAL_IP}:${ETCD_PORT} \
  --listen-peer-urls=https://${INTERNAL_IP}:${ETCD_PEER_PORT} \
  --name=${NAME} \
  --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt \
  --peer-client-cert-auth=true \
  --peer-key-file=/etc/kubernetes/pki/etcd/peer.key \
  --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt \
  --snapshot-count=${SNAPSHOT_COUNT} \
  --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt

[Install]
WantedBy=multi-user.target

etcd log

We see a lot of these log messages:

-- Logs begin at Wed 2021-10-27 07:20:37 PDT, end at Fri 2021-10-29 04:30:52 PDT. --
Oct 28 23:14:28 etcdnod1.net etcd[26374]: {"level":"warn","ts":"2021-10-28T23:14:28.128-0700","caller":"rafthttp/stream.go:653","msg":"request sent was ignored by remote peer due to cluster ID mismatch","remote-peer-id":"abc63b82495af4b1","remote-peer-cluster-id":"8c300cb900906703","local-member-id":"840c5a8fcf5a4b8e","local-member-cluster-id":"6599178285423ae9","error":"cluster ID mismatch"}
Oct 28 23:14:28 etcdnod1.net etcd[26374]: {"level":"warn","ts":"2021-10-28T23:14:28.231-0700","caller":"rafthttp/stream.go:653","msg":"request sent was ignored by remote peer due to cluster ID mismatch","remote-peer-id":"abc63b82495af4b1","remote-peer-cluster-id":"8c300cb900906703","local-member-id":"840c5a8fcf5a4b8e","local-member-cluster-id":"6599178285423ae9","error":"cluster ID mismatch"}
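
Each of these warnings carries both cluster IDs, so the mismatch can be confirmed directly from a log line. The following is a minimal sketch, assuming the JSON field names shown in the warnings above; `field` is a hypothetical helper, and the sample line is one of the warnings with the timestamp and caller fields dropped for brevity.

```shell
#!/bin/sh
# Sketch: pull the two cluster IDs out of an etcd "cluster ID mismatch" warning.
# field() is a hypothetical helper; it only handles simple quoted string values.

field() {
  # $1 = JSON key, $2 = log line; prints the string value stored under that key
  printf '%s\n' "$2" | sed -n 's/.*"'"$1"'":"\([^"]*\)".*/\1/p'
}

# Sample input: one warning from the log above (ts and caller fields omitted).
line='{"level":"warn","msg":"request sent was ignored by remote peer due to cluster ID mismatch","remote-peer-id":"abc63b82495af4b1","remote-peer-cluster-id":"8c300cb900906703","local-member-id":"840c5a8fcf5a4b8e","local-member-cluster-id":"6599178285423ae9","error":"cluster ID mismatch"}'

remote=$(field remote-peer-cluster-id "$line")
local_id=$(field local-member-cluster-id "$line")

# Two members of the same cluster would report the same cluster ID.
if [ "$remote" != "$local_id" ]; then
  echo "mismatch: local=$local_id remote=$remote"
fi
```

On a live node the sample line could be replaced with real input, for example the last matching line from `journalctl -u etcd -o cat`, assuming the service is named `etcd` as in the unit file above.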

Output of systemctl status etcd:

● etcd.service - etcd
   Loaded: loaded (/etc/systemd/system/etcd.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2021-10-29 02:27:12 PDT; 2h 8min ago
     Docs: https://github.com/coreos
 Main PID: 29931 (etcd)
    Tasks: 7
   CGroup: /system.slice/etcd.service
           └─29931 /usr/bin/etcd --advertise-client-urls=https://16.0.14.118:2379 --cert-file=/etc/kubernetes/pki/etcd/server.crt --client-cert-auth=true --data-dir=/var/lib/etcd --initial-advertise-peer-urls=https://16.0.14.118:2380 --initial-cluster=mip-bd-vm659.mip.storage.hpecorp.net=https://16.0.14.117:2380,mip-bd-vm660.mip.storage.hpecorp.net=https://16.0.14.118:2380 --key-file=/etc/kubernetes/pki/etcd/server.key --listen-client-urls=https://127.0.0.1:2379,https://16.0.14.118:2379 --listen-peer-urls=https://16.0.14.118:2380 --name=mip-bd-vm660.mip.storage.hpecorp.net --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt --peer-client-cert-auth=true --peer-key-file=/etc/kubernetes/pki/etcd/peer.key --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt --snapshot-count=10000 --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt

Oct 29 04:35:31 mip-bd-vm660.mip.storage.hpecorp.net etcd[29931]: {"level":"error","ts":"2021-10-29T04:35:31.382-0700","caller":"rafthttp/util.go:99","msg":"request sent was ignored due to cluster ID mismatch","remote-peer-id":"2cb93120384b98dc","remote-peer-cluster-id":"8c300cb900906703","local-member-cluster-id":"633c49ff49c16784","stacktrace":"go.etcd.io/etcd/server/v3/etcdserver/api/rafthttp.checkPostResponse\n\t/home/prow/go/src/github.hpe.com/hpe/ezkube/projects/etcd/etcd/server/etcdserver/api/rafthttp/util.go:99\ngo.etcd.io/etcd/server/v3/etcdserver/api/rafthttp.(*pipeline).post\n\t/home/prow/go/src/github.hpe.com/hpe/ezkube/projects/etcd/etcd/server/etcdserver/api/rafthttp/pipeline.go:163\ngo.etcd.io/etcd/server/v3/etcdserver/api/rafthttp.(*pipeline).handle\n\t/home/prow/go/src/github.hpe.com/hpe/ezkube/projects/etcd/etcd/server/etcdserver/api/rafthttp/pipeline.go:100"}
Oct 29 04:35:31 mip-bd-vm660.mip.storage.hpecorp.net etcd[29931]: {"level":"warn","ts":"2021-10-29T04:35:31.430-0700","caller":"rafthttp/stream.go:653","msg":"request sent was ignored by remote peer due to cluster ID mismatch","remote-peer-id":"2cb93120384b98dc","remote

We are planning to set up a 2nd etcd node and join it to the existing 1st etcd node. Our use case is to expand the etcd cluster. Any pointers on how to handle the above errors?

ahrtr · 2021-11-05T23:00:27Z

I see lots of people have asked this question, so it's worthwhile to give a summary.

Firstly, you need to understand how the cluster ID is generated. The workflow is depicted in the diagram.

Secondly, once you understand the above diagram/workflow, the flag "--initial-cluster-state" is the key point. If the member already has local data, the value of the flag does not matter. If there is no local data, as with a brand-new member, it does matter. Usually, when joining an existing cluster, you should set "--initial-cluster-state existing".
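
As an illustration of that flag in a join, here is a minimal sketch of adding a second member to a running cluster. The member names and URLs are placeholders (assumptions), not values from this issue, and the endpoint and TLS options that `etcdctl` would need are left out. With `DRY_RUN=1`, the default here, the script only prints what it would do.

```shell
#!/bin/sh
# Sketch: add a second member to a running etcd cluster.
# Names and URLs are placeholders (assumptions), not values from this issue.
# DRY_RUN=1 (the default here) only prints the commands.

DRY_RUN=${DRY_RUN:-1}
NEW_NAME=etcd-2
NEW_PEER_URL=https://10.0.0.2:2380
INITIAL_CLUSTER="etcd-1=https://10.0.0.1:2380,${NEW_NAME}=${NEW_PEER_URL}"

run() {
  # Print the command in dry-run mode, execute it otherwise.
  if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# Step 1, against a healthy existing member: register the new member
# before it starts (etcdctl endpoint and TLS flags are omitted here).
run etcdctl member add "$NEW_NAME" --peer-urls="$NEW_PEER_URL"

# Step 2, on the new node: start from an empty data dir and tell etcd to
# join rather than bootstrap. These flags go alongside the node's other flags.
join_flags() {
  echo "--name=${NEW_NAME} --initial-cluster=${INITIAL_CLUSTER} --initial-cluster-state=existing"
}
join_flags
```

If the new node was already started once without this flag, it has probably bootstrapped a cluster of its own and stored that cluster's ID in its data dir. Since the flag is ignored when local data exists, that stale data dir would likely need to be cleared before the node can join.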

stale · 2022-02-06T12:15:58Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

serathius · 2022-02-08T11:48:53Z

@ahrtr do you think it would be worth documenting --initial-cluster-state so it will be easier to understand for users?

ahrtr · 2022-02-08T23:27:09Z

> @ahrtr do you think it would be worth documenting --initial-cluster-state so it will be easier to understand for users?

Yes, it makes sense, but I am not sure where the best place to document this is. Perhaps I can write a blog post? There is already a FAQ item, What does the etcd warning “request ignored (cluster ID mismatch)” mean?, so we could add the blog post link to that FAQ item. What do you think?

hmilkovi · 2022-02-15T14:44:50Z

I also have the same issue with Docker when I try to bootstrap a new cluster:

{"level":"warn","ts":"2022-02-15T14:43:59.005Z","caller":"rafthttp/stream.go:653","msg":"request sent was ignored by remote peer due to cluster ID mismatch","remote-peer-id":"2c506e52bc451d34","remote-peer-cluster-id":"dba92b3a17cfe072","local-member-id":"3e2342fa21204127","local-member-cluster-id":"905641018400954b","error":"cluster ID mismatch"}

ahrtr · 2022-02-18T21:45:12Z

@hmilkovi Please provide detailed steps to reproduce.

ahrtr · 2022-03-11T09:10:16Z

FYI. https://github.com/ahrtr/etcd-issues/blob/master/docs/cluster_id_mismatch.md

stale · 2022-06-12T23:12:27Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

stale bot added the stale label Feb 6, 2022

stale bot removed the stale label Feb 8, 2022

stale bot added the stale label Jun 12, 2022

stale bot closed this as completed Jul 10, 2022
