Etcd (3.0.4) doesn't start after system reboot. #6612

tomasramanauskas · 2016-10-08T11:04:22Z

Bug reporting

Hello, we had an outage that caused our etcd cluster to reboot. The cluster has 3 members, and etcd doesn't start on any of the three servers. I get this error:

2016-10-08 10:52:06.782409 I | etcdmain: etcd Version: 3.0.4
2016-10-08 10:52:06.782497 I | etcdmain: Git SHA: d53923c
2016-10-08 10:52:06.782512 I | etcdmain: Go Version: go1.6.3
2016-10-08 10:52:06.782526 I | etcdmain: Go OS/Arch: linux/amd64
2016-10-08 10:52:06.782545 I | etcdmain: setting maximum number of CPUs to 2, total number of available CPUs is 2
2016-10-08 10:52:06.782629 N | etcdmain: the server is already initialized as member before, starting as etcd member...
2016-10-08 10:52:06.782799 I | etcdmain: listening for peers on http://127.0.0.1:2380
2016-10-08 10:52:06.782897 I | etcdmain: listening for client requests on 127.0.0.1:2379
2016-10-08 10:52:06.788906 I | etcdserver: recovered store from snapshot at index 9760976
2016-10-08 10:52:06.788945 I | etcdserver: name = sprcom-prod-etcd-02
2016-10-08 10:52:06.788959 I | etcdserver: data dir = /var/cache/etcd/state
2016-10-08 10:52:06.788972 I | etcdserver: member dir = /var/cache/etcd/state/member
2016-10-08 10:52:06.788985 I | etcdserver: heartbeat = 100ms
2016-10-08 10:52:06.788997 I | etcdserver: election = 1000ms
2016-10-08 10:52:06.789008 I | etcdserver: snapshot count = 10000
2016-10-08 10:52:06.789027 I | etcdserver: advertise client URLs = http://127.0.0.1:2379
2016-10-08 10:52:07.498888 I | etcdserver: restarting member 4679ad1b5fc91709 in cluster b3f86e2c8726fd14 at commit index 9761441
2016-10-08 10:52:07.500487 I | raft: 4679ad1b5fc91709 became follower at term 35428
2016-10-08 10:52:07.500677 I | raft: newRaft 4679ad1b5fc91709 [peers: [4679ad1b5fc91709,94fab184616e55d6,f11145700b13951c], term: 35428, commit: 9761441, applied: 9760976, lastindex: 9761441, lastterm: 14245]
2016-10-08 10:52:07.501172 I | api: enabled capabilities for version 2.3
2016-10-08 10:52:07.501251 I | membership: added member 4679ad1b5fc91709 [http://10.230.33.164:2380] to cluster b3f86e2c8726fd14 from store
2016-10-08 10:52:07.501280 I | membership: added member 94fab184616e55d6 [http://10.230.33.162:2380] to cluster b3f86e2c8726fd14 from store
2016-10-08 10:52:07.501311 I | membership: added member f11145700b13951c [http://10.230.33.165:2380] to cluster b3f86e2c8726fd14 from store
2016-10-08 10:52:07.501347 I | membership: set the cluster version to 2.3 from store
2016-10-08 10:52:07.510210 I | etcdmain: stopping listening for client requests on 127.0.0.1:2379
2016-10-08 10:52:07.510338 I | etcdmain: stopping listening for peers on http://127.0.0.1:2380
2016-10-08 10:52:07.510382 C | etcdmain: database file (/var/cache/etcd/state/member/snap/db index 0) does not match with snapshot (index 9760976).

I see there is a similar bug #5857, but I don't know if this is the same issue.


xiang90 · 2016-10-08T11:07:00Z

This is the same issue. Should already be fixed.

gyuho · 2016-10-08T17:10:44Z

@tomasramanauskas Could you try the latest release? https://github.com/coreos/etcd/releases/tag/v3.0.12

xiang90 · 2016-10-11T18:21:44Z

Closing this one since I am pretty sure it is fixed.

karankh · 2017-04-23T07:43:29Z

@xiang90 Hey, we are using etcd 3.0.13 and still see this issue. I'm not sure whether the reproduction steps mentioned above match ours, but we see the same logs. Is it possible that this issue can still happen even after your fix?
2017-04-23 07:33:42.897819 I | mvcc: restore compact to 174613902
2017-04-23 07:33:44.218517 I | etcdmain: stopping listening for client requests on 0.0.0.0:2379
2017-04-23 07:33:44.218881 I | etcdmain: stopping listening for peers on https://0.0.0.0:2380
2017-04-23 07:33:44.218920 C | etcdmain: database file (/var/etcd/member/snap/db index 857034582) does not match with snapshot (index 858149718).
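
For anyone comparing reports, the fatal `etcdmain` line has the same shape in both logs in this thread: a db file path, the db index, and the snapshot index. Below is a minimal sketch for pulling those values out of a log line so the gap can be compared across members. The regular expression is written against the wording quoted here (3.0.x) and may not match other versions; `parse_mismatch` is a hypothetical helper name, not part of etcd.

```python
import re

# Matches the fatal line as quoted in this thread (etcd 3.0.x wording).
# Assumption: other etcd versions may phrase this message differently.
FATAL_RE = re.compile(
    r"database file \((?P<path>\S+) index (?P<db>\d+)\) "
    r"does not match with snapshot \(index (?P<snap>\d+)\)"
)


def parse_mismatch(line):
    """Return (db_path, db_index, snapshot_index, gap), or None if the line does not match."""
    m = FATAL_RE.search(line)
    if m is None:
        return None
    db_index = int(m.group("db"))
    snap_index = int(m.group("snap"))
    # gap = how far the db file's index is behind the snapshot index
    return m.group("path"), db_index, snap_index, snap_index - db_index


if __name__ == "__main__":
    line = (
        "2017-04-23 07:33:44.218920 C | etcdmain: database file "
        "(/var/etcd/member/snap/db index 857034582) does not match "
        "with snapshot (index 858149718)."
    )
    print(parse_mismatch(line))
```

On the two fatal lines in this thread, the db index is behind the snapshot index by 9760976 in the original report (db index 0) and by 1115136 in the 3.0.13 report.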

garyyang85 · 2017-12-20T06:13:11Z

@xiang90 the issue still exists on 3.0.17

YSunLIN · 2018-08-20T06:59:04Z

@xiang90 the issue still exists on 3.3.8

xiang90 closed this as completed Oct 11, 2016
