Etcd (3.0.4) doesn't start after system reboot. #6612

Closed
tomasramanauskas opened this issue Oct 8, 2016 · 6 comments

Comments

@tomasramanauskas

tomasramanauskas commented Oct 8, 2016

Bug reporting

Hello, we had an outage that caused the Etcd cluster to reboot. We have 3 members in the Etcd cluster, and Etcd doesn't start on any of the three servers. I get this error:

2016-10-08 10:52:06.782409 I | etcdmain: etcd Version: 3.0.4
2016-10-08 10:52:06.782497 I | etcdmain: Git SHA: d53923c
2016-10-08 10:52:06.782512 I | etcdmain: Go Version: go1.6.3
2016-10-08 10:52:06.782526 I | etcdmain: Go OS/Arch: linux/amd64
2016-10-08 10:52:06.782545 I | etcdmain: setting maximum number of CPUs to 2, total number of available CPUs is 2
2016-10-08 10:52:06.782629 N | etcdmain: the server is already initialized as member before, starting as etcd member...
2016-10-08 10:52:06.782799 I | etcdmain: listening for peers on http://127.0.0.1:2380
2016-10-08 10:52:06.782897 I | etcdmain: listening for client requests on 127.0.0.1:2379
2016-10-08 10:52:06.788906 I | etcdserver: recovered store from snapshot at index 9760976
2016-10-08 10:52:06.788945 I | etcdserver: name = sprcom-prod-etcd-02
2016-10-08 10:52:06.788959 I | etcdserver: data dir = /var/cache/etcd/state
2016-10-08 10:52:06.788972 I | etcdserver: member dir = /var/cache/etcd/state/member
2016-10-08 10:52:06.788985 I | etcdserver: heartbeat = 100ms
2016-10-08 10:52:06.788997 I | etcdserver: election = 1000ms
2016-10-08 10:52:06.789008 I | etcdserver: snapshot count = 10000
2016-10-08 10:52:06.789027 I | etcdserver: advertise client URLs = http://127.0.0.1:2379
2016-10-08 10:52:07.498888 I | etcdserver: restarting member 4679ad1b5fc91709 in cluster b3f86e2c8726fd14 at commit index 9761441
2016-10-08 10:52:07.500487 I | raft: 4679ad1b5fc91709 became follower at term 35428
2016-10-08 10:52:07.500677 I | raft: newRaft 4679ad1b5fc91709 [peers: [4679ad1b5fc91709,94fab184616e55d6,f11145700b13951c], term: 35428, commit: 9761441, applied: 9760976, lastindex: 9761441, lastterm: 14245]
2016-10-08 10:52:07.501172 I | api: enabled capabilities for version 2.3
2016-10-08 10:52:07.501251 I | membership: added member 4679ad1b5fc91709 [http://10.230.33.164:2380] to cluster b3f86e2c8726fd14 from store
2016-10-08 10:52:07.501280 I | membership: added member 94fab184616e55d6 [http://10.230.33.162:2380] to cluster b3f86e2c8726fd14 from store
2016-10-08 10:52:07.501311 I | membership: added member f11145700b13951c [http://10.230.33.165:2380] to cluster b3f86e2c8726fd14 from store
2016-10-08 10:52:07.501347 I | membership: set the cluster version to 2.3 from store
2016-10-08 10:52:07.510210 I | etcdmain: stopping listening for client requests on 127.0.0.1:2379
2016-10-08 10:52:07.510338 I | etcdmain: stopping listening for peers on http://127.0.0.1:2380
2016-10-08 10:52:07.510382 C | etcdmain: database file (/var/cache/etcd/state/member/snap/db index 0) does not match with snapshot (index 9760976).

I see there is a similar bug #5857, but I don't know if this is the same issue.
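
In the log above, the `applied` value on the `newRaft` line equals the snapshot index named in the fatal message (9760976), while `commit` is ahead of it. Below is a minimal sketch for pulling those fields out of such a line, assuming the line format shown in this log. `parse_new_raft_state` is a hypothetical helper written for illustration, not part of etcd.

```python
import re

def parse_new_raft_state(line):
    """Extract the integer fields (term, commit, applied, ...) from a newRaft log line."""
    # Only scan the text after "newRaft" so the timestamp is never considered.
    _, _, tail = line.partition("newRaft")
    # Fields look like "commit: 9761441"; "peers: [" is skipped because no digit follows it.
    return {key: int(value) for key, value in re.findall(r"(\w+): (\d+)", tail)}

# Line copied from the log in this report.
line = ("2016-10-08 10:52:07.500677 I | raft: newRaft 4679ad1b5fc91709 "
        "[peers: [4679ad1b5fc91709,94fab184616e55d6,f11145700b13951c], "
        "term: 35428, commit: 9761441, applied: 9760976, "
        "lastindex: 9761441, lastterm: 14245]")

state = parse_new_raft_state(line)
# Number of committed entries beyond the applied (snapshot) index.
print(state["commit"] - state["applied"])  # 465
```

This only reads what the log already states. It does not explain why the database file reports index 0.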

@xiang90
Contributor

xiang90 commented Oct 8, 2016

This is the same issue. Should already be fixed.

@gyuho
Contributor

gyuho commented Oct 8, 2016

@xiang90
Contributor

xiang90 commented Oct 11, 2016

Closing this one since I am pretty sure it is fixed.

xiang90 closed this as completed Oct 11, 2016
@karankh

karankh commented Apr 23, 2017

@xiang90 Hey, we are using etcd 3.0.13 and still see this issue. I'm not sure whether the reproduce steps mentioned above match ours, but we still see the same logs. Is it possible that this issue can still happen even after your fix?

2017-04-23 07:33:42.897819 I | mvcc: restore compact to 174613902
2017-04-23 07:33:44.218517 I | etcdmain: stopping listening for client requests on 0.0.0.0:2379
2017-04-23 07:33:44.218881 I | etcdmain: stopping listening for peers on https://0.0.0.0:2380
2017-04-23 07:33:44.218920 C | etcdmain: database file (/var/etcd/member/snap/db index 857034582) does not match with snapshot (index 858149718).
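
The two fatal lines in this thread differ in one visible way: the original 3.0.4 report shows a db index of 0, while the 3.0.13 log shows a nonzero db index that is behind the snapshot index. Below is a minimal sketch for extracting both indices and the gap between them, assuming the message wording seen in these two logs (other etcd versions may phrase it differently). `MISMATCH_RE` and `index_gap` are hypothetical names, not etcd APIs.

```python
import re

# Pattern for the fatal etcdmain message as it appears in this thread.
MISMATCH_RE = re.compile(
    r"database file \((?P<path>\S+) index (?P<db>\d+)\) "
    r"does not match with snapshot \(index (?P<snap>\d+)\)"
)

def index_gap(line):
    """Return snapshot index minus db index, or None if the line is not a mismatch message."""
    m = MISMATCH_RE.search(line)
    if m is None:
        return None
    return int(m.group("snap")) - int(m.group("db"))

# Fatal line from the original 3.0.4 report.
first = ("2016-10-08 10:52:07.510382 C | etcdmain: database file "
         "(/var/cache/etcd/state/member/snap/db index 0) does not match "
         "with snapshot (index 9760976).")
# Fatal line from the 3.0.13 report.
second = ("2017-04-23 07:33:44.218920 C | etcdmain: database file "
          "(/var/etcd/member/snap/db index 857034582) does not match "
          "with snapshot (index 858149718).")

print(index_gap(first))   # 9760976
print(index_gap(second))  # 1115136
```

Including the computed gap alongside the etcd version in a follow-up comment would make it easier to tell whether two reports show the same symptom.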

@garyyang85

@xiang90 the issue still exists on 3.0.17

@YSunLIN

YSunLIN commented Aug 20, 2018

@xiang90 the issue still exists on 3.3.8
