[0.10.0-rc1] Cannot restart node with -join args specified #5492

Closed
jwilder opened this issue Feb 1, 2016 · 3 comments

@jwilder
Contributor

jwilder commented Feb 1, 2016

When creating a cluster, nodes that are restarted while still specifying -join args get stuck and won't boot until they are started without them.

@jwilder jwilder added this to the 0.10.0 milestone Feb 1, 2016
@jwilder jwilder changed the title [0.10.0] Cannot restart node with -join args specified [0.10.0-rc1] Cannot restart node with -join args specified Feb 1, 2016
@PierreF
Contributor

PierreF commented Feb 2, 2016

This is not fixed when using the default provided configuration file (/etc/influxdb/influxdb.conf).

E.g., it does NOT work when starting with

node1$ influxd -config /etc/influxdb/influxdb.conf
node2$ influxd -config /etc/influxdb/influxdb.conf -join node1:8091

but does work with:

node1$ influxd
node2$ influxd -join node1:8091

After some trial and error, the issue seems to be the "dir" option:

# directory where server ID and cluster metaservers information will be kept
dir = "/var/lib/influxdb"

When it is commented out or set to an empty string, it works.
Changing the value alone is not enough to switch between the working and non-working cases; I had to remove /var/lib/influxdb/* between runs.
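
A quick way to see which case a node is in is to check whether the top-level "dir" option is active in its config file. The following is only a sketch: the function name `dir_option_state` is made up for illustration, and it assumes the option sits at the start of a line, as in the snippet quoted above. A commented-out line or an empty string both count as "unset".

```shell
# Hypothetical helper (not part of InfluxDB): report whether the top-level
# "dir" option is active in a config file.
# Prints "set" for an uncommented, non-empty value, otherwise "unset".
dir_option_state() {
    conf="$1"
    # Match only lines that start with `dir = "<non-empty>"`; lines starting
    # with `#`, indented per-section options, and `dir = ""` do not match.
    if grep -Eq '^dir[[:space:]]*=[[:space:]]*"[^"]+"' "$conf"; then
        echo "set"
    else
        echo "unset"
    fi
}
```

For example, `dir_option_state /etc/influxdb/influxdb.conf` could be run on each node before starting it.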

Steps to reproduce:

  • All nodes are fresh, just recreated.
node1# ls -la /var/lib/influxdb
total 8
drwxr-xr-x  2 influxdb influxdb 4096 Feb  2 08:03 .
drwxr-xr-x 27 root     root     4096 Feb  2 09:39 ..
node1# env | grep INFLUXDB_
INFLUXDB_META_BIND_ADDRESS=node1:8088
INFLUXDB_META_HTTP_BIND_ADDRESS=node1:8091
INFLUXDB_HTTP_BIND_ADDRESS=node1:8086

node2# ls -la /var/lib/influxdb
total 8
drwxr-xr-x  2 influxdb influxdb 4096 Feb  2 08:03 .
drwxr-xr-x 27 root     root     4096 Feb  2 09:39 ..
node2# env | grep INFLUXDB_
INFLUXDB_META_BIND_ADDRESS=node2:8088
INFLUXDB_META_HTTP_BIND_ADDRESS=node2:8091
INFLUXDB_HTTP_BIND_ADDRESS=node2:8086
  • Start the cluster for the first time:
node1# influxd -config /etc/influxdb/influxdb.conf 
2016/02/02 11:12:30 InfluxDB starting, version 0.10.0.n1454400035, branch master, commit fd85ae5b7fada9142aac66603200e6cc124f9aed, built 2016-02-02T08:01:46.972758
2016/02/02 11:12:30 Go version go1.4.3, GOMAXPROCS set to 4
2016/02/02 11:12:31 Using configuration at: /etc/influxdb/influxdb.conf
[meta] 2016/02/02 11:12:31 Starting meta service
[meta] 2016/02/02 11:12:31 Listening on HTTP: 172.19.0.2:8091
[metastore] 2016/02/02 11:12:31 Using data dir: /var/lib/influxdb/meta
[metastore] 2016/02/02 11:12:31 Node at node1:8088 [Follower]
[metastore] 2016/02/02 11:12:32 Node at node1:8088 [Leader]. peers=[node1:8088]
[...]


node2# influxd -config /etc/influxdb/influxdb.conf -join influxdb1:8091
2016/02/02 11:13:42 InfluxDB starting, version 0.10.0.n1454400035, branch master, commit fd85ae5b7fada9142aac66603200e6cc124f9aed, built 2016-02-02T08:01:46.972758
2016/02/02 11:13:42 Go version go1.4.3, GOMAXPROCS set to 4
2016/02/02 11:13:42 Using configuration at: /etc/influxdb/influxdb.conf
[meta] 2016/02/02 11:13:42 Starting meta service
[meta] 2016/02/02 11:13:42 Listening on HTTP: 172.19.0.3:8091
[metastore] 2016/02/02 11:13:42 Using data dir: /var/lib/influxdb/meta
[metastore] 2016/02/02 11:13:42 Node at node2:8088 [Follower]
[...]
2016/02/02 11:13:52 updated node metaservers with: [node1:8091 node2:8091]
  • Shut down node2, then node1.
  • Restart node1 then node2:
node1# influxd -config /etc/influxdb/influxdb.conf 
2016/02/02 11:15:06 InfluxDB starting, version 0.10.0.n1454400035, branch master, commit fd85ae5b7fada9142aac66603200e6cc124f9aed, built 2016-02-02T08:01:46.972758
2016/02/02 11:15:06 Go version go1.4.3, GOMAXPROCS set to 4
2016/02/02 11:15:06 Using configuration at: /etc/influxdb/influxdb.conf
[meta] 2016/02/02 11:15:06 Starting meta service
[meta] 2016/02/02 11:15:06 Listening on HTTP: 172.19.0.2:8091
[metastore] 2016/02/02 11:15:06 Using data dir: /var/lib/influxdb/meta
[metastore] 2016/02/02 11:15:06 Node at node1:8088 [Follower]

<no more messages, even after node2 startup>


node2# influxd -config /etc/influxdb/influxdb.conf -join node1:8091
2016/02/02 11:15:41 InfluxDB starting, version 0.10.0.n1454400035, branch master, commit fd85ae5b7fada9142aac66603200e6cc124f9aed, built 2016-02-02T08:01:46.972758
2016/02/02 11:15:41 Go version go1.4.3, GOMAXPROCS set to 4
2016/02/02 11:15:41 Using configuration at: /etc/influxdb/influxdb.conf
[meta] 2016/02/02 11:15:41 Starting meta service
[meta] 2016/02/02 11:15:41 Listening on HTTP: 172.19.0.3:8091
[metastore] 2016/02/02 11:15:41 Using data dir: /var/lib/influxdb/meta
  • Stop both nodes.
  • Drop data and update config:
node1# rm -vfr /var/lib/influxdb/*
removed directory: '/var/lib/influxdb/data/_internal/monitor/1'
removed directory: '/var/lib/influxdb/data/_internal/monitor'
removed directory: '/var/lib/influxdb/data/_internal'
removed directory: '/var/lib/influxdb/data'
removed directory: '/var/lib/influxdb/hh'
removed '/var/lib/influxdb/meta/raft.db'
removed '/var/lib/influxdb/meta/peers.json'
removed directory: '/var/lib/influxdb/meta/snapshots'
removed directory: '/var/lib/influxdb/meta'
removed '/var/lib/influxdb/node.json'
removed '/var/lib/influxdb/wal/_internal/monitor/1/_00001.wal'
removed directory: '/var/lib/influxdb/wal/_internal/monitor/1'
removed directory: '/var/lib/influxdb/wal/_internal/monitor'
removed directory: '/var/lib/influxdb/wal/_internal'
removed directory: '/var/lib/influxdb/wal'

node2# rm -fr /var/lib/influxdb/*

node1# sed -i 's@^dir = "/var/lib/influxdb"$@#dir = "/var/lib/influxdb"@' /etc/influxdb/influxdb.conf
node2# sed -i 's@^dir = "/var/lib/influxdb"$@#dir = "/var/lib/influxdb"@' /etc/influxdb/influxdb.conf
  • Redo the previous steps:
  • Start the cluster for the first time:
node1# influxd -config /etc/influxdb/influxdb.conf 
node2# influxd -config /etc/influxdb/influxdb.conf -join node1:8091
  • Stop the cluster.
  • Restart both nodes:
node1# influxd -config /etc/influxdb/influxdb.conf 
2016/02/02 11:20:32 InfluxDB starting, version 0.10.0.n1454400035, branch master, commit fd85ae5b7fada9142aac66603200e6cc124f9aed, built 2016-02-02T08:01:46.972758
2016/02/02 11:20:32 Go version go1.4.3, GOMAXPROCS set to 4
2016/02/02 11:20:32 Using configuration at: /etc/influxdb/influxdb.conf
[meta] 2016/02/02 11:20:32 Starting meta service
[meta] 2016/02/02 11:20:32 Listening on HTTP: 172.19.0.2:8091
[metastore] 2016/02/02 11:20:32 Using data dir: /var/lib/influxdb/meta
[metastore] 2016/02/02 11:20:32 Node at node1:8088 [Follower]

< node 2 started >
[metastore] 2016/02/02 11:20:57 Node at node1:8088 [Leader]. peers=[node2:8088 node1:8088]
[meta] 2016/02/02 11:20:57 172.19.0.2 - - [02/Feb/2016:11:20:57 +0000] GET /?index=0 HTTP/1.1 200 170 - Go 1.1 package http 07bd0cdd-c99f-11e5-8001-000000000000 1.401581ms
[store] 2016/02/02 11:20:57 Using data dir: /var/lib/influxdb/data
[tsm1wal] 2016/02/02 11:20:57 tsm1 WAL starting with 10485760 segment size
[...]

node2# influxd -config /etc/influxdb/influxdb.conf -join node1:8091
2016/02/02 11:20:55 InfluxDB starting, version 0.10.0.n1454400035, branch master, commit fd85ae5b7fada9142aac66603200e6cc124f9aed, built 2016-02-02T08:01:46.972758
2016/02/02 11:20:55 Go version go1.4.3, GOMAXPROCS set to 4
2016/02/02 11:20:55 Using configuration at: /etc/influxdb/influxdb.conf
[meta] 2016/02/02 11:20:55 Starting meta service
[meta] 2016/02/02 11:20:55 Listening on HTTP: 172.19.0.3:8091
[metastore] 2016/02/02 11:20:55 Using data dir: /var/lib/influxdb/meta
[metastore] 2016/02/02 11:20:55 Node at node2:8088 [Follower]
[meta] 2016/02/02 11:20:57 172.19.0.3 - - [02/Feb/2016:11:20:57 +0000] GET /?index=0 HTTP/1.1 200 170 - Go 1.1 package http 07c29ea2-c99f-11e5-8001-000000000000 1.005385ms
[store] 2016/02/02 11:20:57 Using data dir: /var/lib/influxdb/data
[...]
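
In the logs above, the stuck restart stops at the "[Follower]" line, while the working restart goes on to print a "[Leader]" line from the metastore. If the output of influxd is captured to a file, that difference can be checked mechanically. This is a rough sketch only: the function name `leader_elected` and the idea of a captured log file are assumptions for illustration, and it relies on nothing more than the log format shown in this comment.

```shell
# Hypothetical helper: succeed (exit 0) if a captured influxd log contains a
# metastore line announcing a leader, fail (non-zero exit) otherwise.
leader_elected() {
    log="$1"
    # Keep only metastore lines, then look for the literal "[Leader]" marker.
    grep -F '[metastore]' "$log" | grep -Fq '[Leader]'
}
```

Usage might look like `leader_elected node1.log && echo "leader seen" || echo "still waiting"`, where `node1.log` is whatever file the output was redirected to.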

@PierreF
Contributor

PierreF commented Feb 2, 2016

Note: the only difference I see is where "node.json" is located:

  • With dir="/var/lib/influxdb", node.json is /var/lib/influxdb/node.json
  • Without dir, node.json is /var/lib/influxdb/meta/node.json
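
To confirm which of the two locations a given node is using, the file can simply be searched for under the data directory. A minimal sketch, assuming the default base path from this report; `find_node_json` is a made-up name.

```shell
# Hypothetical helper: list every node.json below a base directory.
# The default base path is the one used in this report.
find_node_json() {
    base="${1:-/var/lib/influxdb}"
    # Errors (e.g. a missing directory) are silenced; output is sorted so the
    # result is stable if more than one copy exists.
    find "$base" -type f -name node.json 2>/dev/null | sort
}
```

Running `find_node_json` on a node after its first start prints the path or paths where node.json was written.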

@PierreF
Contributor

PierreF commented Feb 3, 2016

It is now also fixed with the default /etc/influxdb/influxdb.conf file, probably by PR #5515.
