
storage: panic in raft.campaign #20629

Closed
bdarnell opened this issue Dec 11, 2017 · 4 comments · Fixed by #20704

bdarnell (Contributor)

From #20494 (this has been seen in jepsen runs with multiple nemeses including strobe-skews and majority-ring):

W171209 08:51:50.342206 413 storage/replica_range_lease.go:418  can't determine lease status due to node liveness error: node not in the liveness table
W171209 08:51:50.518491 102 storage/store.go:1289  [n1,s1,r7/1:/Table/{SystemCon…-11}] could not gossip system config: [NotLeaseHolderError] r7: replica (n1,s1):1 not lease holder; lease holder unknown
E171209 08:51:51.089407 472 util/log/crash_reporting.go:119  [n1,s1,r31/1:/{Table/50-Max}] a panic has occurred!
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1261959]

goroutine 472 [running]:
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).Recover(0xc42076a2d0, 0x312fb00, 0xc425724380)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:181 +0x136
panic(0x1e890a0, 0x30e36f0)
	/usr/local/go/src/runtime/panic.go:491 +0x283
github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft.(*Progress).maybeUpdate(...)
	/go/src/github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft/progress.go:116
github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft.(*raft).appendEntry(0xc4207523c0, 0xc4256ee5a0, 0x1, 0x1)
	/go/src/github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft/raft.go:547 +0xd9
github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft.(*raft).becomeLeader(0xc4207523c0)
	/go/src/github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft/raft.go:648 +0x287
github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft.(*raft).campaign(0xc4207523c0, 0x20d5e4a, 0x10)
	/go/src/github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft/raft.go:671 +0x668
github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft.(*raft).Step(0xc4207523c0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/go/src/github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft/raft.go:783 +0x33a
github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft.(*RawNode).Campaign(0xc4256fc2c0, 0xc420496d80, 0x3)
	/go/src/github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft/rawnode.go:140 +0x71
github.com/cockroachdb/cockroach/pkg/storage.(*Replica).withRaftGroupLocked(0xc420161b00, 0x14fe9438e254d601, 0x21b2a38, 0x0, 0x200000002)
	/go/src/github.com/cockroachdb/cockroach/pkg/storage/replica.go:594 +0x38e
github.com/cockroachdb/cockroach/pkg/storage.(*Replica).maybeInitializeRaftGroup(0xc420161b00, 0x312fbc0, 0xc4256e93b0)
	/go/src/github.com/cockroachdb/cockroach/pkg/storage/replica.go:1849 +0x185
github.com/cockroachdb/cockroach/pkg/storage.(*Replica).Send(0xc420161b00, 0x312fbc0, 0xc4256e93b0, 0x14fe943c314d23c9, 0x0, 0x0, 0x0, 0x1f, 0x0, 0x0, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/storage/replica.go:1873 +0x136
github.com/cockroachdb/cockroach/pkg/storage.(*pendingLeaseRequest).requestLeaseAsync.func2(0x312fb00, 0xc425724380)
	/go/src/github.com/cockroachdb/cockroach/pkg/storage/replica_range_lease.go:274 +0x577
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask.func1(0xc42076a2d0, 0x312fb00, 0xc425724380, 0xc4207d0500, 0x35, 0x0, 0x0, 0xc4207b9b80)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:267 +0xe6
created by github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:262 +0x133
bdarnell self-assigned this Dec 11, 2017

bdarnell (Contributor Author)

This appears to be related to table creation: it happens more often in multi-table tests (bank-multitable, g2, sequential), although it has been seen in the single-table register test. Nemeses shouldn't matter here because it happens before the nemeses start. In each instance I've examined, it occurs on /{Table/50-Max}, the range created by the first userspace table. I've seen it on nodes 1, 2, and 3.

The stack trace indicates that at the time we initialize our raft group, raft believes the group has one member, and it's not us. This is presumably some sort of race between the table-creation splits and the initial upreplication.
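
For illustration, here is a minimal, self-contained sketch (not the vendored etcd/raft source) of the failure mode: when the local replica ID is missing from raft's progress map, the equivalent of appendEntry's getProgress(r.id).maybeUpdate(...) dereferences a nil pointer, which matches the panic above.

// Minimal sketch of the failure mode (not the vendored etcd/raft code).
package main

// progress stands in for raft.Progress in this sketch.
type progress struct{ Match uint64 }

func (pr *progress) maybeUpdate(n uint64) bool {
	// pr is nil when the lookup missed; reading pr.Match then panics with the
	// same "invalid memory address or nil pointer dereference" as above.
	updated := false
	if pr.Match < n {
		pr.Match = n
		updated = true
	}
	return updated
}

func main() {
	const localID = 2
	// Membership derived from an empty HardState/ConfState: the local replica
	// never makes it into the progress map.
	prs := map[uint64]*progress{}
	prs[localID].maybeUpdate(5) // panics here
}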

bdarnell (Contributor Author)

Ah, there may not be a nemesis, but there's this restart. So we're killing the cluster immediately after creating a database (which creates the split here). The "initiating a split at" message is seen in the logs right before node 1 is killed in some of the instances, but not all, because jepsen uses kill -9 and the logs don't always get flushed.

I'll trace through one of the panics:

We see node 2 applying many preemptive snapshots, suggesting that the initial splits all happened on node 1 before the cluster became replicated:

I171211 09:35:54.793338 234 storage/replica_raftstorage.go:733  [n2,s2,r17/?:{-}] applying preemptive snapshot at index 13 (id=63b62346, encoded size=1000, 1 rocksdb batches, 3 log entries)
I171211 09:35:54.800401 234 storage/replica_raftstorage.go:739  [n2,s2,r17/?:/{Table/20-Max}] applied preemptive snapshot in 7ms [clear=0ms batch=0ms entries=0ms commit=7ms]

As the initial splitting and rebalancing finishes, node 2 gains the lease of r17 (Table/20-Max), removes a replica on node 5, then initiates a split at Table/50. It is then killed (potentially losing log messages) and restarts:

I171211 09:35:57.578335 135 storage/replica_proposal.go:195  [n2,s2,r17/2:/{Table/20-Max}] new range lease repl=(n2,s2):2 start=1512984957.545605788,0 epo=1 pro=1512984957.545608462,0 following repl=(n1,s1):1 start=0.000000000,0 epo=1 pro=1512984956.536052108,0
I171211 09:35:58.069607 170 storage/replica_command.go:2153  [replicate,n2,s2,r17/2:/{Table/20-Max}] change replicas (REMOVE_REPLICA (n5,s5):4): read existing descriptor r17:/{Table/20-Max} [(n1,s1):1, (n2,s2):2, (n3,s3):3, (n5,s5):4, next=5]
I171211 09:35:58.127338 436 storage/replica.go:3161  [n2,s2,r17/2:/{Table/20-Max}] proposing REMOVE_REPLICA((n5,s5):4): updated=[(n1,s1):1 (n2,s2):2 (n3,s3):3] next=5
I171211 09:35:58.537156 169 storage/replica_command.go:1231  [split,n2,s2,r17/2:/{Table/20-Max}] initiating a split of this range at key /Table/50 [r31]
W171211 09:36:04.058098 1 cli/start.go:885  RUNNING IN INSECURE MODE!

Node 5 was successfully removed and nothing else happened in its logs; I don't think it is relevant to this story.

I171211 09:35:58.143754 352 storage/store.go:3287  [n5,s5,r17/4:/{Table/20-Max}] added to replica GC queue (peer suggestion)
I171211 09:35:58.160953 228 storage/store.go:2157  [replicaGC,n5,s5,r17/4:/{Table/20-Max}] removing replica
I171211 09:35:58.175600 228 storage/replica.go:805  [replicaGC,n5,s5,r17/4:/{Table/20-Max}] removed 12 (0+12) keys in 15ms [clear=0ms commit=15ms]

r17 and r31 should initially have n1, n2, and n3 as members. Panics later occur on n2 and n3. After the restart, n1 gets the lease for r17 (note that this is after the split finished: /Table/{20-50}).

I171211 09:36:09.215818 99 storage/replica_proposal.go:195  [n1,s1,r17/1:/Table/{20-50}] new range lease repl=(n1,s1):1 start=1512984969.146990212,0 epo=2 pro=1512984969.147027549,0 following repl=(n2,s2):2 start=1512984957.545605788,0 epo=1 pro=1512984957.545608462,0

This is the only mention of r17 or r31 on any node after the restart until the panics.

E171211 09:36:11.143317 410 util/log/crash_reporting.go:119  [n3,s3,r31/3:/{Table/50-Max}] a panic has occurred!
E171211 09:36:11.400620 410 util/log/crash_reporting.go:369  [n3,s3,r31/3:/{Table/50-Max}] Reported as error 20c8f50aeb214fc58005c1afdfb983e0
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x12635f9]

goroutine 410 [running]:
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).Recover(0xc42079e3f0, 0x313aba0, 0xc425cce640)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:181 +0x136
panic(0x1e90d00, 0x30ee6f0)
	/usr/local/go/src/runtime/panic.go:491 +0x283
github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft.(*Progress).maybeUpdate(...)
	/go/src/github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft/progress.go:116
github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft.(*raft).appendEntry(0xc42009d950, 0xc425cd13b0, 0x1, 0x1)
	/go/src/github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft/raft.go:547 +0xd9
github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft.(*raft).becomeLeader(0xc42009d950)
	/go/src/github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft/raft.go:648 +0x287
github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft.(*raft).campaign(0xc42009d950, 0x20de0b2, 0x10)
	/go/src/github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft/raft.go:671 +0x668
github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft.(*raft).Step(0xc42009d950, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/go/src/github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft/raft.go:783 +0x33a
github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft.(*RawNode).Campaign(0xc425cce6c0, 0xc4204525d0, 0x3)
	/go/src/github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft/rawnode.go:140 +0x71
github.com/cockroachdb/cockroach/pkg/storage.(*Replica).withRaftGroupLocked(0xc4207e0300, 0x14ff33cd8103ea01, 0x21baf10, 0x0, 0x200000002)
	/go/src/github.com/cockroachdb/cockroach/pkg/storage/replica.go:594 +0x38e
github.com/cockroachdb/cockroach/pkg/storage.(*Replica).maybeInitializeRaftGroup(0xc4207e0300, 0x313ac60, 0xc425c75500)
	/go/src/github.com/cockroachdb/cockroach/pkg/storage/replica.go:1849 +0x185
github.com/cockroachdb/cockroach/pkg/storage.(*Replica).Send(0xc4207e0300, 0x313ac60, 0xc425c75500, 0x14ff33d0ab7e00bd, 0x0, 0x0, 0x0, 0x1f, 0x0, 0x0, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/storage/replica.go:1873 +0x136
github.com/cockroachdb/cockroach/pkg/storage.(*pendingLeaseRequest).requestLeaseAsync.func2(0x313aba0, 0xc425cce640)
	/go/src/github.com/cockroachdb/cockroach/pkg/storage/replica_range_lease.go:274 +0x577
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask.func1(0xc42079e3f0, 0x313aba0, 0xc425cce640, 0xc425cd40c0, 0x35, 0x0, 0x0, 0xc424fb9280)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:267 +0xe6
created by github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:262 +0x133

bdarnell (Contributor Author)

Found it! replicaRaftStorage.InitialState assumes that ranges without a persisted HardState have no members.

// For uninitialized ranges, membership is unknown at this point.
if raft.IsEmptyHardState(hs) || err != nil {
	return raftpb.HardState{}, raftpb.ConfState{}, err
}

But after a split, the HardState is written separately from the main batch that writes the range descriptor, in splitPostApply:

if err := rsl.SynthesizeRaftState(ctx, r.store.Engine()); err != nil {

We need to either move the call to SynthesizeRaftState from splitPostApply to somewhere it can be included in the main command batch (we used to do it at evaluation time, but that's not safe with propEvalKV; maybe we can put it in applyRaftCommand?), or synthesize appropriate defaults when no HardState has been written (we used to do that, but changed to always write a HardState; cc @tschottdorf).
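
For concreteness, here is a rough sketch of the second option (synthesizing defaults), under the assumption that InitialState can fall back to the persisted range descriptor when the HardState is empty. The names used here (loadHardState, r.mu.state.Desc, IsInitialized) are guesses at the code of that era; this is not the fix that actually landed.

// Hypothetical variant of replicaRaftStorage.InitialState: if the HardState is
// missing but the descriptor is already initialized (the post-split crash
// window), derive the ConfState from the descriptor instead of reporting an
// empty membership.
func (r *replicaRaftStorage) InitialState() (raftpb.HardState, raftpb.ConfState, error) {
	ctx := r.AnnotateCtx(context.TODO())
	hs, err := r.mu.stateLoader.loadHardState(ctx, r.store.Engine())
	if err != nil {
		return raftpb.HardState{}, raftpb.ConfState{}, err
	}
	desc := r.mu.state.Desc
	if raft.IsEmptyHardState(hs) && !desc.IsInitialized() {
		// Truly uninitialized range: membership is unknown at this point.
		return raftpb.HardState{}, raftpb.ConfState{}, nil
	}
	// Either a HardState exists, or the descriptor still tells us who the
	// members are even though the HardState write was lost in the crash.
	var cs raftpb.ConfState
	for _, rep := range desc.Replicas {
		cs.Nodes = append(cs.Nodes, uint64(rep.ReplicaID))
	}
	return hs, cs, nil
}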

tbg (Member) commented Dec 11, 2017

Good find! I think it should be straightforward to make the synthesis a pre-apply trigger similar to the one for SSTable ingestion. applyRaftCommand would have to take two write batches (or you Apply-Write-Repr the original batch, which may be just fine since it's only on splits).
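
A rough sketch of the "Apply-Write-Repr the original batch" idea follows: replay the split command's serialized write batch into a fresh engine batch, add the synthesized raft state for the right-hand range to the same batch, and commit both atomically so a crash can never persist the descriptor without a HardState. The signatures (ApplyBatchRepr, SynthesizeRaftState, replicaStateLoader) are assumptions about the code of that era, not the merged change (#20704).

// Sketch: commit the split's writes and the RHS raft state in one batch.
func applySplitWithSynthesizedRaftState(
	ctx context.Context,
	eng engine.Engine,
	writeBatchRepr []byte, // the raft command's serialized write batch
	rhsStateLoader replicaStateLoader, // state loader keyed to the new RHS range
) error {
	batch := eng.NewBatch()
	defer batch.Close()

	// Replay the original command's writes (range descriptor, etc.).
	if err := batch.ApplyBatchRepr(writeBatchRepr, false /* sync */); err != nil {
		return err
	}
	// Write the RHS HardState into the same batch so it commits with the
	// descriptor rather than in a separate splitPostApply write.
	if err := rhsStateLoader.SynthesizeRaftState(ctx, batch); err != nil {
		return err
	}
	return batch.Commit(true /* sync */)
}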

bdarnell added a commit to bdarnell/cockroach that referenced this issue Dec 13, 2017
Prior to this change, an ill-timed crash (between applying the raft
command and calling splitPostApply) would leave the replica in a
persistently broken state (no HardState).

Found via jepsen.

Fixes cockroachdb#20629
Fixes cockroachdb#20494

Release note (bugfix): Fixed a replica corruption that could occur if
a process crashed in the middle of a range split.
bdarnell added further commits to bdarnell/cockroach that referenced this issue on Dec 14 and Dec 20, 2017, with the same commit message.