
storage: panic in raft.campaign #20629

Closed
bdarnell opened this issue Dec 11, 2017 · 4 comments · Fixed by #20704

bdarnell (Contributor)

From #20494 (this has been seen in jepsen runs with multiple nemeses including strobe-skews and majority-ring):

W171209 08:51:50.342206 413 storage/replica_range_lease.go:418  can't determine lease status due to node liveness error: node not in the liveness table
W171209 08:51:50.518491 102 storage/store.go:1289  [n1,s1,r7/1:/Table/{SystemCon…-11}] could not gossip system config: [NotLeaseHolderError] r7: replica (n1,s1):1 not lease holder; lease holder unknown
E171209 08:51:51.089407 472 util/log/crash_reporting.go:119  [n1,s1,r31/1:/{Table/50-Max}] a panic has occurred!
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1261959]

goroutine 472 [running]:
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).Recover(0xc42076a2d0, 0x312fb00, 0xc425724380)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:181 +0x136
panic(0x1e890a0, 0x30e36f0)
	/usr/local/go/src/runtime/panic.go:491 +0x283
github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft.(*Progress).maybeUpdate(...)
	/go/src/github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft/progress.go:116
github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft.(*raft).appendEntry(0xc4207523c0, 0xc4256ee5a0, 0x1, 0x1)
	/go/src/github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft/raft.go:547 +0xd9
github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft.(*raft).becomeLeader(0xc4207523c0)
	/go/src/github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft/raft.go:648 +0x287
github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft.(*raft).campaign(0xc4207523c0, 0x20d5e4a, 0x10)
	/go/src/github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft/raft.go:671 +0x668
github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft.(*raft).Step(0xc4207523c0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/go/src/github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft/raft.go:783 +0x33a
github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft.(*RawNode).Campaign(0xc4256fc2c0, 0xc420496d80, 0x3)
	/go/src/github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft/rawnode.go:140 +0x71
github.com/cockroachdb/cockroach/pkg/storage.(*Replica).withRaftGroupLocked(0xc420161b00, 0x14fe9438e254d601, 0x21b2a38, 0x0, 0x200000002)
	/go/src/github.com/cockroachdb/cockroach/pkg/storage/replica.go:594 +0x38e
github.com/cockroachdb/cockroach/pkg/storage.(*Replica).maybeInitializeRaftGroup(0xc420161b00, 0x312fbc0, 0xc4256e93b0)
	/go/src/github.com/cockroachdb/cockroach/pkg/storage/replica.go:1849 +0x185
github.com/cockroachdb/cockroach/pkg/storage.(*Replica).Send(0xc420161b00, 0x312fbc0, 0xc4256e93b0, 0x14fe943c314d23c9, 0x0, 0x0, 0x0, 0x1f, 0x0, 0x0, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/storage/replica.go:1873 +0x136
github.com/cockroachdb/cockroach/pkg/storage.(*pendingLeaseRequest).requestLeaseAsync.func2(0x312fb00, 0xc425724380)
	/go/src/github.com/cockroachdb/cockroach/pkg/storage/replica_range_lease.go:274 +0x577
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask.func1(0xc42076a2d0, 0x312fb00, 0xc425724380, 0xc4207d0500, 0x35, 0x0, 0x0, 0xc4207b9b80)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:267 +0xe6
created by github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:262 +0x133
bdarnell self-assigned this Dec 11, 2017

bdarnell (Contributor Author)

This appears to be related to table creation: it happens more often in multi-table tests (bank-multitable, g2, sequential), although it has been seen in the single-table register test. Nemeses shouldn't matter here because it happens before the nemeses start. In each instance I've examined, it occurs on /{Table/50-Max}, the range created by the first userspace table. I've seen it on nodes 1, 2, and 3.

The stack trace indicates that at the time we initialize our raft group, raft believes the group has one member, and it's not us. This is presumably some sort of race between the table-creation splits and the initial upreplication.
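
For illustration, here is a minimal, self-contained sketch (not the vendored etcd/raft source) of the failure mode: when the local replica ID is missing from raft's progress map, the equivalent of appendEntry's getProgress(r.id).maybeUpdate(...) dereferences a nil pointer, which matches the panic above.

// Minimal sketch of the failure mode (not the vendored etcd/raft code).
package main

// progress stands in for raft.Progress in this sketch.
type progress struct{ Match uint64 }

func (pr *progress) maybeUpdate(n uint64) bool {
	// pr is nil when the lookup missed; reading pr.Match then panics with the
	// same "invalid memory address or nil pointer dereference" as above.
	updated := false
	if pr.Match < n {
		pr.Match = n
		updated = true
	}
	return updated
}

func main() {
	const localID = 2
	// Membership derived from an empty HardState/ConfState: the local replica
	// never makes it into the progress map.
	prs := map[uint64]*progress{}
	prs[localID].maybeUpdate(5) // panics here
}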

bdarnell (Contributor Author)

Ah, there may not be a nemesis, but there's this restart. So we're killing the cluster immediately after creating a database (which creates the split here). The "initiating a split at" message is seen in the logs right before node 1 is killed in some of the instances, but not all, because jepsen uses kill -9 and the logs don't always get flushed.

I'll trace through one of the panics:

We see node 2 applying many preemptive snapshots, suggesting that the initial splits all happened on node 1 before the cluster became replicated:

I171211 09:35:54.793338 234 storage/replica_raftstorage.go:733  [n2,s2,r17/?:{-}] applying preemptive snapshot at index 13 (id=63b62346, encoded size=1000, 1 rocksdb batches, 3 log entries)
I171211 09:35:54.800401 234 storage/replica_raftstorage.go:739  [n2,s2,r17/?:/{Table/20-Max}] applied preemptive snapshot in 7ms [clear=0ms batch=0ms entries=0ms commit=7ms]

As the initial splitting and rebalancing finishes, node 2 gains the lease of r17 (Table/20-Max), removes a replica on node 5, then initiates a split at Table/50. It is then killed (potentially losing log messages) and restarts:

I171211 09:35:57.578335 135 storage/replica_proposal.go:195  [n2,s2,r17/2:/{Table/20-Max}] new range lease repl=(n2,s2):2 start=1512984957.545605788,0 epo=1 pro=1512984957.545608462,0 following repl=(n1,s1):1 start=0.000000000,0 epo=1 pro=1512984956.536052108,0
I171211 09:35:58.069607 170 storage/replica_command.go:2153  [replicate,n2,s2,r17/2:/{Table/20-Max}] change replicas (REMOVE_REPLICA (n5,s5):4): read existing descriptor r17:/{Table/20-Max} [(n1,s1):1, (n2,s2):2, (n3,s3):3, (n5,s5):4, next=5]
I171211 09:35:58.127338 436 storage/replica.go:3161  [n2,s2,r17/2:/{Table/20-Max}] proposing REMOVE_REPLICA((n5,s5):4): updated=[(n1,s1):1 (n2,s2):2 (n3,s3):3] next=5
I171211 09:35:58.537156 169 storage/replica_command.go:1231  [split,n2,s2,r17/2:/{Table/20-Max}] initiating a split of this range at key /Table/50 [r31]
W171211 09:36:04.058098 1 cli/start.go:885  RUNNING IN INSECURE MODE!

Node 5 was successfully removed and nothing else happened in its logs; I don't think it is relevant to this story.

I171211 09:35:58.143754 352 storage/store.go:3287  [n5,s5,r17/4:/{Table/20-Max}] added to replica GC queue (peer suggestion)
I171211 09:35:58.160953 228 storage/store.go:2157  [replicaGC,n5,s5,r17/4:/{Table/20-Max}] removing replica
I171211 09:35:58.175600 228 storage/replica.go:805  [replicaGC,n5,s5,r17/4:/{Table/20-Max}] removed 12 (0+12) keys in 15ms [clear=0ms commit=15ms]

r17 and r31 should initially have n1, n2, and n3 as members. Panics later occur on n2 and n3. After the restart, n1 gets the lease for r17 (note that this is after the split finished: /Table/{20-50}).

I171211 09:36:09.215818 99 storage/replica_proposal.go:195  [n1,s1,r17/1:/Table/{20-50}] new range lease repl=(n1,s1):1 start=1512984969.146990212,0 epo=2 pro=1512984969.147027549,0 following repl=(n2,s2):2 start=1512984957.545605788,0 epo=1 pro=1512984957.545608462,0

This is the only mention of r17 or r31 on any node after the restart until the panics.

E171211 09:36:11.143317 410 util/log/crash_reporting.go:119  [n3,s3,r31/3:/{Table/50-Max}] a panic has occurred!
E171211 09:36:11.400620 410 util/log/crash_reporting.go:369  [n3,s3,r31/3:/{Table/50-Max}] Reported as error 20c8f50aeb214fc58005c1afdfb983e0
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x12635f9]

goroutine 410 [running]:
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).Recover(0xc42079e3f0, 0x313aba0, 0xc425cce640)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:181 +0x136
panic(0x1e90d00, 0x30ee6f0)
	/usr/local/go/src/runtime/panic.go:491 +0x283
github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft.(*Progress).maybeUpdate(...)
	/go/src/github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft/progress.go:116
github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft.(*raft).appendEntry(0xc42009d950, 0xc425cd13b0, 0x1, 0x1)
	/go/src/github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft/raft.go:547 +0xd9
github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft.(*raft).becomeLeader(0xc42009d950)
	/go/src/github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft/raft.go:648 +0x287
github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft.(*raft).campaign(0xc42009d950, 0x20de0b2, 0x10)
	/go/src/github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft/raft.go:671 +0x668
github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft.(*raft).Step(0xc42009d950, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/go/src/github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft/raft.go:783 +0x33a
github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft.(*RawNode).Campaign(0xc425cce6c0, 0xc4204525d0, 0x3)
	/go/src/github.com/cockroachdb/cockroach/vendor/github.com/coreos/etcd/raft/rawnode.go:140 +0x71
github.com/cockroachdb/cockroach/pkg/storage.(*Replica).withRaftGroupLocked(0xc4207e0300, 0x14ff33cd8103ea01, 0x21baf10, 0x0, 0x200000002)
	/go/src/github.com/cockroachdb/cockroach/pkg/storage/replica.go:594 +0x38e
github.com/cockroachdb/cockroach/pkg/storage.(*Replica).maybeInitializeRaftGroup(0xc4207e0300, 0x313ac60, 0xc425c75500)
	/go/src/github.com/cockroachdb/cockroach/pkg/storage/replica.go:1849 +0x185
github.com/cockroachdb/cockroach/pkg/storage.(*Replica).Send(0xc4207e0300, 0x313ac60, 0xc425c75500, 0x14ff33d0ab7e00bd, 0x0, 0x0, 0x0, 0x1f, 0x0, 0x0, ...)
	/go/src/github.com/cockroachdb/cockroach/pkg/storage/replica.go:1873 +0x136
github.com/cockroachdb/cockroach/pkg/storage.(*pendingLeaseRequest).requestLeaseAsync.func2(0x313aba0, 0xc425cce640)
	/go/src/github.com/cockroachdb/cockroach/pkg/storage/replica_range_lease.go:274 +0x577
github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask.func1(0xc42079e3f0, 0x313aba0, 0xc425cce640, 0xc425cd40c0, 0x35, 0x0, 0x0, 0xc424fb9280)
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:267 +0xe6
created by github.com/cockroachdb/cockroach/pkg/util/stop.(*Stopper).RunAsyncTask
	/go/src/github.com/cockroachdb/cockroach/pkg/util/stop/stopper.go:262 +0x133

bdarnell (Contributor Author)

Found it! replicaRaftStorage.InitialState assumes that ranges without a persisted HardState have no members.

// For uninitialized ranges, membership is unknown at this point.
if raft.IsEmptyHardState(hs) || err != nil {
	return raftpb.HardState{}, raftpb.ConfState{}, err
}

But after a split, the HardState is written separately from the main batch that writes the range descriptor, in splitPostApply:

if err := rsl.SynthesizeRaftState(ctx, r.store.Engine()); err != nil {

We need to either move the call to SynthesizeRaftState from splitPostApply to somewhere it can be included in the main command batch (we used to do it at evaluation time, but that's not safe with propEvalKV; maybe we can put it in applyRaftCommand?), or synthesize appropriate defaults when no HardState has been written (we used to do that, but changed to always write a HardState; cc @tschottdorf).
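
For concreteness, here is a rough sketch of the second option (synthesizing defaults), under the assumption that InitialState can fall back to the persisted range descriptor when the HardState is empty. The names used here (loadHardState, r.mu.state.Desc, IsInitialized) are guesses at the code of that era; this is not the fix that actually landed.

// Hypothetical variant of replicaRaftStorage.InitialState: if the HardState is
// missing but the descriptor is already initialized (the post-split crash
// window), derive the ConfState from the descriptor instead of reporting an
// empty membership.
func (r *replicaRaftStorage) InitialState() (raftpb.HardState, raftpb.ConfState, error) {
	ctx := r.AnnotateCtx(context.TODO())
	hs, err := r.mu.stateLoader.loadHardState(ctx, r.store.Engine())
	if err != nil {
		return raftpb.HardState{}, raftpb.ConfState{}, err
	}
	desc := r.mu.state.Desc
	if raft.IsEmptyHardState(hs) && !desc.IsInitialized() {
		// Truly uninitialized range: membership is unknown at this point.
		return raftpb.HardState{}, raftpb.ConfState{}, nil
	}
	// Either a HardState exists, or the descriptor still tells us who the
	// members are even though the HardState write was lost in the crash.
	var cs raftpb.ConfState
	for _, rep := range desc.Replicas {
		cs.Nodes = append(cs.Nodes, uint64(rep.ReplicaID))
	}
	return hs, cs, nil
}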

tbg (Member) commented Dec 11, 2017

Good find! I think it should be straightforward to make the synthesis a pre-apply trigger similar to the one for SSTable ingestion. applyRaftCommand would have to take two write batches (or you Apply-Write-Repr the original batch, which may be just fine since it's only on splits).
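
A rough sketch of the "Apply-Write-Repr the original batch" idea follows: replay the split command's serialized write batch into a fresh engine batch, add the synthesized raft state for the right-hand range to the same batch, and commit both atomically so a crash can never persist the descriptor without a HardState. The signatures (ApplyBatchRepr, SynthesizeRaftState, replicaStateLoader) are assumptions about the code of that era, not the merged change (#20704).

// Sketch: commit the split's writes and the RHS raft state in one batch.
func applySplitWithSynthesizedRaftState(
	ctx context.Context,
	eng engine.Engine,
	writeBatchRepr []byte, // the raft command's serialized write batch
	rhsStateLoader replicaStateLoader, // state loader keyed to the new RHS range
) error {
	batch := eng.NewBatch()
	defer batch.Close()

	// Replay the original command's writes (range descriptor, etc.).
	if err := batch.ApplyBatchRepr(writeBatchRepr, false /* sync */); err != nil {
		return err
	}
	// Write the RHS HardState into the same batch so it commits with the
	// descriptor rather than in a separate splitPostApply write.
	if err := rhsStateLoader.SynthesizeRaftState(ctx, batch); err != nil {
		return err
	}
	return batch.Commit(true /* sync */)
}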

bdarnell added a commit to bdarnell/cockroach that referenced this issue Dec 13, 2017
Prior to this change, an ill-timed crash (between applying the raft
command and calling splitPostApply) would leave the replica in a
persistently broken state (no HardState).

Found via jepsen.

Fixes cockroachdb#20629
Fixes cockroachdb#20494

Release note (bugfix): Fixed a replica corruption that could occur if
a process crashed in the middle of a range split.
bdarnell added further commits to bdarnell/cockroach that referenced this issue on Dec 14 and Dec 20, 2017, with the same commit message.