From 1b4484361ee3d172e52e0e142ade4d1b8297eed1 Mon Sep 17 00:00:00 2001
From: Tobias Schottdorf
Date: Fri, 20 Oct 2017 14:50:20 +0200
Subject: [PATCH] storage: de-flake TestRefreshPendingCommands

The test ran a for loop without preemption points. The loop checked a
condition that would only become true after another goroutine had been
scheduled and carried out its job. If, with only a few cores (four in my
case), GC kicked in before that other goroutine got scheduled, the loop
would just run hot forever until the test timed out, and the resulting
stack dump looked quite unhelpful.

Add a small sleep so the runtime can preempt the goroutine.

The issue was harder to run into when stressing only the test, since
there was less garbage available at that point. Adding some print
statements, I accidentally made it much more likely.

Previously the test flaked (got stuck) within <500 iterations; it now ran
past 1.5k without problems.

Fixes #19397.
Fixes #19388.
Touches #19367.
Fixes #18554.
---
 pkg/storage/client_raft_test.go | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/pkg/storage/client_raft_test.go b/pkg/storage/client_raft_test.go
index f3ec7c2f766e..b3b233facf78 100644
--- a/pkg/storage/client_raft_test.go
+++ b/pkg/storage/client_raft_test.go
@@ -1090,6 +1090,14 @@ func TestRefreshPendingCommands(t *testing.T) {
 		for i := 0; i < 2; i++ {
 			draining = draining && mtc.stores[i].IsDraining()
 		}
+		// If this goroutine can't be preempted, and we don't have other threads
+		// available (for example since you only have four and the other three
+		// are in GC, or you even only have one to begin with!), it can end up
+		// spinning forever. In the case that prompted adding this, there are
+		// four cores and "gc assist" got us stuck.
+		//
+		// See #18554.
+		time.Sleep(time.Nanosecond)
 	}
 	mtc.advanceClock(context.Background())
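
For context, below is a minimal, self-contained Go sketch of the failure mode
the commit message describes. It is not part of the patch: the flag name
"ready" and the timings are invented for illustration. One goroutine
busy-waits on a condition that another goroutine will eventually satisfy;
the nanosecond sleep, as in the diff above, gives the runtime (which, at the
time of this patch, only preempted goroutines cooperatively) a point at which
it can reschedule the spinning goroutine.

    // Illustrative sketch only; not code from the CockroachDB test.
    package main

    import (
        "fmt"
        "sync/atomic"
        "time"
    )

    func main() {
        var ready int32

        // Another goroutine eventually satisfies the condition the loop polls,
        // playing the role of the goroutine the test was waiting on.
        go func() {
            time.Sleep(10 * time.Millisecond)
            atomic.StoreInt32(&ready, 1)
        }()

        for atomic.LoadInt32(&ready) == 0 {
            // Without a yield point, a tight loop like this could spin without
            // ever being preempted while GC ("gc assist") needed it to stop,
            // starving the goroutine that would make the condition true.
            // The tiny sleep is the same trick the patch uses: it gives the
            // scheduler a chance to run something else.
            time.Sleep(time.Nanosecond)
        }
        fmt.Println("condition observed")
    }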