kvserver,rpc: set the stage for maintaining a local blocklist of permanently removed nodes #54936
Conversation
I just reviewed r1, r2, r3, r6, r12, r13, r14. I'll let irfan review the rest.
In r13-r14 I recommend extending the commit message (and/or code comments) to remind the reader that Ping is used prior to actually using the RPC connection for inter-node traffic, so that its logic effectively gates the use of a connection. This wasn't clear at first.
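To illustrate why a failed Ping gates the connection, here is a minimal, self-contained sketch (all names here are invented for illustration and are not the actual pkg/rpc code): if the pre-send hook rejects the request, the heartbeat fails, the connection is never marked healthy, and no inter-node traffic flows over it.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the rpc package's types; the names are
// assumptions made for this illustration only.
type PingRequest struct{ OriginNodeID int }

type heartbeatClient interface {
	Ping(ctx context.Context, req *PingRequest) error
}

// heartbeatOnce shows how an OnSendPing-style interceptor gates the
// connection: if the interceptor rejects the request, the heartbeat fails
// and the connection is never handed out for inter-node traffic.
func heartbeatOnce(
	ctx context.Context, c heartbeatClient, req *PingRequest, onSendPing func(*PingRequest) error,
) error {
	if err := onSendPing(req); err != nil {
		return err
	}
	return c.Ping(ctx, req)
}

type okClient struct{}

func (okClient) Ping(context.Context, *PingRequest) error { return nil }

func main() {
	// Pretend n7 has been decommissioned; refuse to even attempt the ping.
	blockDecommissioned := func(req *PingRequest) error {
		if req.OriginNodeID == 7 {
			return errors.New("n7 is decommissioned; refusing to heartbeat")
		}
		return nil
	}
	err := heartbeatOnce(context.Background(), okClient{}, &PingRequest{OriginNodeID: 7}, blockDecommissioned)
	fmt.Println(err) // n7 is decommissioned; refusing to heartbeat
}
```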
Reviewed 1 of 1 files at r1, 3 of 3 files at r2, 2 of 2 files at r3, 2 of 4 files at r4, 2 of 2 files at r6, 6 of 11 files at r7, 5 of 5 files at r12, 3 of 3 files at r13, 4 of 4 files at r14.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif, @knz, and @tbg)
pkg/cli/node.go, line 341 at r6 (raw file):
    c := serverpb.NewAdminClient(conn)
    if err := runDecommissionNodeImpl(ctx, c, nodeCtx.nodeDecommissionWait, nodeIDs); err != nil {
        cause := errors.Cause(err)

`UnwrapAll` is clearer IMHO. `Cause` is only provided for compatibility with pkg/errors.
pkg/cli/node.go, line 583 at r6 (raw file):
    // ValidateLivenessTransition in kvserverpb/liveness.go for where this
    // error is generated.
    if s, ok := status.FromError(err); ok && s.Code() == codes.FailedPrecondition {

I'd recommend extracting `cause := errors.UnwrapAll(err)` for consistency with the other case. Maybe even worth introducing a `FromError` wrapper in our own `grpcutil` package and having that unwrap in all cases.
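A minimal sketch of the wrapper being suggested, assuming it would live in a `grpcutil` package as proposed (this is not the API that was actually added):

```go
package grpcutil

import (
	"github.com/cockroachdb/errors"
	"google.golang.org/grpc/status"
)

// FromError behaves like status.FromError, but first strips every wrapping
// layer from the error chain, so a gRPC status buried under
// cockroachdb/errors wrappers is still recovered.
func FromError(err error) (*status.Status, bool) {
	return status.FromError(errors.UnwrapAll(err))
}
```

Callers would then write `if s, ok := grpcutil.FromError(err); ok && s.Code() == codes.FailedPrecondition { ... }` without having to remember to unwrap first.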
pkg/kv/kvserver/node_liveness.go, line 278 at r2 (raw file):
    // best effort attempt at marking the node as draining. It can return
    // without an error after trying unsuccessfully a few times. Is that what we
    // want?

As the last person interested in this function, I didn't appreciate that DefaultRetryOptions had a finite timeout. With a retry-forever, this code should be fine?
If it does, then I agree there's a bug.
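For reference, a retry-forever loop in plain Go looks roughly like the following. This is a generic sketch, not the crdb retry package, and it assumes the desired behavior is to never give up until the context is canceled.

```go
package livenesssketch

import (
	"context"
	"time"
)

// retryForever keeps attempting fn with capped exponential backoff until it
// succeeds or the context is canceled, instead of giving up after a finite
// number of attempts.
func retryForever(ctx context.Context, fn func(context.Context) error) error {
	backoff := 100 * time.Millisecond
	const maxBackoff = 2 * time.Second
	for {
		if err := fn(ctx); err == nil {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(backoff):
		}
		backoff *= 2
		if backoff > maxBackoff {
			backoff = maxBackoff
		}
	}
}
```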
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif and @knz)
pkg/cli/node.go, line 341 at r6 (raw file):
Previously, knz (kena) wrote…
`UnwrapAll` is clearer IMHO. `Cause` is only provided for compatibility with pkg/errors.

Done in #54544. I keep getting lost in the very broad API net we have in cockroachdb/errors because of these compatibility constraints; I need to sit down and go through all of it once and for all.
pkg/cli/node.go, line 583 at r6 (raw file):
Previously, knz (kena) wrote…
I'd recommend extracting `cause := errors.UnwrapAll(err)` for consistency with the other case. Maybe even worth introducing a `FromError` wrapper in our own `grpcutil` package and having that unwrap in all cases.

Done in #54544. Ditto the comment above: lately I've found it faster to just throw some random error-handling variant from cockroachdb/errors at a problem and rely on the reviewers to teach me where I'm wrong (so I'm clearly pretty confused by it all).
pkg/kv/kvserver/node_liveness.go, line 278 at r2 (raw file):
Previously, knz (kena) wrote…
As the last person interested in this function, I didn't appreciate that DefaultRetryOptions had a finite timeout. With a retry-forever, this code should be fine?
If it does, then I agree there's a bug.
This all LGTM.
I did manage to confuse myself a bit though: contrasting this with what we were thinking earlier, i.e. persisting a "prevent startup file" (and with `crdb node decommission --force`) like we did in #54373, I realize the guarantees provided by that approach are a bit stronger (unless I'm missing something). Because there we persist the DECOMMISSIONED.txt file remotely first, before updating the liveness record, we could have the target node (the one being decommissioned) intentionally stop sending out PingRequests as part of that FinalizeDecommission API. I think it would have the same effect as this one in that all live connections are abandoned (except of course it would no longer be best effort). The target node would also be unable to restart because of that file, so that problem too goes away.
As for when the target node is permanently downed (and `--force` is specified), I think we'd need the safeguards we're introducing in this PR (stopping outgoing pings to the decommissioned nodes, which we might as well do all the time).
So I guess my question is, do we still need something like #54373? These failure modes do seem exceedingly rare, so I'm unsure.
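To make the ordering being compared concrete, here is a sketch under invented interfaces (nothing below is the actual API; it only spells out the sequence described above): persist the marker on the target first, and only then flip the liveness record.

```go
package decomsketch

import "context"

// These interfaces are invented for this sketch; they exist only to make the
// ordering discussed above concrete.
type nodeHandle interface {
	NodeID() int32
	// PersistDecommissionMarker asks the target node to write its local
	// DECOMMISSIONED marker and stop sending PingRequests.
	PersistDecommissionMarker(ctx context.Context) error
}

type livenessWriter interface {
	SetDecommissioned(ctx context.Context, nodeID int32) error
}

// finalizeDecommission sketches the stronger ordering: the remote marker is
// persisted first, and only then is the liveness record updated. If the
// target is permanently down (the --force case), step 1 fails and the
// cluster-side ping checks introduced in this PR are the remaining safeguard.
func finalizeDecommission(ctx context.Context, target nodeHandle, liveness livenessWriter) error {
	if err := target.PersistDecommissionMarker(ctx); err != nil {
		// Best effort: the target may be gone for good; the RPC-layer
		// blocklist still prevents it from rejoining later.
	}
	return liveness.SetDecommissioned(ctx, target.NodeID())
}
```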
pkg/kv/kvserver/node_liveness.go
Outdated
// Note that there is yet another struct, NodeLivenessStartOptions, which
// is supplied when the instance is started. This is necessary as during
// server startup, some inputs can only be constructed at Start time. The
// separation has grown organically and various option could in principle
s/option/options
type NodeLivenessStartOptions struct {
	Stopper *stop.Stopper
	Engines []storage.Engine
	// OnSelfLive is invoked after every successful heartbeat
Could move this comment to NodeLiveness.mu.onSelfLive, probably the first place to look.
pkg/kv/kvserver/node_liveness.go
Outdated
@@ -1290,6 +1305,12 @@ func (nl *NodeLiveness) maybeUpdate(ctx context.Context, newLivenessRec Liveness
			fn(newLivenessRec.Liveness)
		}
	}
	if newLivenessRec.Membership == kvserverpb.MembershipStatus_DECOMMISSIONED {
We have a helper, `newLivenessRec.Membership.Decommissioned()`.
pkg/rpc/context.go
Outdated
// NB: We want the request to fail-fast (the default), otherwise we won't
// be notified of transport failures.
response, err = heartbeatClient.Ping(goCtx, request)
err := interceptor(request)
[nit]
if err := interceptor(req); err != nil {
	return err
}
// ...
// NB: OnSendPing and OnHandlePing default to noops.
// This is used both for testing and the cli.
_, _ = c.OnSendPing, c.OnHandlePing
Could we just set them to noops here, during validation/setting defaults, instead of down below?
(force-pushed from 782881c to 3ea3038)
Dismissed @knz from 3 discussions.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif and @knz)
pkg/kv/kvserver/node_liveness.go, line 243 at r10 (raw file):
Previously, irfansharif (irfan sharif) wrote…
s/option/options
Done.
pkg/kv/kvserver/node_liveness.go, line 633 at r10 (raw file):
Previously, irfansharif (irfan sharif) wrote…
Could move this comment to NodeLiveness.mu.onSelfLive, probably the first place to look.
I referenced this comment from there.
pkg/kv/kvserver/node_liveness.go, line 1308 at r11 (raw file):
Previously, irfansharif (irfan sharif) wrote…
We have a helper, `newLivenessRec.Membership.Decommissioned()`.
Done.
pkg/rpc/context.go, line 411 at r14 (raw file):
Previously, irfansharif (irfan sharif) wrote…
Could we just set them to noops here, during validation/setting defaults, instead of down below?
We could, but I find it easier to discover the defaults when they're applied at the single site of use. It's always kind of tedious to have to go chase down where the default is applied. It's mostly potato/potahto here, but I'll leave as is.
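As a tiny illustration of the pattern being described here (names invented for the sketch; this is not the actual rpc.Context code), the default is applied right where the callback is invoked:

```go
package rpcdefaults

// PingRequest and options are stand-ins invented for this sketch.
type PingRequest struct{}

type options struct {
	OnSendPing func(*PingRequest) error // may be left nil by the caller
}

// sendPing applies the noop default at the single site of use, so a reader
// never has to chase down where defaults were filled in.
func sendPing(opts options, req *PingRequest, send func(*PingRequest) error) error {
	onSend := opts.OnSendPing
	if onSend == nil {
		onSend = func(*PingRequest) error { return nil } // noop default
	}
	if err := onSend(req); err != nil {
		return err
	}
	return send(req)
}
```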
pkg/rpc/context.go, line 1150 at r14 (raw file):
Previously, irfansharif (irfan sharif) wrote…
[nit]
if err := interceptor(req); err != nil {
	return err
}
// ...
Done.
We can extend the approach taken here to also write the prevented startup file and terminate the process when the RPC layer gets shut off from the cluster. However, I think we need to talk about the UX before we decide whether that's useful.
Users are generally expected to run their own supervisor, so they should be in charge of terminating the node. If we "disconnect" the node but the operator doesn't shut it down, it will definitely languish (i.e. SQL connections will hang, etc.). I think that is "ok" for some value of "ok" - the operator supposedly took the node out of SQL load balancing before they initiated the decommissioning process.
We may want to consider a flag to `node decommission` in which the replicas are moved off, but we don't set the DECOMMISSIONED status yet. The idea would be that some users will prefer to move data off first, then shut down the node, and then set the DECOMMISSIONED status.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif and @knz)
Of course there would be another snag! Interestingly, with a connection to #39415! In the "kvserver: add Nodeliveness.StartOptions.OnNodeDecommissionedCallback" commit, I am moving where node liveness' gossip callback gets registered. It makes sense that this move would cause problems: before the gossip callback is registered, node liveness basically does not consume any updates, and consequently the KV client is severely restricted, as none of the replicas it's reaching out to seem live. We were basically relying, silently, on the fact that gossip starts filling up NodeLiveness the moment the former (but not the latter) is started. I'll have to think about what we'll want to do here. The main reason I had to put things into …
We were jumping through a number of hoops to create the engines only in `(*Server).Start` since that seems to be the "idiomatic" place to start moving parts. However, it creates a lot of complexity since various callbacks have to be registered with access to engines. Move engine creation to `NewServer`. This unblocks cockroachdb#54936. Release note: None
The signature of NewNodeLiveness is now a single options struct. Release note: None
The bulk of the options for node liveness will shift to the Start method which was renamed to make this a more idiomatic pattern. We want to pass the options at Start time because by that time the caller is usually fully set up to provide its callbacks, etc. At construction time, it is early in the server lifecycle and in particular the `storage.Engine`s are not set up yet. But we want to use them in an upcoming callback, which will be possible once the callbacks are added at Start() time. Release note: None
We'll want to pass more parameters to `Start()`, so pull them out into an options struct similar to how we did it for the constructor a few commits prior. Release note: None
(force-pushed from 3ea3038 to 6565602)
This callback is invoked whenever a node is permanently marked as removed from the cluster. We use it in `Server.Start`, though only to call the optionally provided testing knob to establish a unit test. The plan is to hook this up to a component that keeps track of the decommissioned nodes across restarts, to provide best-effort (but tight) guarantees that these nodes won't be able to reconnect to the cluster. This is required to uphold the invariants for long-running migrations. Release note: None
Make it obvious which ones apply to the origin (the maker of the PingRequest) and which ones to the destination (the receiver of PingRequest). We are going to add `origin_node_id` here separately later. Release note: None
Upcoming commits will augment Ping() to check the sender's node ID against a blocklist of decommissioned nodes. Release note: None
These two methods are invoked before sending and while handling PingRequests. They'll be used to prevent decommissioned nodes from communicating with the cluster. Release note: None
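To make the mechanism concrete, here is a small, self-contained sketch of what such a handler-side hook could look like. All types and names below are simplified assumptions for illustration; the real hooks live in pkg/rpc and differ in detail.

```go
package pingcheck

import (
	"fmt"
	"sync"
)

// PingRequest is an assumed, simplified stand-in for the rpc.PingRequest.
type PingRequest struct{ OriginNodeID int32 }

// decommissionedList is a simple thread-safe set of node IDs known to be
// permanently removed.
type decommissionedList struct {
	mu  sync.Mutex
	ids map[int32]struct{}
}

func (l *decommissionedList) Add(id int32) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.ids == nil {
		l.ids = map[int32]struct{}{}
	}
	l.ids[id] = struct{}{}
}

func (l *decommissionedList) Contains(id int32) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	_, ok := l.ids[id]
	return ok
}

// onHandlePing is the kind of hook the commit describes: it runs while a
// PingRequest is being handled and can reject the heartbeat outright, which
// keeps a decommissioned node from ever establishing a usable connection.
func (l *decommissionedList) onHandlePing(req *PingRequest) error {
	if l.Contains(req.OriginNodeID) {
		return fmt.Errorf("n%d is decommissioned; rejecting heartbeat", req.OriginNodeID)
	}
	return nil
}
```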
(force-pushed from 6565602 to 8552175)
I prepended #55277 here and reworked …
55062: kvserver: remove unused parameter r=TheSamHuang a=tbg

Release note: None

55214: sql: fix performance regression due to increased hash allocations r=nvanbenschoten,knz a=arulajmani

We recently increased allocations in the `appStats` structure when adding support for transaction level statistics in #52704. This was because the interactions with the `fnv` library were expensive in terms of allocations. This patch aims to claw back the regression by:
- Using our own implementation of the FNV algorithm instead of the library, which is significantly lighter weight (microbenchmarks below).
- Re-organizing the code to only construct the statement IDs (deemed the expensive operation) if required.

When comparing the difference between the commit that introduced the regression and the changes proposed by this diff, I got the following improvements on the KV workload:

```
name             old ops/s   new ops/s   delta
kv95-throughput  34.5k ± 6%  35.7k ± 4%  +3.42% (p=0.023 n=10+10)
```

Microbenchmarks for the new hashing algorithm (written/run by @knz):

```
name                     old time/op    new time/op    delta
ConstructStatementID-32    405ns ±17%     39ns ±12%   -90.34% (p=0.008 n=5+5)

name                     old alloc/op   new alloc/op   delta
ConstructStatementID-32     120B ± 0%      16B ± 0%   -86.67% (p=0.008 n=5+5)

name                     old allocs/op  new allocs/op  delta
ConstructStatementID-32     6.00 ± 0%      1.00 ± 0%   -83.33% (p=0.008 n=5+5)
```

Closes #54515

Release note: None

55277: server: create engines in NewServer r=irfansharif a=tbg

We were jumping through a number of hoops to create the engines only in `(*Server).Start` since that seems to be the "idiomatic" place to start moving parts. However, it creates a lot of complexity since various callbacks have to be registered with access to engines. Move engine creation to `NewServer`. This unblocks #54936.

Release note: None

55280: roachtest: recognize GEOS dlls on more platforms r=otan a=knz

This makes roachtest work on the BSDs again.

Release note: None

Co-authored-by: Tobias Grieger <[email protected]>
Co-authored-by: arulajmani <[email protected]>
Co-authored-by: David Hartunian <[email protected]>
Co-authored-by: Raphael 'kena' Poss <[email protected]>
We were jumping through a number of hoops to create the engines only in `(*Server).Start` since that seems to be the "idiomatic" place to start moving parts. However, it creates a lot of complexity since various callbacks have to be registered with access to engines. Move engine creation to `NewServer`. This unblocks #54936. Release note: None
Friendly ping @irfansharif
I'll bookmark this PR for if/when we revisit decommissioning UX.
We may want to consider a flag to `node decommission` in which the replicas are moved off, but we don't set the DECOMMISSIONED status yet. The idea would be that some users will prefer to move data off first, then shut down the node, and then set the DECOMMISSIONED status.
I also like this idea; it removes another conceptual hurdle when having to explain the difference between a decommissioned node and a dead one (and all our questions around decommissioning nodes in absentia go away).
Reviewed 15 of 15 files at r15, 5 of 5 files at r16, 4 of 4 files at r17, 1 of 1 files at r18, 4 of 4 files at r19, 5 of 5 files at r20, 3 of 3 files at r21, 4 of 4 files at r22.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @tbg)
pkg/rpc/context.go, line 383 at r22 (raw file):
//
// It can inject an error.
OnHandlePing func(*PingRequest) error
[nit] Minor preference to name these On{Outgoing,Incoming}Ping instead.
pkg/rpc/heartbeat.go, line 56 at r22 (raw file):
disableClusterNameVerification bool
onHandlePing func(*PingRequest) error // see ContextOptions.OnHandlePing
Ditto.
TFTR! I'll make the rename separately.

bors r=irfansharif
Build succeeded: |
55286: server,rpc: block decommissioned node from interacting with the cluster r=irfansharif a=tbg

This PR completes the work started in #54936 by introducing a component that keeps track of decommissioned nodes and hooks it up to the RPC layer, where it is used to prevent RPC-layer heartbeats between the cluster and decommissioned nodes. Since these heartbeats are a prerequisite for opening a connection in either direction, by failing these heartbeats we are effectively preventing decommissioned nodes from interacting with the cluster.

First commits from #54936. New here:
- keys: introduce NodeTombstoneKey
- server: introduce node tombstone storage

Release note: None

Co-authored-by: Tobias Grieger <[email protected]>
As part of long-running migrations (#54903) we need to make sure, to the best of our abilities, that decommissioned nodes cannot return to the cluster. We will achieve this by persisting this knowledge locally (in an eventually consistent way, via gossip) and adding checks in the RPC layer.
This PR stops short of adding the actual storage but sets the stage for doing so.
First seven commits are from another PR and should be ignored here.
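A rough, self-contained sketch of the kind of component the description alludes to (all names below are assumed; the actual storage lands in a follow-up PR): a locally persisted set of removed node IDs, fed by the decommission callback and consulted by the RPC layer.

```go
package tombstonesketch

import "sync"

// nodeTombstones is an assumed, simplified stand-in for the component the
// description talks about: it remembers decommissioned node IDs locally so
// that, even across restarts, the RPC layer can keep refusing their
// heartbeats.
type nodeTombstones struct {
	mu      sync.Mutex
	removed map[int32]struct{}
	persist func(id int32) error // e.g. write a local key; best effort
}

// onNodeDecommissioned is the kind of callback the liveness layer would
// invoke (driven by gossip updates) when a node is permanently removed.
func (t *nodeTombstones) onNodeDecommissioned(id int32) {
	t.mu.Lock()
	if t.removed == nil {
		t.removed = map[int32]struct{}{}
	}
	t.removed[id] = struct{}{}
	t.mu.Unlock()
	if t.persist != nil {
		// Eventually consistent: losing this write only weakens the
		// check, it does not break correctness elsewhere.
		_ = t.persist(id)
	}
}

// isDecommissioned is what an RPC-layer ping handler would consult before
// accepting a heartbeat.
func (t *nodeTombstones) isDecommissioned(id int32) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	_, ok := t.removed[id]
	return ok
}
```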