Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add expected RangeDescriptor to AdminChangeReplicas and adopt #36244

Merged

Conversation

ajwerner
Copy link
Contributor

This PR comes in two commits. The first updates the roachpb api and the second adopts the update. This is the first of the PRs which will address #36025. See the commit messages for more details.

@ajwerner ajwerner requested a review from a team March 27, 2019 20:57
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@ajwerner ajwerner force-pushed the ajwerner/admin-change-replicas-exp-desc branch 5 times, most recently from 98b0202 to 800146f Compare March 28, 2019 14:34
@ajwerner ajwerner requested a review from nvanbenschoten March 29, 2019 14:21
Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r1, 11 of 11 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)


pkg/internal/client/db.go, line 509 at r2 (raw file):

		return nil, err
	}
	responses := b.RawResponse().Responses

nit: I would change this to be b.response.Responses, which is what Batch.fillResults uses. This is all a little bit of an abuse because we should be generating this response in Batch.fillResults instead given that this is not a "raw" request.

Another option would be to add another response variant to Result. I'm not sure what type you'd give it (other interface{}?), but would fit in with the current approach of some responses populating Result.Rows and others populating Result.Keys. Then you could populate this in Batch.fillResults and pull the result in db.AdminChangeReplicas from getOneResult.


pkg/roachpb/api.proto, line 688 at r2 (raw file):

  RequestHeader header = 1 [(gogoproto.nullable) = false, (gogoproto.embed) = true];
  ReplicaChangeType change_type = 2;
  repeated ReplicationTarget targets = 3 [(gogoproto.nullable) = false];

We should explain the semantics of this field. Critically, mention that the sequence of changes will can be partially applied if it races with other replica movement, but will always be applied in order.

You might also consider adding a test similar to TestConcurrentAdminChangeReplicasRequests that constructs a sequence of replication targets such that there are multiple valid histories for two racing AdminChangeReplicasRequests. Something like:

origDesc = 1
actor 1: expDesc: 1, targets: [2, 3, 4]
actor 2: expDesc: 2, targets: [3, 2, 1]

You'd then assert that the final target is either 1 or 4.


pkg/storage/client_replica_test.go, line 2020 at r2 (raw file):

		if err == nil {
			numSuccesses++
		}

Let's assert that the only error we get is what we expect when the expected desc doesn't match.


pkg/storage/replica.go, line 1223 at r2 (raw file):

		}
		for _, target := range tArgs.Targets {
			expDesc, err = r.ChangeReplicas(

Could you add a comment about why we set expDesc to the result of each previous step?


pkg/storage/replica_command.go, line 1092 at r2 (raw file):

	}

	// After each call to the

This needs some love.


pkg/storage/replica_command.go, line 1116 at r2 (raw file):

	// updateRangeDesc updates the passed RangeDescriptor following the successful
	// completion of an AdminChangeReplicasRequest with	the single provided target

Strange tab here.

@ajwerner ajwerner force-pushed the ajwerner/admin-change-replicas-exp-desc branch from 800146f to c9fa75d Compare March 29, 2019 20:48
Copy link
Contributor Author

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @nvanbenschoten)


pkg/internal/client/db.go, line 509 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: I would change this to be b.response.Responses, which is what Batch.fillResults uses. This is all a little bit of an abuse because we should be generating this response in Batch.fillResults instead given that this is not a "raw" request.

Another option would be to add another response variant to Result. I'm not sure what type you'd give it (other interface{}?), but would fit in with the current approach of some responses populating Result.Rows and others populating Result.Keys. Then you could populate this in Batch.fillResults and pull the result in db.AdminChangeReplicas from getOneResult.

I went with the first suggestion. Muddying up Results further didn't feel clean .


pkg/roachpb/api.proto, line 688 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

We should explain the semantics of this field. Critically, mention that the sequence of changes will can be partially applied if it races with other replica movement, but will always be applied in order.

You might also consider adding a test similar to TestConcurrentAdminChangeReplicasRequests that constructs a sequence of replication targets such that there are multiple valid histories for two racing AdminChangeReplicasRequests. Something like:

origDesc = 1
actor 1: expDesc: 1, target: [2, 3, 4]
actor 2: expDesc: 2, target: [3, 2, 1]

You'd then assert that the final target is either 1 or 4.

I added this but it needs to be

origDesc = 1
actor 1: expDesc: 1, target: [2, 3, 4]
actor 2: expDesc: 2, target: [3, 5]

Because it's an error to add an existing entry.


pkg/storage/client_replica_test.go, line 2020 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Let's assert that the only error we get is what we expect when the expected desc doesn't match.

Done. It also turns out this test was too pessimistic because it was written when I had the racing goroutines each do a read before they did their write but in that scenario multiple can succeed. It'd be fancy if I built some code to verify that valid history occurred. For now I made it such that it guarantees that only one succeeds and that it succeeds completely.

I also added the test you suggested above.


pkg/storage/replica_command.go, line 1092 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

This needs some love.

Actually just stale, removed.

…onse

This commit adds a new field, ExpDesc, to AdminChageReplicasRequest which is
the client-provided expectation for the current state of the range descriptor.
In the subsequent commit logic will be added to enforce that if the state of
the range descriptor does not match the expectation then the call will fail.

The commit also adds a new response field which will be populated with the value
of the RangeDescriptor upon successful completion of the request.

Both of these fields are nullable and optional for the 19.1 release but will
become required for 19.2.
@ajwerner ajwerner force-pushed the ajwerner/admin-change-replicas-exp-desc branch from c9fa75d to d10bc53 Compare March 29, 2019 21:35
Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 13 files at r3, 11 of 11 files at r4.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @nvanbenschoten)

@ajwerner ajwerner force-pushed the ajwerner/admin-change-replicas-exp-desc branch 3 times, most recently from 30b4ca2 to 6facc5e Compare March 30, 2019 00:38
…ateRange

This commit adds a new ExpDesc RangeDescriptor field to the
AdminChangeReplicasRequest which is provided as the expected current
RangeDescriptor to the ChangeReplicas call. Providing this argument prevents
hazards caused by concurrent calls to ChangeReplicas. Given that the
AdminChangeReplicas RPC allows clients to specify multiple changes, the
provided expectation only directly applies to the first change. Subsequent
calls to ChangeReplicas use the value of the RangeDescriptor as returned from
the previous call to ChangeReplicas thus any concurrent modification of the
RangeDescriptor will lead to a failure. The change also adds adds a response
field to the AdminChangeReplicasResponse which is the updated value of the
RangeDescriptor upon successful completion of the request.

The client.DB interface is updated to adopt these new fields.

Lastly this change is adopted inside of AdminRelocateRange to prevent dangerous
replication scenarios as observed in cockroachdb#36025.

Release note: None
@ajwerner ajwerner force-pushed the ajwerner/admin-change-replicas-exp-desc branch from 6facc5e to 6356005 Compare March 30, 2019 00:57
@ajwerner
Copy link
Contributor Author

bors r+

craig bot pushed a commit that referenced this pull request Mar 30, 2019
36244: Add expected RangeDescriptor to AdminChangeReplicas and adopt r=ajwerner a=ajwerner

This PR comes in two commits. The first updates the roachpb api and the second adopts the update. This is the first of the PRs which will address #36025. See the commit messages for more details. 

Co-authored-by: Andrew Werner <[email protected]>
@craig
Copy link
Contributor

craig bot commented Mar 30, 2019

Build succeeded

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants