Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: Replica.String, Replica.Desc, and Replica logging refactor #7871

Merged
merged 3 commits into from
Jul 18, 2016

Conversation

rjnn
Copy link
Contributor

@rjnn rjnn commented Jul 17, 2016

Fixes #7852.


This change is Reviewable

@petermattis
Copy link
Collaborator

:lgtm: modulo comments.


Review status: 0 of 3 files reviewed at latest revision, 4 unresolved discussions, all commit checks successful.


storage/replica.go, line 207 [r1] (raw file):

      // The state of the Raft state machine.
      state storagebase.ReplicaState
      // rangeDesc is a rangeDescriptor that can be atomically read from. All

s/rangeDescriptor/*RangeDescriptor/g


storage/replica.go, line 208 [r1] (raw file):

      state storagebase.ReplicaState
      // rangeDesc is a rangeDescriptor that can be atomically read from. All
      // updates to state.desc should be duplicated here (as is done in

s/state.desc/state.Desc/g


storage/replica.go, line 633 [r1] (raw file):

// Desc returns the range's descriptor.
func (r *Replica) Desc() *roachpb.RangeDescriptor {
  return r.mu.rangeDesc.Load().(*roachpb.RangeDescriptor)

@bdarnell I'm not familiar enough with the subtleties of this code to know if this is kosher.

It would be a more conservative change to leave this method untouched and to change String() to use r.mu.rangeDesc.Load().(*roachpb.RangeDescriptor).


storage/replica.go, line 787 [r1] (raw file):

  var err error
  if ri.LastVerification, err = r.getLastVerificationTimestamp(); err != nil {
      log.Warningf("range %s: %v", r, err)

You should get rid of the range prefix everywhere. As it it currently is, the log messages will look like range node=1 store=2 range=3


Comments from Reviewable

@tamird
Copy link
Contributor

tamird commented Jul 18, 2016

Reviewed 3 of 3 files at r1.
Review status: all files reviewed at latest revision, 6 unresolved discussions, all commit checks successful.


storage/replica.go, line 211 [r1] (raw file):

      // updateRangeDescriptorLocked()) so that it can be read without acquiring
      // the lock.
      rangeDesc atomic.Value

why is it better to have this in two places? either we need atomic-with-other-things updates or we don't.


storage/replica.go, line 396 [r1] (raw file):

func (r *Replica) String() string {
  desc := r.Desc()
  return fmt.Sprintf("node=%d store=%d range=%d [%s-%s)", r.store.Ident.NodeID, r.store.Ident.StoreID, desc.RangeID, desc.StartKey, desc.EndKey)

why not %+v the ident?


Comments from Reviewable

@petermattis
Copy link
Collaborator

Review status: all files reviewed at latest revision, 6 unresolved discussions, all commit checks successful.


storage/replica.go, line 396 [r1] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

why not %+v the ident?

`roachpb.StoreIdent` includes the cluster ID which isn't useful to print in every log message. But instead of printing `node=%d store=%d`, we could use `Store.String()`:
  return fmt.Sprintf("%s range=%d [%s-%s)", r.store, d

Comments from Reviewable

@petermattis
Copy link
Collaborator

Review status: all files reviewed at latest revision, 6 unresolved discussions, all commit checks successful.


storage/replica.go, line 396 [r1] (raw file):

Previously, petermattis (Peter Mattis) wrote…

roachpb.StoreIdent includes the cluster ID which isn't useful to print in every log message. But instead of printing node=%d store=%d, we could use Store.String():

  return fmt.Sprintf("%s range=%d [%s-%s)", r.store, d
That should read:
  return fmt.Sprintf("%s range=%d [%s-%s)", r.store, desc.RangeID, desc.StartKey, desc.EndKey)

Comments from Reviewable

Context: replica.String used to call replica.Desc, which used to
acquire the replica.mu lock. Its use was thus restricted to places
where the lock was not already acquired (see cockroachdb#7852).

Add a new atomically readable rangeDescriptor to replica, under
mu.rangeDesc, which is a duplicate of the existing mu.state.Desc.
mu.rangeDesc is updated alongside mu.state.Desc under the replica.mu
lock, but it is read freely without locking.

Change replica.String to use this new atomically readable non-locking
rangeDescriptor, so they no longer acquire any locks. Conservatively
leave replica.Desc unchanged.
@rjnn rjnn force-pushed the replica_desc_refactor branch from f82e4ca to 571033c Compare July 18, 2016 17:13
@rjnn
Copy link
Contributor Author

rjnn commented Jul 18, 2016

Review status: 1 of 3 files reviewed at latest revision, 6 unresolved discussions, some commit checks pending.


storage/replica.go, line 207 [r1] (raw file):

Previously, petermattis (Peter Mattis) wrote…

s/rangeDescriptor/*RangeDescriptor/g

Done.

storage/replica.go, line 208 [r1] (raw file):

Previously, petermattis (Peter Mattis) wrote…

s/state.desc/state.Desc/g

Done.

storage/replica.go, line 211 [r1] (raw file):

Previously, tamird (Tamir Duberstein) wrote…

why is it better to have this in two places? either we need atomic-with-other-things updates or we don't.

See @bdarnell's comment on #7852 (or this commit message).

storage/replica.go, line 396 [r1] (raw file):

Previously, petermattis (Peter Mattis) wrote…

That should read:

  return fmt.Sprintf("%s range=%d [%s-%s)", r.store, desc.RangeID, desc.StartKey, desc.EndKey)
Done.

storage/replica.go, line 633 [r1] (raw file):

Previously, petermattis (Peter Mattis) wrote…

@bdarnell I'm not familiar enough with the subtleties of this code to know if this is kosher.

It would be a more conservative change to leave this method untouched and to change String() to use r.mu.rangeDesc.Load().(*roachpb.RangeDescriptor).

That's fair. I'm pretty sure it's kosher, but given our current stability issues, let's conservatively leave Desc() alone. Done.

storage/replica.go, line 787 [r1] (raw file):

Previously, petermattis (Peter Mattis) wrote…

You should get rid of the range prefix everywhere. As it it currently is, the log messages will look like range node=1 store=2 range=3

Done.

Comments from Reviewable

@petermattis
Copy link
Collaborator

Still :lgtm:, though I think there are some log calls in replica_raftstore.go that could be adjusted.


Review status: 1 of 3 files reviewed at latest revision, 2 unresolved discussions, some commit checks pending.


Comments from Reviewable

@rjnn rjnn force-pushed the replica_desc_refactor branch from 571033c to 485ca49 Compare July 18, 2016 17:34
@rjnn
Copy link
Contributor Author

rjnn commented Jul 18, 2016

Added changes in replica_raftstore.go. PTAL.


Review status: 1 of 4 files reviewed at latest revision, 2 unresolved discussions, some commit checks pending.


Comments from Reviewable

@petermattis
Copy link
Collaborator

Review status: 1 of 4 files reviewed at latest revision, 3 unresolved discussions, some commit checks pending.


storage/replica_raftstorage.go, line 435 [r3] (raw file):

  }

  log.Infof("generated snapshot at index %d in %s. encoded size=%d, %d KV pairs, %d log entries",

You need to include r in the prefix of this message.


Comments from Reviewable

Arjun Narayan added 2 commits July 18, 2016 14:04
replica.String used to acquire the replica.mu lock (via a call to
replica.Desc). It was therefore not used in places where the lock was
already acquired. replica.String no longer acquires this lock, so it
now safe to call liberally.

Refactor all logs in replica.go, replica_command.go, and
replica_raftstorage.go to prepend information from replica.String
(unless they already print context information).

Closes cockroachdb#7852.
@rjnn rjnn force-pushed the replica_desc_refactor branch from 485ca49 to 9af718e Compare July 18, 2016 18:04
@rjnn
Copy link
Contributor Author

rjnn commented Jul 18, 2016

storage/replica_raftstorage.go, line 435 [r3] (raw file):

Previously, petermattis (Peter Mattis) wrote…

You need to include r in the prefix of this message.

Huh, I accidentally clobbered that line. I've reverted it to the original version (since we do not in fact have a replica in this function, only a `roachpb.RangeID`)

Comments from Reviewable

@rjnn rjnn merged commit 509c454 into cockroachdb:master Jul 18, 2016
@bdarnell
Copy link
Contributor

:lgtm:


Review status: 1 of 4 files reviewed at latest revision, 3 unresolved discussions, all commit checks successful.


storage/replica.go, line 211 [r1] (raw file):

Previously, arjunravinarayan (Arjun Ravi Narayan) wrote…

See @bdarnell's comment on #7852 (or this commit message).

Ah, I forgot about the `ReplicaState` refactoring. I was thinking we'd change `r.mu.desc` instead of creating two copies of it. But since `ReplicaState` is a protobuf we can't do that.

If we're storing this in two places anyway then the atomic.Value copy should be outside the mu struct.


storage/replica.go, line 633 [r1] (raw file):

Previously, arjunravinarayan (Arjun Ravi Narayan) wrote…

That's fair. I'm pretty sure it's kosher, but given our current stability issues, let's conservatively leave Desc() alone. Done.

I think it would be safe to change `Desc()` and would prefer to do so. Then `Desc()` is the only thing that accesses the descriptor through `atomic.Value.Load`.

I think it's safe because anything that calls Desc() must already be tolerant of concurrent changes: they release the lock before actually doing anything with the descriptor. In cases where consistency of the descriptor matters we don't call Desc(); we get the lock and access r.mu.state.Descdirectly under that lock.


Comments from Reviewable

rjnn pushed a commit to rjnn/cockroach that referenced this pull request Jul 18, 2016
In cockroachdb#7871, replica.String() was refactored to use a new atomically
readable copy of the RangeDescriptor. replica.Desc() could use this as
well (so as to not lock replica.mu), but this change was
conservatively deferred. However, with the issue of replica.Desc()
acquiring locks being raised in cockroachdb#7888, and after discussion with
@bdarnell, it should be safe to refactor replica.Desc() as well.

Finally replica.mu.rangeDesc is moved outside mu to replica.rangeDesc
to maintain the semantics of replica.mu
@tamird
Copy link
Contributor

tamird commented Jul 19, 2016

Reviewed 2 of 2 files at r2, 1 of 1 files at r4.
Review status: all files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.


Comments from Reviewable

rjnn pushed a commit to rjnn/cockroach that referenced this pull request Jul 19, 2016
In cockroachdb#7871, replica.String() was refactored to use a new atomically
readable copy of the RangeDescriptor. replica.Desc() could use this as
well (so as to not lock replica.mu), but this change was
conservatively deferred. However, with the issue of replica.Desc()
acquiring locks being raised in cockroachdb#7888, and after discussion with
@bdarnell, it should be safe to refactor replica.Desc() as well.

Finally replica.mu.rangeDesc is moved outside mu to replica.rangeDesc
to maintain the semantics of replica.mu
rjnn pushed a commit to rjnn/cockroach that referenced this pull request Jul 19, 2016
In cockroachdb#7871, replica.String() was refactored to use a new atomically
readable copy of the RangeDescriptor. replica.Desc() could use this as
well (so as to not lock replica.mu), but this change was
conservatively deferred. However, with the issue of replica.Desc()
acquiring locks being raised in cockroachdb#7888, and after discussion with
@bdarnell, it should be safe to refactor replica.Desc() as well.

Finally replica.mu.rangeDesc is moved outside mu to replica.rangeDesc
to maintain the semantics of replica.mu
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants