Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: fix LookupJoinDef interning, add tests #29117

Merged
merged 1 commit into from
Aug 27, 2018

Conversation

RaduBerinde
Copy link
Member

Fixing an omission I noticed in internLookupJoinDef and adding missing
tests for interning defs.

Release note: None

Fixing an omission I noticed in internLookupJoinDef and adding missing
tests for interning defs.

Release note: None
@RaduBerinde RaduBerinde requested a review from a team as a code owner August 27, 2018 14:45
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@andy-kimball andy-kimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)

@RaduBerinde
Copy link
Member Author

bors r+

craig bot pushed a commit that referenced this pull request Aug 27, 2018
29067: storage: protect ComputeChecksum commands from replaying r=tschottdorf a=benesch

Previously, a ComputeChecksum command could apply twice with the same
ID. Consider the following sequence of events:

  1. DistSender sends a ComputeChecksum request to a replica.
  2. The request is succesfully evaluated and proposed, but a connection
     error occurs.
  3. DistSender retries the request, leaving the checksum ID unchanged!

This would result in two ComputeChecksum commands with the same checksum
ID in the Raft log. Somewhat amazingly, this typically wasn't
problematic. If all replicas were online and reasonably up-to-date,
they'd see the first ComputeChecksum command, compute its checksum, and
store it in the checksums map. When they saw the duplicated
ComputeChecksum command, they'd see that a checksum with that ID already
existed and ignore it. In effect, only the first ComputeChecksum command
for a given checksum ID mattered.

The problem occured when one replica saw one ComputeChecksum command but
not the other. There were two ways this could occur. A replica could go
offline after computing the checksum the first time; when it came back
online, it would have an empty checksum map, and the checksum computed
for the second ComputeChecksum command would be recorded instead. Or a
replica could receive a snapshot that advanced it past one
ComputeChecksum but not the other. In both cases, the replicas could
spuriously fail a consistency check.

A very similar problem occured with range merges because ComputeChecksum
requests are incorrectly ranged (see #29002). That means DistSender
might split a ComputeChecksum request in two. Consider what happens when
a consistency check occurs immediately after a merge: the
ComputeChecksum request is generated using the up-to-date, post-merge
descriptor, but DistSender might have the pre-merge descriptors cached,
and so it splits the batch in two. Both halves of the batch would get
routed to the same range, and both halves would have the same command
ID, resulting in the same duplicated ComputeChecksum command problem.

The fix for these problems is to assign the checksum ID when the
ComputeChecksum request is evaluated. If the request is retried, it will
be properly assigned a new checksum ID. Note that we don't need to worry
about reproposals causing duplicate commands, as the MaxLeaseIndex
prevents proposals from replay.

The version compatibility story here is straightforward. The
ReplicaChecksumVersion is bumped, so v2.0 nodes will turn
ComputeChecksum requests proposed by v2.1 nodes into a no-op, and
vice-versa. The consistency queue will spam some complaints into the log
about this--it will time out while collecting checksums--but this will
stop as soon as all nodes have been upgraded to the new version.†

Note that this commit takes the opportunity to migrate
storagebase.ReplicatedEvalResult.ComputeChecksum from
roachpb.ComputeChecksumRequest to a dedicated
storagebase.ComputeChecksum message. Separate types are more in line
with how the merge/split/change replicas triggers work and avoid
shipping unnecessary fields through Raft. Note that even though this
migration changes logic downstream of Raft, it's safe. v2.1 nodes will
turn any ComputeChecksum commands that were commited by v2.0 nodes into
no-ops, and vice-versa, but the only effect of this will be some
temporary consistency queue spam. As an added bonus, because  we're
guaranteed that we'll never see duplicate v2.1-style ComputeChecksum
commands, we can properly fatal if we ever see a ComputeChecksum request
with a checksum ID that we've already computed.

† It would be possible to put the late-ID allocation behind a cluster
version to avoid the log spam, but that amounts to allowing v2.1 to
initiate known-buggy consistency checks. A bit of log spam seems
preferable.

Fix #28995.

29083: storage: fix raft snapshots that span merges and splits r=tschottdorf a=benesch

The code that handles Raft snapshots that span merges did not account
for snapshots that spanned merges AND splits. Handle this case by
allowing snapshot subsumption even when the snapshot's end key does not
exactly match the end of an existing replica. See the commits within the
patch for details.

Fix #29080.

Release note: None

29117: opt: fix LookupJoinDef interning, add tests r=RaduBerinde a=RaduBerinde

Fixing an omission I noticed in internLookupJoinDef and adding missing
tests for interning defs.

Release note: None

Co-authored-by: Nikhil Benesch <[email protected]>
Co-authored-by: Radu Berinde <[email protected]>
@craig
Copy link
Contributor

craig bot commented Aug 27, 2018

Build succeeded

@craig craig bot merged commit 2545589 into cockroachdb:master Aug 27, 2018
@RaduBerinde RaduBerinde deleted the fix-lookup-join-interning branch August 27, 2018 16:10
craig bot pushed a commit that referenced this pull request Aug 27, 2018
29077: backport-2.1: storage,kv: remove some allocations r=jordanlewis a=jordanlewis

Backport 1/1 commits from #29075.

/cc @cockroachdb/release

---

Guard an expensive log.V in a log.ExpensiveLogEnabled.

Pass roachpb.Lease by value in one situation in which allocating it is
pointless.

cc @benesch

This seems to improve read-only kv performance by almost 5%!

Release note: None


29103: release-2.1: base: make TestValidateAddrs portable r=knz a=knz

Backport 1/1 commits from #29101.

/cc @cockroachdb/release

---

On (at least) osx the port name resolution error is different. This patch
modifies the test to catch that.

Release note: None


29121: release-2.1: opt: fix LookupJoinDef interning, add tests r=RaduBerinde a=RaduBerinde

Backport 1/1 commits from #29117.

/cc @cockroachdb/release

---

Fixing an omission I noticed in internLookupJoinDef and adding missing
tests for interning defs.

Release note: None


Co-authored-by: Jordan Lewis <[email protected]>
Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: Radu Berinde <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants