Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: deflake TestStoreRangeMergeReadoptedBothFollowers #29084

Merged
merged 1 commit into from
Aug 27, 2018

Conversation

benesch
Copy link
Contributor

@benesch benesch commented Aug 27, 2018

TestStoreRangeMergeReadoptedBothFollowers expects that attempting to
place merged range [a, c) onto a store that has orphaned pre-merge
replicas [a, b) and [b, c) fails with the following message:

cannot apply snapshot: snapshot intersects existing range

In rare cases, the snapshot application could instead fail with:

remote failed to apply snapshot for reason failed to apply snapshot: snapshot intersects existing range

The first message is generated by the canApplySnapshot check that occurs
when the reservation is made. The second message is generated by the
canApplySnapshot check that occurs after the snapshot is received.

The test, not unfairly, expects to only see the first message; it has
orphaned two intersecting replicas onto the receiving store, and those
intersecting replicas should fail the snapshot in the first check! It
turns out that, in rare cases, the snapshot could arrive at the
receiving store before the replicas were initialized. The snapshot would
slip through the first check, but receiving the snapshot would take some
time, and so the replicas would be initialized by the second check.

Teach the test to properly wait for the replicas on the receiving store
to be fully initialized before sending any snapshots.

Fix #29078.

Release note: None

TestStoreRangeMergeReadoptedBothFollowers expects that attempting to
place merged range [a, c) onto a store that has orphaned pre-merge
replicas [a, b) and [b, c) fails with the following message:

    cannot apply snapshot: snapshot intersects existing range

In rare cases, the snapshot application could instead fail with:

    remote failed to apply snapshot for reason failed to apply snapshot: snapshot intersects existing range

The first message is generated by the canApplySnapshot check that occurs
when the reservation is made. The second message is generated by the
canApplySnapshot check that occurs after the snapshot is received.

The test, not unfairly, expects to only see the first message; it has
orphaned two intersecting replicas onto the receiving store, and those
intersecting replicas should fail the snapshot in the first check! It
turns out that, in rare cases, the snapshot could arrive at the
receiving store before the replicas were initialized. The snapshot would
slip through the first check, but receiving the snapshot would take some
time, and so the replicas would be initialized by the second check.

Teach the test to properly wait for the replicas on the receiving store
to be fully initialized before sending any snapshots.

Fix cockroachdb#29078.

Release note: None
@benesch benesch requested review from tbg and a team August 27, 2018 04:41
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@benesch
Copy link
Contributor Author

benesch commented Aug 27, 2018

bors r=tschottdorf

craig bot pushed a commit that referenced this pull request Aug 27, 2018
29084: storage: deflake TestStoreRangeMergeReadoptedBothFollowers r=tschottdorf a=benesch

TestStoreRangeMergeReadoptedBothFollowers expects that attempting to
place merged range [a, c) onto a store that has orphaned pre-merge
replicas [a, b) and [b, c) fails with the following message:

    cannot apply snapshot: snapshot intersects existing range

In rare cases, the snapshot application could instead fail with:

    remote failed to apply snapshot for reason failed to apply snapshot: snapshot intersects existing range

The first message is generated by the canApplySnapshot check that occurs
when the reservation is made. The second message is generated by the
canApplySnapshot check that occurs after the snapshot is received.

The test, not unfairly, expects to only see the first message; it has
orphaned two intersecting replicas onto the receiving store, and those
intersecting replicas should fail the snapshot in the first check! It
turns out that, in rare cases, the snapshot could arrive at the
receiving store before the replicas were initialized. The snapshot would
slip through the first check, but receiving the snapshot would take some
time, and so the replicas would be initialized by the second check.

Teach the test to properly wait for the replicas on the receiving store
to be fully initialized before sending any snapshots.

Fix #29078.

Release note: None

29101: base: make TestValidateAddrs portable r=knz a=knz

On (at least) osx the port name resolution error is different. This patch
modifies the test to catch that.

Release note: None

Co-authored-by: Nikhil Benesch <[email protected]>
Co-authored-by: Raphael 'kena' Poss <[email protected]>
@craig
Copy link
Contributor

craig bot commented Aug 27, 2018

Build succeeded

@craig craig bot merged commit a77edbf into cockroachdb:master Aug 27, 2018
@benesch benesch deleted the 29078 branch August 27, 2018 16:54
craig bot pushed a commit that referenced this pull request Aug 27, 2018
29126: backport-2.1: another round of merge bug fixes r=tschottdorf a=benesch

Backport:
  * 1/1 commits from "storage: fix raft snapshots that span merges and splits" (#29083)
  * 1/1 commits from "storage: deflake TestStoreRangeMergeReadoptedBothFollowers" (#29084)
  * 1/1 commits from "storage: protect ComputeChecksum commands from replaying" (#29067)
  * 1/1 commits from "storage: make ComputeChecksum a point request" (#29079)

Please see individual PRs for details.

/cc @cockroachdb/release


Co-authored-by: Nikhil Benesch <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants