Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(tests): Adjust buffering tests #2862

Merged
merged 4 commits into from
Jun 24, 2020
Merged

chore(tests): Adjust buffering tests #2862

merged 4 commits into from
Jun 24, 2020

Conversation

ktff
Copy link
Contributor

@ktff ktff commented Jun 18, 2020

This PR does three things:

  • Resolves races exclusive to buffering tests.
  • Moves around thread::sleep in buffering tests, reducing waiting in total.
  • Splits crate::test_util::CountReceiver::wait method into wait and cancel. All of the usages of this method were having a race as they were expecting that the method would wait for the tcp channel to close, which wasn't the case. Instead the method was canceling the channel which caused a race.

ktff added 2 commits June 18, 2020 17:32
Signed-off-by: Kruno Tomola Fabro <[email protected]>
Signed-off-by: Kruno Tomola Fabro <[email protected]>
@ktff ktff added the domain: tests Anything related to Vector's internal tests label Jun 18, 2020
@ktff ktff requested a review from MOZGIII June 18, 2020 15:43
@ktff ktff self-assigned this Jun 18, 2020
// Give the topology some time to process the received data and simulate
// a crash.
// We need to wait for at least the source to process events.
wait_for(|| source_event_counter.load(Ordering::Acquire) == num_events);
Copy link
Contributor

@MOZGIII MOZGIII Jun 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this means that the changes from e0761a9 don't fix the issue :(
Can we restore the comment that was there before the said commit? I think it's more on-point, and gives hints on where the root cause of the issue is. Unless it's not relevant anymore.

Copy link
Contributor Author

@ktff ktff Jun 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It fixed races relevant to the PR, but there is another race in MockSourceConfig that we can't avoid without changing the channel impl to some that doesn't use buffering even then there would be a race in the source, but even then just source processing events isn't enough, they still need to reach the buffers of source and yet still they need to reach file to which we are buffering those events. Only some intrusive synchronizing could fix this.

Because of the first race, the assert isn't appropriate.

Copy link
Contributor

@MOZGIII MOZGIII Jun 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah! The code before e0761a9 was:

    // A race caused by `rt.block_on(send).unwrap()` is handled here. For some
    // reason, at times less events than were sent actually arrive to the
    // `source`.
    // We mitigate that by waiting on the event counter provided by our source
    // mock.
    test_util::wait_for_atomic_usize(source_event_counter, |x| x == num_events);

It's pretty much what you said, so I suggest just restoring it as it was.
The assertion can be the same as you have in this PR, I'm mostly worried about the comment.

This isn't critical though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, to improve the overall situation, we have to reimplement the MockSourceConfig, right?

Copy link
Contributor Author

@ktff ktff Jun 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'll document the race for which we need to wait. And use that test_util::wait_for_atomic_usize.

So, to improve the overall situation, we have to reimplement the MockSourceConfig, right?

I don't think its possible to get better than waiting for source_event_counter.

ktff added 2 commits June 24, 2020 16:13
Signed-off-by: Kruno Tomola Fabro <[email protected]>
Signed-off-by: Kruno Tomola Fabro <[email protected]>
@ktff
Copy link
Contributor Author

ktff commented Jun 24, 2020

New error started occurring,

panicked at 'attempt to add with overflow', src\buffers\disk\leveldb_buffer.rs:86:12

I think it's a bug that existed before this PR so I'll be merging this in spite of it.

@ktff ktff merged commit be4ecfb into master Jun 24, 2020
@ktff ktff deleted the ktff/fix_buffering_tests branch June 24, 2020 15:38
mengesb pushed a commit to jacobbraaten/vector that referenced this pull request Dec 9, 2020
* Fix buffering tests

Signed-off-by: Kruno Tomola Fabro <[email protected]>

* Split wait

Signed-off-by: Kruno Tomola Fabro <[email protected]>

* Document the race

Signed-off-by: Kruno Tomola Fabro <[email protected]>

* Remove unneded

Signed-off-by: Kruno Tomola Fabro <[email protected]>
Signed-off-by: Brian Menges <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain: tests Anything related to Vector's internal tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants