Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

changefeedccl: flush less #32060

Merged
merged 1 commit into from
Oct 31, 2018
Merged

Conversation

danhhz
Copy link
Contributor

@danhhz danhhz commented Oct 31, 2018

For correctness, changeAggreagator needs to flush kafka before
forwarding a span-level resolved timestamp to changeFrontier. Before
this change, it was flushing once before every span-level resolved
timestamp. On TPCC-1000, there were about 2000 of these per second,
which meant we were spending more time flushing kafka than writing to
it. (This is especially bad because it blocks Next calls.)

This change batches span-level resolved timestamps for some amount of
time, before flushing once and emitting them all. This adds ~20% delay
to the changefeed-level timestamp from being emitted to the user
slightly, but we now spend almost no time flushing.

Along with #32023, optimistically:
Closes #31001

Release note (bug fix): CHANGEFEEDs now spend dramatically less time
flushing kafka writes

For correctness, changeAggreagator needs to flush kafka before
forwarding a span-level resolved timestamp to changeFrontier. Before
this change, it was flushing once before every span-level resolved
timestamp. On TPCC-1000, there were about 2000 of these per second,
which meant we were spending more time flushing kafka than writing to
it. (This is especially bad because it blocks Next calls.)

This change batches span-level resolved timestamps for some amount of
time, before flushing once and emitting them all. This adds ~20% delay
to the changefeed-level timestamp from being emitted to the user
slightly, but we now spend almost no time flushing.

Along with cockroachdb#32023, optimistically:
Closes cockroachdb#31001

Release note (bug fix): CHANGEFEEDs now spend dramatically less time
flushing kafka writes
@danhhz danhhz requested review from mrtracy and a team October 31, 2018 19:16
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@danhhz
Copy link
Contributor Author

danhhz commented Oct 31, 2018

TFTR

bors r=mrtracy

@craig
Copy link
Contributor

craig bot commented Oct 31, 2018

👎 Rejected by PR status

@danhhz
Copy link
Contributor Author

danhhz commented Oct 31, 2018

bors r=mrtracy

craig bot pushed a commit that referenced this pull request Oct 31, 2018
32060: changefeedccl: flush less r=mrtracy a=danhhz

For correctness, changeAggreagator needs to flush kafka before
forwarding a span-level resolved timestamp to changeFrontier. Before
this change, it was flushing once before every span-level resolved
timestamp. On TPCC-1000, there were about 2000 of these per second,
which meant we were spending more time flushing kafka than writing to
it. (This is especially bad because it blocks Next calls.)

This change batches span-level resolved timestamps for some amount of
time, before flushing once and emitting them all. This adds ~20% delay
to the changefeed-level timestamp from being emitted to the user
slightly, but we now spend almost no time flushing.

Along with #32023, optimistically:
Closes #31001

Release note (bug fix): CHANGEFEEDs now spend dramatically less time
flushing kafka writes

Co-authored-by: Daniel Harrison <[email protected]>
@craig
Copy link
Contributor

craig bot commented Oct 31, 2018

Build succeeded

@craig craig bot merged commit 04649a1 into cockroachdb:master Oct 31, 2018
@danhhz danhhz deleted the cdc_batch_flushes branch November 20, 2018 18:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

roachtest: cdc/w=1000/nodes=3/init=false failed
3 participants