changefeedccl: flush less #32060

danhhz · 2018-10-31T19:16:47Z

For correctness, changeAggregator needs to flush Kafka before
forwarding a span-level resolved timestamp to changeFrontier. Before
this change, it was flushing once before every span-level resolved
timestamp. On TPCC-1000, there were about 2000 of these per second,
which meant we were spending more time flushing Kafka than writing to
it. (This is especially bad because it blocks Next calls.)

This change batches span-level resolved timestamps for some amount of
time, then flushes once and emits them all. This delays the emission of
the changefeed-level resolved timestamp to the user by ~20%, but we now
spend almost no time flushing.

Along with #32023, optimistically:
Closes #31001

Release note (bug fix): CHANGEFEEDs now spend dramatically less time
flushing Kafka writes

danhhz · 2018-10-31T22:05:37Z

TFTR

bors r=mrtracy

craig · 2018-10-31T22:05:39Z

👎 Rejected by PR status

danhhz · 2018-10-31T22:10:30Z

bors r=mrtracy

32060: changefeedccl: flush less r=mrtracy a=danhhz

For correctness, changeAggregator needs to flush Kafka before forwarding a span-level resolved timestamp to changeFrontier. Before this change, it was flushing once before every span-level resolved timestamp. On TPCC-1000, there were about 2000 of these per second, which meant we were spending more time flushing Kafka than writing to it. (This is especially bad because it blocks Next calls.)

This change batches span-level resolved timestamps for some amount of time, then flushes once and emits them all. This delays the emission of the changefeed-level resolved timestamp to the user by ~20%, but we now spend almost no time flushing.

Along with #32023, optimistically:
Closes #31001

Release note (bug fix): CHANGEFEEDs now spend dramatically less time flushing Kafka writes

Co-authored-by: Daniel Harrison <[email protected]>

craig · 2018-10-31T22:27:02Z

Build succeeded

GitHub CI (Cockroach)

danhhz requested review from mrtracy and a team October 31, 2018 19:16

mrtracy approved these changes Oct 31, 2018

craig bot merged commit 04649a1 into cockroachdb:master Oct 31, 2018

danhhz mentioned this pull request Nov 1, 2018

changefeedccl: ExportRequest poller should poll ranges instead of all spans #28660

Closed

danhhz mentioned this pull request Nov 12, 2018

release-2.1: various changefeedccl fixes #32235

Merged

danhhz deleted the cdc_batch_flushes branch November 20, 2018 18:41
