changefeedccl: ExportRequest poller should poll ranges instead of all spans #28660
Labels
A-cdc
Change Data Capture
C-performance
Perf of queries or internals. Solution not expected to change functional behavior.
Currently, the poller picks a new high-water timestamp, then polls all tracked spans between the old high-water and the new one. In the `cdc/w=1000/nodes=3/init=false` roachtest, we usually see each poll finish in tens of seconds, but occasionally see one or two that take 4 minutes or more. The next poll then has even more data to export than usual, which seems to increase the likelihood of hitting a slow request again. This is currently the biggest stability issue of the tpcc-1000 test.

Now that span-frontier is plumbed and we have span-level resolved timestamps, we can use them to rework the poller. For example, every range gets a request put in a queue. A bounded set of workers each pull a request from the queue, wait for it to come back, put the data and resolved timestamp in the buffer, and then re-enqueue the same span with the next set of timestamps. There could be some coordination between all the requests on what the next high-water is, but it's easier, and end-to-end latencies will be lower, if there isn't (incidentally, the latter also might help find some bugs that will be triggered when we switch to RangeFeed).
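A rough sketch of the queue-and-workers idea, to make the proposal concrete. Everything here is hypothetical and not taken from the changefeedccl code: `span`, `pollRequest`, `resolved`, `exportFn`, and `pollRanges` are illustrative names, timestamps are plain `int64`s, and each request advances by a fixed `step` for a fixed number of `rounds` so that the example terminates. A real poller would presumably pick the next timestamp from the clock and run until cancelled.

```go
package main

import (
	"fmt"
	"sync"
)

// span is a hypothetical stand-in for the key span of one range.
type span struct{ start, end string }

// pollRequest asks for the changes to sp in the interval (from, to].
type pollRequest struct {
	sp       span
	from, to int64
}

// resolved records that sp has been fully exported up to ts.
type resolved struct {
	sp span
	ts int64
}

// exportFn is a hypothetical stand-in for sending one ExportRequest.
type exportFn func(req pollRequest) error

// pollRanges enqueues one request per span and lets a bounded set of workers
// process them. Each span advances independently of the others: there is no
// shared high-water, so a slow span does not hold back the rest.
func pollRanges(spans []span, start, step int64, rounds, workers int, export exportFn) []resolved {
	if rounds < 1 || workers < 1 {
		return nil
	}
	end := start + step*int64(rounds)

	// Each span has at most one request outstanding, so a queue of this size
	// guarantees that re-enqueueing never blocks a worker.
	queue := make(chan pollRequest, len(spans))
	out := make(chan resolved, len(spans)*rounds)

	var pending sync.WaitGroup
	pending.Add(len(spans))
	for _, sp := range spans {
		queue <- pollRequest{sp: sp, from: start, to: start + step}
	}

	for i := 0; i < workers; i++ {
		go func() {
			for req := range queue {
				if err := export(req); err != nil {
					queue <- req // retry the same interval later
					continue
				}
				// Emit the span-level resolved timestamp (the exported data
				// itself is omitted from this sketch).
				out <- resolved{sp: req.sp, ts: req.to}
				if req.to >= end {
					pending.Done() // this span has finished all its rounds
					continue
				}
				// Re-enqueue the same span with the next pair of timestamps.
				queue <- pollRequest{sp: req.sp, from: req.to, to: req.to + step}
			}
		}()
	}

	pending.Wait() // no requests are outstanding once every span is done
	close(queue)
	close(out)

	var results []resolved
	for r := range out {
		results = append(results, r)
	}
	return results
}

func main() {
	spans := []span{{"a", "f"}, {"f", "m"}, {"m", "z"}}
	noop := func(pollRequest) error { return nil }
	for _, r := range pollRanges(spans, 100, 10, 2, 2, noop) {
		fmt.Printf("span [%s, %s) resolved at %d\n", r.sp.start, r.sp.end, r.ts)
	}
}
```

The order of entries across different spans depends on scheduling, but the resolved timestamps for any single span come out in increasing order, because a span is only re-enqueued after its previous result has been emitted.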