Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

changefeedccl: ExportRequest poller should poll ranges instead of all spans #28660

Closed
danhhz opened this issue Aug 15, 2018 · 2 comments
Closed
Assignees
Labels
A-cdc Change Data Capture C-performance Perf of queries or internals. Solution not expected to change functional behavior.

Comments

@danhhz
Copy link
Contributor

danhhz commented Aug 15, 2018

Currently, it picks a new high-water timestamp, then polls all tracked spans between the old high-water and the new one. In the cdc/w=1000/nodes=3/init=false roachtest, we usually see each poll finish in tens of seconds, but occasionally see one or two that take 4 minutes or more. Then the next poll has even more data to export than usual, which seems to increase the likelihood of hitting a slow request again. This is currently the biggest stability issue of the tpcc-1000 test.

Now that span-frontier is plumbed and we have span-level resolved timestamps, we can use it to rework the poller. For example, every range gets a request put in a queue. A bounded set of workers pull a request from the queue, wait for it to come back, put the data and resolved timestamp in the buffer, and then re-enqueue the same span with the next set of timestamps. There could be some coordination for what the next highwater is between all the requests, but it's easier and end-to-end latencies will be lower if there isn't (incidentally, the latter also might help find some bugs that will be triggered when we switch to RangeFeed).

@danhhz danhhz added C-performance Perf of queries or internals. Solution not expected to change functional behavior. A-cdc Change Data Capture labels Aug 15, 2018
@danhhz danhhz added this to the 2.1 milestone Aug 15, 2018
@danhhz danhhz self-assigned this Aug 15, 2018
@petermattis petermattis removed this from the 2.1 milestone Oct 5, 2018
@danhhz
Copy link
Contributor Author

danhhz commented Nov 1, 2018

I thought we'd need this to rein in the long tail, but it's possible that #32060 was actually the issue I was seeing in the roachtests. It's a pretty big change to how poller works, so I'm going to optimistically move this out of 2.1.x, but with the caveat that additional testing may pull it back in.

@danhhz
Copy link
Contributor Author

danhhz commented Jan 28, 2019

This was a speculative idea and we still haven't seen any evidence from our testing that's it's needed. Plus RangeFeed obsoletes it. Closing

@danhhz danhhz closed this as completed Jan 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-cdc Change Data Capture C-performance Perf of queries or internals. Solution not expected to change functional behavior.
Projects
None yet
Development

No branches or pull requests

2 participants