changefeedccl: Improve distsql planning confidence #113968

Closed
miretskiy opened this issue Nov 7, 2023 · 4 comments
Labels
A-cdc Change Data Capture C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-cdc

@miretskiy
Contributor

miretskiy commented Nov 7, 2023

We already use the jobs profiler in changefeed jobs. We need to make sure that the
distsql plan persisted into the jobs table is useful for changefeeds, and that we persist
it whenever we replan.

Looks like the summary we produce includes all spans assigned to aggregators -- that's good and bad. Good because we can see exactly which spans are assigned where; bad because the "url" for a large-scale changefeed is a multi-megabyte blob of data written to the job info table. See flow_diagram.go.
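
One way to keep the per-aggregator detail without writing every span is to cap how many spans are printed and report the rest as a count. The sketch below only illustrates that idea; it is not what flow_diagram.go does today. `Span`, `summarizeSpans`, and the `limit` parameter are hypothetical names made up for this example, and `Span` is a plain stand-in rather than the real span type.

```go
package main

import (
	"fmt"
	"strings"
)

// Span is a hypothetical stand-in for a key span (not the real CockroachDB type).
type Span struct{ Start, End string }

// String renders the span as a half-open interval.
func (s Span) String() string { return "[" + s.Start + ", " + s.End + ")" }

// summarizeSpans prints at most limit spans and reports how many were omitted,
// so the summary size stays bounded no matter how many spans are assigned.
func summarizeSpans(spans []Span, limit int) string {
	if limit < 0 {
		limit = 0
	}
	shown := spans
	if len(shown) > limit {
		shown = shown[:limit]
	}
	parts := make([]string, 0, len(shown)+1)
	for _, s := range shown {
		parts = append(parts, s.String())
	}
	if omitted := len(spans) - len(shown); omitted > 0 {
		parts = append(parts, fmt.Sprintf("... %d more", omitted))
	}
	return fmt.Sprintf("%d spans: %s", len(spans), strings.Join(parts, " "))
}

func main() {
	spans := []Span{{"a", "b"}, {"b", "c"}, {"c", "d"}}
	fmt.Println(summarizeSpans(spans, 2))
}
```

The trade-off is that the exact assignment of every span is no longer visible in the summary, so the full list would have to be available somewhere else when it is needed.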

In addition, we lack any observability into distsql planning -- it appears that it might
be possible to produce really skewed assignments, as was observed in a customer escalation
(50k ranges on a single node) as well as in test failures (#113966).
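
A rough sketch of the kind of skew check that would make this observable: compare the busiest node against the mean across the nodes in the plan. Everything here is an assumption for illustration; `NodeID` and `skewRatio` are hypothetical names, and the per-node range counts would have to come from the real plan.

```go
package main

import "fmt"

// NodeID is a hypothetical stand-in for a node identifier.
type NodeID int

// skewRatio returns the largest per-node range count divided by the mean
// across all nodes in the map. It returns 0 when there is nothing assigned.
func skewRatio(rangesPerNode map[NodeID]int) float64 {
	total, max := 0, 0
	for _, n := range rangesPerNode {
		total += n
		if n > max {
			max = n
		}
	}
	if total == 0 {
		return 0
	}
	mean := float64(total) / float64(len(rangesPerNode))
	return float64(max) / mean
}

func main() {
	balanced := map[NodeID]int{1: 100, 2: 100, 3: 100}
	skewed := map[NodeID]int{1: 50000, 2: 0, 3: 0} // everything on one node
	fmt.Printf("balanced: %.2f\n", skewRatio(balanced))
	fmt.Printf("skewed:   %.2f\n", skewRatio(skewed))
}
```

A ratio near 1 means an even split, and a ratio equal to the number of nodes means everything landed on one node. The map has to include nodes that received zero ranges, otherwise a single-node plan looks perfectly balanced.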

See https://github.com/cockroachdb/cockroach/blob/release-23.1/pkg/sql/distsql_physical_planner.go#L1613
The code there seems to make it possible to force planning onto a local node. We need better unit tests, and
we must make sure that the single-node assignments described above do not happen.

Jira issue: CRDB-33276

@miretskiy miretskiy added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-cdc Change Data Capture T-cdc labels Nov 7, 2023

blathers-crl bot commented Nov 7, 2023

cc @cockroachdb/cdc

@miretskiy miretskiy changed the title changefeedccl: Improve job and planning observability changefeedccl: Improve distsql planning confidence Nov 8, 2023
@miretskiy miretskiy added this to the 24.1 milestone Nov 8, 2023
@miretskiy
Contributor Author

Closing in favor of #114079 and #114528

@jayshrivastava
Contributor

Closing based on the above comment.
