changefeedccl: Improve distsql planning confidence #113968
We already use the jobs profiler in changefeed jobs. We need to make sure that the
distsql plan persisted to the jobs table is useful for changefeeds, and that we persist
it whenever we replan.
It looks like the summary we produce includes all spans assigned to aggregators, which is both good and bad. It is good because we can see exactly which spans are assigned where. It is bad because the "url" for a large-scale changefeed is a multi-megabyte blob of data written to the job info table. See flow_diagram.go.
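One possible direction for shrinking the persisted blob is to store a per-node summary instead of every span. The sketch below is only an illustration: `Span` and `summarizeAssignments` are hypothetical names, not types or functions from flow_diagram.go, and it assumes each node's spans are already sorted by start key.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Span is a hypothetical stand-in for a key span; it is NOT roachpb.Span.
type Span struct {
	Start, End string
}

// summarizeAssignments renders one line per node with the span count and the
// overall key bounds, instead of listing every span. It assumes the spans for
// each node are sorted by start key.
func summarizeAssignments(assignments map[int][]Span) string {
	nodes := make([]int, 0, len(assignments))
	for n := range assignments {
		nodes = append(nodes, n)
	}
	sort.Ints(nodes) // deterministic output order

	var sb strings.Builder
	for _, n := range nodes {
		spans := assignments[n]
		if len(spans) == 0 {
			fmt.Fprintf(&sb, "n%d: 0 spans\n", n)
			continue
		}
		fmt.Fprintf(&sb, "n%d: %d spans [%s .. %s)\n",
			n, len(spans), spans[0].Start, spans[len(spans)-1].End)
	}
	return sb.String()
}

func main() {
	fmt.Print(summarizeAssignments(map[int][]Span{
		1: {{"a", "b"}, {"b", "c"}},
		2: {{"c", "d"}},
	}))
}
```

The trade-off is that the exact span list is no longer visible, so a summary like this would probably need to sit alongside, not replace, some way to fetch the full assignment on demand.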
In addition, we lack any observability into distsql planning. It appears
possible to produce badly skewed assignments, as was observed in a customer escalation
(50k ranges on a single node) as well as in test failures (#113966).
See https://github.com/cockroachdb/cockroach/blob/release-23.1/pkg/sql/distsql_physical_planner.go#L1613
There is code there that seems to make it possible to force planning on the local node. We need better unit tests, and
we must make sure that the single-node assignments described above do not happen.
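As a rough sketch of the kind of check a unit test could make, the code below flags a plan in which one node holds more than a chosen fraction of all ranges. The names `maxShare` and `checkBalanced`, the `map[int]int` input, and the 0.5 threshold are all assumptions for illustration; none of them come from the distsql planner.

```go
package main

import "fmt"

// maxShare returns the node holding the most ranges and the fraction of all
// ranges it holds. Ties are broken in favor of the lowest node ID.
func maxShare(rangesPerNode map[int]int) (node int, share float64) {
	total, most := 0, -1
	for n, c := range rangesPerNode {
		total += c
		if c > most || (c == most && n < node) {
			node, most = n, c
		}
	}
	if total == 0 {
		return 0, 0
	}
	return node, float64(most) / float64(total)
}

// checkBalanced returns an error if any single node holds more than
// maxFraction of all ranges (hypothetical threshold chosen by the caller).
func checkBalanced(rangesPerNode map[int]int, maxFraction float64) error {
	node, share := maxShare(rangesPerNode)
	if share > maxFraction {
		return fmt.Errorf("skewed plan: n%d holds %.0f%% of ranges", node, share*100)
	}
	return nil
}

func main() {
	// Mirrors the escalation: every range lands on one node of three.
	fmt.Println(checkBalanced(map[int]int{1: 50000, 2: 0, 3: 0}, 0.5))
	// A roughly even assignment passes the check.
	fmt.Println(checkBalanced(map[int]int{1: 100, 2: 110, 3: 90}, 0.5))
}
```

A fixed fraction is a blunt instrument: a single-node cluster would always trip it, so a real test would likely scale the threshold with the number of eligible nodes.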
Jira issue: CRDB-33276