table scheduler is changefeed level, which could lead to imbalance in some scenarios #3654

Closed
Tracked by #3844
amyangfei opened this issue Nov 29, 2021 · 1 comment
Labels
area/ticdc Issues or PRs related to TiCDC. component/scheduler TiCDC inner scheduler component. subject/new-feature Denotes an issue or pull request adding a new feature.

amyangfei commented Nov 29, 2021

Is your feature request related to a problem?

  1. Set up a TiDB cluster with 3 TiCDC nodes. Create 10 tables in the upstream.
  2. Create 10 changefeeds, each replicating one table.

Tested with cdc master at pingcap/ticdc@39cfb3e.

Describe the feature you'd like

The table replication streams should be dispatched evenly across the 3 TiCDC nodes.
Instead, we observe that the tables are not evenly distributed among the 3 TiCDC nodes. This happens because the table scheduler works at the changefeed level, ref: https://github.com/pingcap/ticdc/blob/39cfb3eda4fee3a340a8b8dfd618195967455359/cdc/owner/changefeed.go#L76-L79
and the scheduling algorithm picks the minimum workload while iterating over a map, whose access order is random: https://github.com/pingcap/ticdc/blob/39cfb3eda4fee3a340a8b8dfd618195967455359/cdc/owner/scheduler.go#L206

workloads is a map whose iteration order is nondeterministic, but that randomness is not enough to spread the tables evenly.
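
The effect can be illustrated with a minimal sketch. This is not the TiCDC implementation: it assumes each changefeed's scheduler sees only that changefeed's own workloads (all zero for a new changefeed) and keeps the first minimum it meets while ranging over a Go map. The names pickMinWorkload and simulate, and the node names, are hypothetical. Go leaves map iteration order unspecified and makes no promise that every element is equally likely to come first, so ties may be broken with a bias.

```go
package main

import "fmt"

// pickMinWorkload returns the node with the smallest workload.
// Ties go to whichever node the map iteration happens to yield first.
// Workloads are assumed to be non-negative.
func pickMinWorkload(workloads map[string]int) string {
	target := ""
	minLoad := -1 // sentinel: no node seen yet
	for node, load := range workloads {
		if minLoad < 0 || load < minLoad {
			target, minLoad = node, load
		}
	}
	return target
}

// simulate creates n changefeeds with one table each and returns the
// number of tables that end up on every node.
func simulate(nodes []string, n int) map[string]int {
	tables := make(map[string]int, len(nodes))
	for i := 0; i < n; i++ {
		// Changefeed-level view (assumption): a fresh workload map that
		// knows nothing about tables owned by other changefeeds.
		workloads := make(map[string]int, len(nodes))
		for _, node := range nodes {
			workloads[node] = 0
		}
		tables[pickMinWorkload(workloads)]++
	}
	return tables
}

func main() {
	nodes := []string{"node-a", "node-b", "node-c"}
	// The split varies between runs and is not guaranteed to be even.
	fmt.Println(simulate(nodes, 10))
}
```

Because every changefeed starts from an all-zero view, the minimum-workload rule never gets a chance to correct an uneven split; only the tie-break decides where each table lands.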

The workload ratio among the three TiCDC nodes is 7:2:1.

/tidb/cdc/task/workload/95f9f6e2-7edd-41cf-ad12-c5e91e20c624/test-cf-1
{"57":{"workload":1}}
/tidb/cdc/task/workload/ce10785f-0c81-4882-b0ee-d4b702178549/test-cf-2
{"57":{"workload":1}}
/tidb/cdc/task/workload/ce10785f-0c81-4882-b0ee-d4b702178549/test-cf-3
{"57":{"workload":1}}
/tidb/cdc/task/workload/dbc97296-ecb8-457c-9d62-9f89220635ad/test-cf-10
{"57":{"workload":1}}
/tidb/cdc/task/workload/dbc97296-ecb8-457c-9d62-9f89220635ad/test-cf-4
{"57":{"workload":1}}
/tidb/cdc/task/workload/dbc97296-ecb8-457c-9d62-9f89220635ad/test-cf-5
{"57":{"workload":1}}
/tidb/cdc/task/workload/dbc97296-ecb8-457c-9d62-9f89220635ad/test-cf-6
{"57":{"workload":1}}
/tidb/cdc/task/workload/dbc97296-ecb8-457c-9d62-9f89220635ad/test-cf-7
{"57":{"workload":1}}
/tidb/cdc/task/workload/dbc97296-ecb8-457c-9d62-9f89220635ad/test-cf-8
{"57":{"workload":1}}
/tidb/cdc/task/workload/dbc97296-ecb8-457c-9d62-9f89220635ad/test-cf-9
{"57":{"workload":1}}
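
The 7:2:1 ratio can be checked by grouping the keys above by their fourth path segment. The sketch below assumes the key layout is /tidb/cdc/task/workload/<node-id>/<changefeed-id>, which is inferred from the dump rather than taken from TiCDC documentation; countPerNode and issueKeys are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

const workloadPrefix = "/tidb/cdc/task/workload/"

// issueKeys are the workload keys listed in the issue.
var issueKeys = []string{
	"/tidb/cdc/task/workload/95f9f6e2-7edd-41cf-ad12-c5e91e20c624/test-cf-1",
	"/tidb/cdc/task/workload/ce10785f-0c81-4882-b0ee-d4b702178549/test-cf-2",
	"/tidb/cdc/task/workload/ce10785f-0c81-4882-b0ee-d4b702178549/test-cf-3",
	"/tidb/cdc/task/workload/dbc97296-ecb8-457c-9d62-9f89220635ad/test-cf-10",
	"/tidb/cdc/task/workload/dbc97296-ecb8-457c-9d62-9f89220635ad/test-cf-4",
	"/tidb/cdc/task/workload/dbc97296-ecb8-457c-9d62-9f89220635ad/test-cf-5",
	"/tidb/cdc/task/workload/dbc97296-ecb8-457c-9d62-9f89220635ad/test-cf-6",
	"/tidb/cdc/task/workload/dbc97296-ecb8-457c-9d62-9f89220635ad/test-cf-7",
	"/tidb/cdc/task/workload/dbc97296-ecb8-457c-9d62-9f89220635ad/test-cf-8",
	"/tidb/cdc/task/workload/dbc97296-ecb8-457c-9d62-9f89220635ad/test-cf-9",
}

// countPerNode counts workload keys per node ID, assuming the layout
// <prefix><node-id>/<changefeed-id>. Keys that do not match are skipped.
func countPerNode(keys []string) map[string]int {
	counts := make(map[string]int)
	for _, key := range keys {
		if !strings.HasPrefix(key, workloadPrefix) {
			continue
		}
		parts := strings.SplitN(strings.TrimPrefix(key, workloadPrefix), "/", 2)
		if len(parts) != 2 {
			continue
		}
		counts[parts[0]]++
	}
	return counts
}

func main() {
	for node, n := range countPerNode(issueKeys) {
		fmt.Printf("%s: %d changefeed(s)\n", node, n)
	}
}
```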

Describe alternatives you've considered

In the short term, we can add more random factors to the changefeed scheduler, such as
In the long term, we can add a better scheduling algorithm.

Teachability, Documentation, Adoption, Migration Strategy

No response

@amyangfei amyangfei added subject/new-feature Denotes an issue or pull request adding a new feature. component/scheduler TiCDC inner scheduler component. labels Nov 29, 2021
@amyangfei amyangfei changed the title table scheduler is changefeed level, which could lead to unbalance in some scenarios table scheduler is changefeed level, which could lead to imbalance in some scenarios Nov 29, 2021
@maxshuang maxshuang added the area/ticdc Issues or PRs related to TiCDC. label Nov 29, 2021
ti-chi-bot pushed a commit that referenced this issue Jan 6, 2022
amyangfei commented

The randomness optimization has been included since v5.4.0.
