puller may get stuck #1067

Closed
lidezhu opened this issue Mar 6, 2025 · 0 comments · Fixed by #1066
Labels: severity/moderate, type/bug (The issue is confirmed as a bug.)

lidezhu commented Mar 6, 2025

What did you do?

On the TiKV side, to distinguish connections from different versions of cdc, TiKV maintains a feature list for every connection.
TiKV uses the first request on each connection to set that connection's feature list.
https://github.com/tikv/tikv/blob/a34740fefaf69092d14f6af5160e8e5ff1c507f8/components/cdc/src/service.rs#L450

If the connection doesn't enable FeatureGate::BATCH_RESOLVED_TS, it won't get any resolved ts message.
https://github.com/tikv/tikv/blob/a34740fefaf69092d14f6af5160e8e5ff1c507f8/components/cdc/src/endpoint.rs#L443

FeatureGate::BATCH_RESOLVED_TS is enabled when the cdc version in the request header is larger than 4.0.8.

But on the cdc side, the deregister request's header doesn't carry cdc version information. So if the first request on a connection is a deregister request, the connection never receives any resolved ts messages.
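The behaviour described above can be sketched with a toy model. This is not TiKV code: the `Conn` class, `parse_version`, and the `BATCH_RESOLVED_TS_GATE` constant are hypothetical names for illustration, and the comparison follows this issue's wording ("larger than 4.0.8"); the exact check in TiKV may differ.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical stand-in for the gate version mentioned in this issue (4.0.8).
BATCH_RESOLVED_TS_GATE = (4, 0, 8)


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple that compares numerically."""
    return tuple(int(part) for part in text.split("."))


@dataclass
class Conn:
    """Toy model of the per-connection feature list kept on the TiKV side."""
    features_set: bool = False
    batch_resolved_ts: bool = False

    def on_request(self, cdc_version: Optional[str]) -> None:
        # Only the first request on the connection decides the feature list.
        if self.features_set:
            return
        self.features_set = True
        # A header without version info (None) leaves the feature disabled.
        if cdc_version is not None:
            self.batch_resolved_ts = (
                parse_version(cdc_version) > BATCH_RESOLVED_TS_GATE
            )

    def receives_resolved_ts(self) -> bool:
        return self.batch_resolved_ts


# Register first (header carries a version), then deregister (no version).
good = Conn()
good.on_request("8.5.0")
good.on_request(None)

# Deregister first: the feature list is fixed without BATCH_RESOLVED_TS,
# and the later register request cannot change it.
bad = Conn()
bad.on_request(None)
bad.on_request("8.5.0")
```

In this model `good` keeps receiving resolved ts while `bad` never does, which matches the stuck puller described here.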

This problem occurs when a dispatcher is registered and deregistered within a very short time.
Detailed steps:

  1. A request worker receives a region to send;
  2. Before sending, the request worker finds that the table the region belongs to has been stopped, so it discards the region;
  3. The request worker receives an unregister signal for the table and sends the deregister request to TiKV as the first request on the connection. TiKV sets the connection's feature list without FeatureGate::BATCH_RESOLVED_TS, so this connection never receives any resolved ts.
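The steps above can be replayed with a small simulation of the cdc side. Everything here is a hypothetical sketch, not TiCDC code: `Request`, `run_worker`, and `connection_gets_resolved_ts` are made-up names, and the model assumes that any header carrying a version is new enough to enable the feature. The `header_on_deregister` flag shows one conceivable mitigation (attaching the version to every request); whether #1066 takes this approach is not stated in this issue.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set, Tuple


@dataclass
class Request:
    kind: str                       # "register" or "deregister"
    table_id: int
    header_version: Optional[str]   # None models a header without version info


def run_worker(
    tasks: Iterable[Tuple[str, int]],
    stopped_tables: Set[int],
    version: str,
    header_on_deregister: bool = False,
) -> List[Request]:
    """Return the requests a request worker would put on a fresh connection."""
    sent: List[Request] = []
    for kind, table_id in tasks:
        if kind == "region":
            # Step 2: the table is already stopped, so the region is discarded.
            if table_id in stopped_tables:
                continue
            sent.append(Request("register", table_id, version))
        elif kind == "unregister":
            # Step 3: the deregister request goes out, by default without
            # version information in its header.
            header = version if header_on_deregister else None
            sent.append(Request("deregister", table_id, header))
    return sent


def connection_gets_resolved_ts(sent: List[Request]) -> bool:
    # Per this issue, the first request alone fixes the feature list.
    return bool(sent) and sent[0].header_version is not None


# Table 1 is registered and deregistered almost at once.
tasks = [("region", 1), ("unregister", 1)]
sent = run_worker(tasks, stopped_tables={1}, version="8.5.0")
sent_with_header = run_worker(
    tasks, stopped_tables={1}, version="8.5.0", header_on_deregister=True
)
```

Here `sent` contains only the deregister request, so the connection in this model never gets resolved ts, while `sent_with_header` would keep the feature enabled.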

What did you expect to see?

No response

What did you see instead?

Nan.

Versions of the cluster

Upstream TiDB cluster version (execute SELECT tidb_version(); in a MySQL client):

(paste TiDB cluster version here)

Upstream TiKV version (execute tikv-server --version):

(paste TiKV version here)

TiCDC version (execute cdc version):

(paste TiCDC version here)
@lidezhu lidezhu added the type/bug The issue is confirmed as a bug. label Mar 6, 2025
@lidezhu lidezhu self-assigned this Mar 6, 2025
@ti-chi-bot ti-chi-bot bot closed this as completed in #1066 Mar 6, 2025
@ti-chi-bot ti-chi-bot bot closed this as completed in cbbc4fe Mar 6, 2025