Retrying sending requests to a stuck TiKV may cost too much time #50432

Closed
MyonKeminta opened this issue Jan 15, 2024 · 0 comments · Fixed by #50506
Labels
affects-5.4 This bug affects the 5.4.x(LTS) versions. affects-6.1 This bug affects the 6.1.x(LTS) versions. affects-6.5 This bug affects the 6.5.x(LTS) versions. affects-7.1 This bug affects the 7.1.x(LTS) versions. affects-7.5 This bug affects the 7.5.x(LTS) versions. affects-7.6 severity/major sig/transaction SIG:Transaction type/bug The issue is confirmed as a bug.

Comments

@MyonKeminta
Contributor

Bug Report

In client-go, when sending a request to TiKV fails with an RPC error, the request can be retried several times. The number of attempts is mostly controlled by a hard-coded constant:

const maxReplicaAttempt = 10

This looks reasonable at first. However, an RPC error is sometimes returned only after the request has been blocked for a long time and then interrupted by a timeout. If we retry 10 times, the total can reach 10 times the timeout (ReadTimeoutShort 30s or ReadTimeoutMedium 60s). We currently suspect that this behavior makes TiDB's service recovery time unnecessarily long when one of the TiKV nodes encounters a problem.

@MyonKeminta MyonKeminta added type/bug The issue is confirmed as a bug. severity/moderate affects-6.5 This bug affects the 6.5.x(LTS) versions. affects-7.1 This bug affects the 7.1.x(LTS) versions. affects-7.5 This bug affects the 7.5.x(LTS) versions. affects-7.6 severity/major and removed severity/moderate labels Jan 15, 2024
@ti-chi-bot ti-chi-bot bot added may-affects-5.4 This bug maybe affects 5.4.x versions. may-affects-6.1 labels Jan 16, 2024
@MyonKeminta MyonKeminta changed the title Too much retries when sending requests to a stuck TiKV Retrying sending requests to a stuck TiKV may cost too much time Jan 16, 2024
@MyonKeminta MyonKeminta added affects-6.1 This bug affects the 6.1.x(LTS) versions. affects-5.4 This bug affects the 5.4.x(LTS) versions. and removed may-affects-6.1 may-affects-5.4 This bug maybe affects 5.4.x versions. labels Jan 16, 2024
ti-chi-bot bot pushed a commit that referenced this issue Jan 23, 2024
…g time on RPC timeout and unnecessary backoff on NotLeader errors (#50506)

close #50432
@jebter jebter added the sig/transaction SIG:Transaction label Jan 31, 2024