Exponential backoff not honored by fast leader query #3166

Closed
3 of 7 tasks
ajbarb opened this issue Dec 4, 2020 · 1 comment

ajbarb commented Dec 4, 2020

Read the FAQ first: https://github.com/edenhill/librdkafka/wiki/FAQ

Description

librdkafka does not enforce exponential backoff during fast leader queries: the backoff never exceeds 2 * topic.metadata.refresh.fast.interval.ms. The issue seems to be in rd_kafka_metadata_leader_query_tmr_cb, where the backoff value passed is rtmr_interval. rd_kafka_timer_backoff then calls rd_kafka_timer_schedule, which sets rtmr_next to rtmr_interval + rtmr_interval, so every retry waits the same fixed amount.
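
To make the arithmetic concrete, here is a small toy model of the behaviour described above. It is not librdkafka code: toy_timer, toy_timer_schedule and toy_delay_current are hypothetical names, and the model assumes only what this report states, namely that the next firing is scheduled at interval + backoff from now and that the stored interval is never changed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a repeating timer; NOT the librdkafka struct. */
typedef struct {
    int64_t interval_ms; /* base interval, never modified by the backoff */
    int64_t next_ms;     /* absolute time of the next firing */
} toy_timer;

/* Models the scheduling described in the report:
 * next = now + interval + extra. */
static void toy_timer_schedule(toy_timer *t, int64_t now_ms, int64_t extra_ms) {
    t->next_ms = now_ms + t->interval_ms + extra_ms;
}

/* Returns the delay before the last of `retries` consecutive retries when
 * the callback always passes the unchanged interval as the backoff value. */
static int64_t toy_delay_current(int64_t fast_interval_ms, int retries) {
    toy_timer t = { fast_interval_ms, 0 };
    int64_t now_ms = 0;
    int64_t delay_ms = 0;

    assert(fast_interval_ms > 0 && retries > 0);
    for (int i = 0; i < retries; i++) {
        toy_timer_schedule(&t, now_ms, t.interval_ms);
        delay_ms = t.next_ms - now_ms; /* always 2 * interval */
        now_ms = t.next_ms;            /* pretend the timer fired */
    }
    return delay_ms;
}
```

With a fast interval of 2000 ms this model gives 4000 ms for the first retry and for every retry after it, which matches the 4 second ceiling reported here.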

Shouldn't rtmr_interval be doubled every time we back off in rd_kafka_metadata_leader_query_tmr_cb? Something like this:

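                /* Double the stored interval so that each successive
                 * backoff waits longer than the previous one. */
                rtmr->rtmr_interval = rtmr->rtmr_interval * 2;
                rd_kafka_timer_backoff(rkts, rtmr,
                    (int)rtmr->rtmr_interval);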
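
For comparison, a rough sketch of the delays the proposed change would produce, again as a toy model rather than librdkafka code. toy_delay_doubling is a hypothetical name, and cap_ms is an assumption: the proposal itself has no upper bound, so the sketch clamps the delay to a caller-supplied maximum (for example topic.metadata.refresh.interval.ms) to keep it from growing forever. As in the proposal, the interval is doubled first and the doubled value is then also passed as the backoff, so the delay is twice the new interval.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the delay before the last of `retries` consecutive retries when
 * the interval is doubled before every backoff. cap_ms is a hypothetical
 * upper bound that the original proposal does not include. */
static int64_t toy_delay_doubling(int64_t fast_interval_ms, int retries,
                                  int64_t cap_ms) {
    int64_t interval_ms = fast_interval_ms;
    int64_t delay_ms = 0;

    assert(fast_interval_ms > 0 && retries > 0);
    assert(cap_ms > 0 && cap_ms <= INT64_MAX / 4); /* keeps the math in range */

    for (int i = 0; i < retries; i++) {
        if (interval_ms < cap_ms)  /* stop doubling once the cap is reached */
            interval_ms *= 2;
        /* Scheduling adds interval + backoff, and backoff == interval. */
        delay_ms = interval_ms + interval_ms;
        if (delay_ms > cap_ms)
            delay_ms = cap_ms;
    }
    return delay_ms;
}
```

With a fast interval of 2000 ms and a cap of 32000 ms, the model yields 8000, 16000 and 32000 ms for the first three retries and stays at 32000 ms afterwards. Whether the interval should also be reset once a leader is found is not covered by this sketch.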

How to reproduce

  • Make rd_kafka_fetch_reply_handle fail continuously with error RD_KAFKA_RESP_ERR_NOT_LEADER_FOR_PARTITION
  • Run the rdkafka_example with:
    conf->set("topic.metadata.refresh.fast.interval.ms", "2000", errstr);
    conf->set("topic.metadata.refresh.interval.ms", "32000", errstr);
  • Notice that the maximum delay between fast fetch requests is 4 seconds (2 * 2000 ms), however many times the query is retried.

IMPORTANT: Always try to reproduce the issue on the latest released version (see https://github.com/edenhill/librdkafka/releases), if it can't be reproduced on the latest version the issue has been fixed.

Checklist

IMPORTANT: We will close issues where the checklist has not been completed.

Please provide the following information:

  • librdkafka version (release number or git tag): 1.5.0
  • Apache Kafka version: <REPLACE with e.g., 0.10.2.3>
  • librdkafka client configuration: topic.metadata.refresh.fast.interval.ms=2000, topic.metadata.refresh.interval.ms=32000
  • Operating system: Windows 10
  • Provide logs (with debug=.. as necessary) from librdkafka
  • Provide broker log excerpts
  • Critical issue

ajbarb commented Dec 7, 2020

@edenhill, is my understanding correct?
