-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Data race in rd_kafka_broker_timeout_scan
#4625
Open
2 of 7 tasks
Labels
Comments
rd_kafka_broker_timeout_scan
/rd_avg_calc
rd_kafka_broker_timeout_scan
azat
added a commit
to azat-archive/librdkafka
that referenced
this issue
Jul 9, 2024
TSan report (founded by ClickHouse CI): Exception: Sanitizer assert found for instance �================== WARNING: ThreadSanitizer: data race (pid=1) Read of size 8 at 0x7b7800127158 by thread T987 (mutexes: read M0, write M1, write M2): #0 __tsan_memcpy <null> (clickhouse+0x74eebb7) (BuildId: 7122171f6a93acda7ea89a6d10cce3ad580a715d) confluentinc#1 rd_avg_rollover build_docker/./contrib/librdkafka/src/rdavg.h:153:22 (clickhouse+0x1e39753b) (BuildId: 7122171f6a93acda7ea89a6d10cce3ad580a715d) confluentinc#2 rd_kafka_stats_emit_avg build_docker/./contrib/librdkafka/src/rdkafka.c:1354:9 (clickhouse+0x1e39753b) confluentinc#3 rd_kafka_stats_emit_all build_docker/./contrib/librdkafka/src/rdkafka.c:1717:17 (clickhouse+0x1e395c8b) (BuildId: 7122171f6a93acda7ea89a6d10cce3ad580a715d) confluentinc#4 rd_kafka_stats_emit_tmr_cb build_docker/./contrib/librdkafka/src/rdkafka.c:1898:2 (clickhouse+0x1e395c8b) confluentinc#5 rd_kafka_timers_run build_docker/./contrib/librdkafka/src/rdkafka_timer.c:288:4 (clickhouse+0x1e46498a) (BuildId: 7122171f6a93acda7ea89a6d10cce3ad580a715d) confluentinc#6 rd_kafka_thread_main build_docker/./contrib/librdkafka/src/rdkafka.c:2021:3 (clickhouse+0x1e3919e9) (BuildId: 7122171f6a93acda7ea89a6d10cce3ad580a715d) confluentinc#7 _thrd_wrapper_function build_docker/./contrib/librdkafka/src/tinycthread.c:576:9 (clickhouse+0x1e47a57b) (BuildId: 7122171f6a93acda7ea89a6d10cce3ad580a715d) Previous write of size 8 at 0x7b7800127158 by thread T986: #0 rd_avg_calc build_docker/./contrib/librdkafka/src/rdavg.h:104:38 (clickhouse+0x1e37d71d) (BuildId: 7122171f6a93acda7ea89a6d10cce3ad580a715d) confluentinc#1 rd_kafka_broker_timeout_scan build_docker/./contrib/librdkafka/src/rdkafka_broker.c:880:25 (clickhouse+0x1e37d71d) confluentinc#2 rd_kafka_broker_ops_io_serve build_docker/./contrib/librdkafka/src/rdkafka_broker.c:3416:17 (clickhouse+0x1e37d71d) confluentinc#3 rd_kafka_broker_consumer_serve build_docker/./contrib/librdkafka/src/rdkafka_broker.c:4975:17 (clickhouse+0x1e378e5e) (BuildId: 7122171f6a93acda7ea89a6d10cce3ad580a715d) confluentinc#4 rd_kafka_broker_serve build_docker/./contrib/librdkafka/src/rdkafka_broker.c:5080:17 (clickhouse+0x1e378e5e) confluentinc#5 rd_kafka_broker_thread_main build_docker/./contrib/librdkafka/src/rdkafka_broker.c:5237:25 (clickhouse+0x1e372619) (BuildId: 7122171f6a93acda7ea89a6d10cce3ad580a715d) confluentinc#6 _thrd_wrapper_function build_docker/./contrib/librdkafka/src/tinycthread.c:576:9 (clickhouse+0x1e47a57b) (BuildId: 7122171f6a93acda7ea89a6d10cce3ad580a715d) Refs: ClickHouse/ClickHouse#60443 Refs: ClickHouse/ClickHouse#60939 Refs: confluentinc#4625
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Read the FAQ first: https://github.com/confluentinc/librdkafka/wiki/FAQ
Do NOT create issues for questions, use the discussion forum: https://github.com/confluentinc/librdkafka/discussions
Description
ClickHouse/ClickHouse#60443
How to reproduce
Run ClickHouse integration tests with TSan until it reproduces.
Checklist
Please provide the following information:
confluentinc/cp-kafka:5.2.0
<REPLACE with e.g., message.timeout.ms=123, auto.reset.offset=earliest, ..>
<REPLACE with e.g., Centos 5 (x64)>
debug=..
as necessary) from librdkafkaI think the issue is in
rd_kafka_broker_timeout_scan
accessesrkb->rkb_avg_rtt
without holding a lock forrkb
. Inrd_kafka_stats_emit_all
a read lock is taken when the stats are modified. However I am not sure how to properly fix it.The text was updated successfully, but these errors were encountered: