Race condition in TestPartitionReader_ConsumeAtStartup #8691

Closed
pracucci opened this issue Jul 12, 2024 · 7 comments
pracucci (Collaborator) commented Jul 12, 2024

I've just seen a race condition reported by TestPartitionReader_ConsumeAtStartup in CI:

==================
WARNING: DATA RACE
Write at 0x00c0007f8260 by goroutine 11360:
  github.com/twmb/franz-go/pkg/kgo.(*cursor).setOffset()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:208 +0x589
  github.com/twmb/franz-go/pkg/kgo.(*cursor).unset()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:183 +0x546
  github.com/twmb/franz-go/pkg/kgo.(*consumer).assignPartitions()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/consumer.go:1016 +0x53b
  github.com/twmb/franz-go/pkg/kgo.(*Client).close()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/client.go:1009 +0x264
  github.com/twmb/franz-go/pkg/kgo.(*Client).Close()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/client.go:993 +0xe4
  github.com/grafana/mimir/pkg/storage/ingest.(*PartitionReader).stopDependencies()
      /__w/mimir/mimir/pkg/storage/ingest/reader.go:167 +0x9f
  github.com/grafana/mimir/pkg/storage/ingest.(*PartitionReader).stop()
      /__w/mimir/mimir/pkg/storage/ingest/reader.go:156 +0x1e7
  github.com/grafana/mimir/pkg/storage/ingest.(*PartitionReader).stop-fm()
      <autogenerated>:1 +0x47
  github.com/grafana/dskit/services.(*BasicService).main()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:207 +0x558
  github.com/grafana/dskit/services.(*BasicService).StartAsync.func1.gowrap1()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:119 +0x33

Previous read at 0x00c0007f8260 by goroutine 12832:
  github.com/twmb/franz-go/pkg/kgo.(*source).handleReqResp.func2()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:1090 +0x124
  github.com/twmb/franz-go/pkg/kgo.(*source).handleReqResp()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:1121 +0x14b3
  github.com/twmb/franz-go/pkg/kgo.(*source).fetch.func5()
            logging.go:33: level info partition 1 component kafka_client msg assigning partitions why new assignments from direct consumer how assigning everything new, keeping current assignment input test[1{2 e-1 ce0}]
            logging.go:33: level info partition 1 component kafka_client msg received OFFSET_OUT_OF_RANGE on the first fetch, resetting to the configured ConsumeResetOffset broker 0 topic test partition 1 prior_offset 2
            logging.go:33: level info partition 1 component kafka_client msg immediate metadata update triggered why fetch had inner topic errors from broker 0: OFFSET_OUT_OF_RANGE{test[1]}
            logging.go:33: level info partition 1 target_lag 1s max_lag 1s msg partition reader found no records to consume because partition is empty partition_start_offset 2 last_produced_offset 1
            logging.go:33: level info partition 1 target_lag 1s max_lag 1s msg partition reader consumed partition and current lag is lower than configured target consumer lag last_consumed_offset 1 current_lag 0s
            logging.go:33: level info partition 1 msg stopping partition reader
            testing.go:1398: race detected during execution of test
FAIL
FAIL	github.com/grafana/mimir/pkg/storage/ingest	52.182s
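
For context, the write side of this trace is Client.Close() — triggered by PartitionReader.stop() — reassigning and unsetting partition cursors while a fetch issued by the same client is still being handled on another goroutine. Below is a minimal, hypothetical sketch of that shape (not the Mimir test itself; the broker address, topic and partition are placeholders), using only franz-go public API:

package main

import (
	"context"
	"sync"
	"time"

	"github.com/twmb/franz-go/pkg/kgo"
)

func main() {
	// Placeholder broker/topic/partition; the point is only the shape of the
	// interaction: Close() running concurrently with an in-flight fetch.
	client, err := kgo.NewClient(
		kgo.SeedBrokers("127.0.0.1:9092"),
		kgo.ConsumePartitions(map[string]map[int32]kgo.Offset{
			"test": {1: kgo.NewOffset().AtStart()},
		}),
	)
	if err != nil {
		panic(err)
	}

	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		// Fetch once, analogous to what the reader's fetch goroutines do.
		ctx, cancel := context.WithTimeout(context.Background(), time.Second)
		defer cancel()
		client.PollFetches(ctx)
	}()

	// Close the client while the fetch above may still be in flight, mirroring
	// PartitionReader.stop() -> stopDependencies() -> Client.Close() in the trace.
	client.Close()
	wg.Wait()
}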
zenador (Contributor) commented Aug 23, 2024

Another instance

narqo (Contributor) commented Oct 2, 2024

Although it's not a data race, I'll put it here since it comes from the same test. In this build we hit a flake in the should_not_wait_indefinitely_if_context_is_cancelled_while_fetching_last_produced_offset subtest:

FAIL: TestPartitionReader_ConsumeAtStartup
    --- FAIL: TestPartitionReader_ConsumeAtStartup/should_not_wait_indefinitely_if_context_is_cancelled_while_fetching_last_produced_offset (5.10s)
        logger.go:38: 2024-10-02 09:04:02.06682031 +0000 UTC m=+33.348546147 level info partition 1 msg starting consumption from partition start because no offset has been found start_offset -2 consumer_group test-group
        logger.go:38: 2024-10-02 09:04:02.068525838 +0000 UTC m=+33.350251676 level info partition 1 component kafka_client msg immediate metadata update triggered why querying metadata for consumer initialization
        logger.go:38: 2024-10-02 09:04:02.069234555 +0000 UTC m=+33.350960393 level info partition 1 component kafka_client msg assigning partitions why new assignments from direct consumer how assigning everything new, keeping current assignment input test[1{-2 e-1 ce0}]
        logger.go:38: 2024-10-02 09:04:02.069778519 +0000 UTC m=+33.351504347 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:39977 broker 0 err EOF
        logger.go:38: 2024-10-02 09:04:02.276416062 +0000 UTC m=+33.558141910 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:39977 broker 0 err EOF
        logger.go:38: 2024-10-02 09:04:02.332598099 +0000 UTC m=+33.614323927 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:39977 broker 0 err EOF
        logger.go:38: 2024-10-02 09:04:02.870250129 +0000 UTC m=+34.151975957 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:39977 broker 0 err EOF
        logger.go:38: 2024-10-02 09:04:02.915102153 +0000 UTC m=+34.196827991 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:39977 broker 0 err EOF
        logger.go:38: 2024-10-02 09:04:03.071172946 +0000 UTC m=+34.352898804 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:39977 broker 0 err EOF
        logger.go:38: 2024-10-02 09:04:03.76303884 +0000 UTC m=+35.044764678 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:39977 broker 0 err EOF
        logger.go:38: 2024-10-02 09:04:03.894623102 +0000 UTC m=+35.176348940 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:39977 broker 0 err EOF
        logger.go:38: 2024-10-02 09:04:04.072419974 +0000 UTC m=+35.354145822 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:39977 broker 0 err EOF
        logger.go:38: 2024-10-02 09:04:05.074261759 +0000 UTC m=+36.355987587 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:39977 broker 0 err EOF
        logger.go:38: 2024-10-02 09:04:05.805081433 +0000 UTC m=+37.086807271 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:39977 broker 0 err EOF
        logger.go:38: 2024-10-02 09:04:05.907466707 +0000 UTC m=+37.189192545 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:39977 broker 0 err EOF
        logger.go:38: 2024-10-02 09:04:06.07564007 +0000 UTC m=+37.357365909 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:39977 broker 0 err EOF
        logger.go:38: 2024-10-02 09:04:07.077075325 +0000 UTC m=+38.358801163 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:39977 broker 0 err EOF
        reader_test.go:1318: expected Failed, got Starting
    logger.go:38: 2024-10-02 09:04:08.078455501 +0000 UTC m=+39.360181339 level warn partition 1 component kafka_client msg unable to open connection to broker addr 127.0.0.1:39977 broker 0 err dial tcp 127.0.0.1:39977: connect: connection refused
    logger.go:38: 2024-10-02 09:04:08.305758644 +0000 UTC m=+39.587484482 level warn partition 1 component kafka_client msg unable to open connection to broker addr 127.0.0.1:39977 broker 0 err dial tcp 127.0.0.1:39977: connect: connection refused
    logger.go:38: 2024-10-02 09:04:08.305915884 +0000 UTC m=+39.587641712 level warn partition 1 msg failed to fetch the last produced offset err unable to dial: dial tcp 127.0.0.1:39977: connect: connection refused
    logger.go:38: 2024-10-02 09:04:08.408844219 +0000 UTC m=+39.690570067 level warn partition 1 component kafka_client msg unable to open connection to broker addr 127.0.0.1:39977 broker 0 err dial tcp 127.0.0.1:39977: connect: connection refused
    logger.go:38: 2024-10-02 09:04:08.409005267 +0000 UTC m=+39.690731125 level warn partition 1 msg failed to fetch the last produced offset err unable to dial: dial tcp 127.0.0.1:39977: connect: connection refused
FAIL
FAIL	github.com/grafana/mimir/pkg/storage/ingest	83.281s
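
For what it's worth, the failing assertion ("reader_test.go:1318: expected Failed, got Starting") caught the service before it had transitioned out of Starting. A hedged sketch — not the actual reader_test.go code — of a polling assertion that would tolerate a slow Starting -> Failed transition, assuming the reader is exposed as a dskit services.Service:

package ingest_test

import (
	"testing"
	"time"

	"github.com/grafana/dskit/services"
	"github.com/stretchr/testify/require"
)

// waitForFailed polls the service state instead of asserting it once, so a slow
// Starting -> Failed transition doesn't flake the test. Illustrative only.
func waitForFailed(t *testing.T, svc services.Service) {
	require.Eventually(t, func() bool {
		return svc.State() == services.Failed
	}, 5*time.Second, 10*time.Millisecond, "expected the partition reader to reach Failed")
}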

codesome (Member) commented Oct 5, 2024

Another occurrence: https://github.com/grafana/mimir/actions/runs/11188014503/job/31106091993?pr=9515

FAIL: TestPartitionReader_ConsumeAtStartup
==================
WARNING: DATA RACE
Write at 0x00c00091f160 by goroutine 44209:
  github.com/twmb/franz-go/pkg/kgo.(*cursor).setOffset()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:284 +0x7a8
  github.com/twmb/franz-go/pkg/kgo.(*cursor).unset()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:259 +0x765
  github.com/twmb/franz-go/pkg/kgo.(*consumer).assignPartitions()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/consumer.go:1025 +0x757
  github.com/twmb/franz-go/pkg/kgo.(*Client).RemoveConsumePartitions()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/consumer.go:874 +0x544
  github.com/grafana/mimir/pkg/storage/ingest.(*PartitionReader).switchToOngoingFetcher()
      /__w/mimir/mimir/pkg/storage/ingest/reader.go:280 +0x3a4
  github.com/grafana/mimir/pkg/storage/ingest.(*PartitionReader).run()
      /__w/mimir/mimir/pkg/storage/ingest/reader.go:236 +0x3e
  github.com/grafana/mimir/pkg/storage/ingest.(*PartitionReader).run-fm()
      <autogenerated>:1 +0x47
  github.com/grafana/dskit/services.(*BasicService).main()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:193 +0x3b7
  github.com/grafana/dskit/services.(*BasicService).StartAsync.func1.gowrap1()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:122 +0x33

Previous read at 0x00c00091f160 by goroutine 44587:
  github.com/twmb/franz-go/pkg/kgo.(*source).handleReqResp.func2()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:1229 +0x124
  github.com/twmb/franz-go/pkg/kgo.(*source).handleReqResp()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:1260 +0x1501
  github.com/twmb/franz-go/pkg/kgo.(*source).fetch.func5()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:934 +0x12c

Goroutine 44209 (running) created at:
  github.com/grafana/dskit/services.(*BasicService).StartAsync.func1()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:122 +0x1dc
  github.com/grafana/dskit/services.(*BasicService).switchState()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:142 +0x115
  github.com/grafana/dskit/services.(*BasicService).StartAsync()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:119 +0xb1
  github.com/grafana/mimir/pkg/storage/ingest.(*PartitionReader).StartAsync()
      <autogenerated>:1 +0x63
  github.com/grafana/dskit/services.StartAndAwaitRunning()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/services/services.go:104 +0x49
  github.com/grafana/mimir/pkg/storage/ingest.TestPartitionReader_ConsumeAtStartup.func13.1()
      /__w/mimir/mimir/pkg/storage/ingest/reader_test.go:1425 +0x1018
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1690 +0x226
  testing.(*T).Run.gowrap1()
      /usr/local/go/src/testing/testing.go:1743 +0x44

Goroutine 44587 (running) created at:
  github.com/twmb/franz-go/pkg/kgo.(*source).fetch()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:932 +0xc04
  github.com/twmb/franz-go/pkg/kgo.(*source).loopFetch()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:787 +0x4c6
  github.com/twmb/franz-go/pkg/kgo.(*source).maybeConsume.gowrap1()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:717 +0x33
==================
==================
WARNING: DATA RACE
Write at 0x00c00091f168 by goroutine 44625:
  github.com/twmb/franz-go/pkg/kgo.(*cursor).setOffset()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:284 +0x133d
  github.com/twmb/franz-go/pkg/kgo.(*consumer).assignPartitions()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/consumer.go:1164 +0x12f0
  github.com/twmb/franz-go/pkg/kgo.(*consumer).doOnMetadataUpdate.func1()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/consumer.go:1208 +0x173
  github.com/twmb/franz-go/pkg/kgo.(*consumer).doOnMetadataUpdate.func2()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/consumer.go:1220 +0x3a

Previous read at 0x00c00091f168 by goroutine 44587:
  github.com/twmb/franz-go/pkg/kgo.(*source).handleReqResp.func2()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:1229 +0x124
  github.com/twmb/franz-go/pkg/kgo.(*source).handleReqResp()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:1260 +0x1501
  github.com/twmb/franz-go/pkg/kgo.(*source).fetch.func5()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:934 +0x12c

Goroutine 44625 (running) created at:
  github.com/twmb/franz-go/pkg/kgo.(*consumer).doOnMetadataUpdate()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/consumer.go:1217 +0x187
  github.com/twmb/franz-go/pkg/kgo.(*Client).updateMetadataLoop()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/metadata.go:269 +0x10b4
  github.com/twmb/franz-go/pkg/kgo.NewClient.gowrap1()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/client.go:516 +0x33

Goroutine 44587 (running) created at:
  github.com/twmb/franz-go/pkg/kgo.(*source).fetch()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:932 +0xc04
  github.com/twmb/franz-go/pkg/kgo.(*source).loopFetch()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:787 +0x4c6
  github.com/twmb/franz-go/pkg/kgo.(*source).maybeConsume.gowrap1()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:717 +0x33
==================
--- FAIL: TestPartitionReader_ConsumeAtStartup (0.00s)
    --- FAIL: TestPartitionReader_ConsumeAtStartup/should_consume_partition_from_last_committed_offset_if_position=last-offset,_and_wait_until_max_lag_is_honored_if_can't_honor_target_lag (6.25s)
        logging.go:33: level info component kafka_client msg immediate metadata update triggered why forced load because we are producing to a topic for the first time
        logging.go:33: level info component kafka_client msg producing to a new topic for the first time, fetching metadata to learn its partitions topic test
        logging.go:33: level info component kafka_client msg done waiting for metadata for new topic topic test
        logging.go:33: level info component kafka_client msg initializing producer id
        logging.go:33: level info component kafka_client msg producer id initialization success id 8055687386281055342 epoch 0
        reader_test.go:1243: produced 1 record
        reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
        reader_test.go:1258: produced 1 record
        reader_test.go:1258: produced 1 record
        reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
        reader_test.go:1258: produced 1 record
        reader_test.go:1258: produced 1 record
        reader_test.go:1258: produced 1 record
        reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
        reader_test.go:1258: produced 1 record
        reader_test.go:1258: produced 1 record
        reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
        reader_test.go:1258: produced 1 record
        reader_test.go:1258: produced 1 record
        reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
        reader_test.go:1258: produced 1 record
        reader_test.go:1258: produced 1 record
        reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
        reader_test.go:1258: produced 1 record
        reader_test.go:1258: produced 1 record
        reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
        reader_test.go:1258: produced 1 record
        reader_test.go:1258: produced 1 record
        reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
        reader_test.go:1258: produced 1 record
        reader_test.go:1258: produced 1 record
        reader_test.go:1258: produced 1 record
        reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
        reader_test.go:1258: produced 1 record
        reader_test.go:1258: produced 1 record
        reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
        reader_test.go:1258: produced 1 record
        reader_test.go:1258: produced 1 record
        reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
        reader_test.go:1258: produced 1 record
        reader_test.go:1258: produced 1 record
        reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
        testing.go:1399: race detected during execution of test
    --- FAIL: TestPartitionReader_ConsumeAtStartup/should_not_wait_indefinitely_if_context_is_cancelled_while_fetching_last_produced_offset (9.40s)
        logger.go:38: 2024-10-04 22:49:51.676898088 +0000 UTC m=+34.847599356 level info partition 1 msg starting consumption from partition start because no offset has been found start_offset -2 consumer_group test-group
        logger.go:38: 2024-10-04 22:49:51.678366846 +0000 UTC m=+34.849068104 level info partition 1 component kafka_client msg immediate metadata update triggered why querying metadata for consumer initialization
        logger.go:38: 2024-10-04 22:49:51.67915597 +0000 UTC m=+34.849857238 level info partition 1 component kafka_client msg assigning partitions why new assignments from direct consumer how assigning everything new, keeping current assignment input test[1{-2 e-1 ce0}]
        logger.go:38: 2024-10-04 22:49:51.679718225 +0000 UTC m=+34.850419483 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
        logger.go:38: 2024-10-04 22:49:51.911434318 +0000 UTC m=+35.082135576 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
        logger.go:38: 2024-10-04 22:49:51.951836444 +0000 UTC m=+35.122537712 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
        logger.go:38: 2024-10-04 22:49:52.377158206 +0000 UTC m=+35.547859474 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
        logger.go:38: 2024-10-04 22:49:52.510548242 +0000 UTC m=+35.681249510 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
        logger.go:38: 2024-10-04 22:49:52.68063619 +0000 UTC m=+35.851337458 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
        logger.go:38: 2024-10-04 22:49:53.486198117 +0000 UTC m=+36.656899375 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
        logger.go:38: 2024-10-04 22:49:53.555875287 +0000 UTC m=+36.726576555 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
        logger.go:38: 2024-10-04 22:49:53.681898234 +0000 UTC m=+36.852599492 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
        logger.go:38: 2024-10-04 22:49:54.68321097 +0000 UTC m=+37.853912238 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
        logger.go:38: 2024-10-04 22:49:55.596257648 +0000 UTC m=+38.766958917 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
        logger.go:38: 2024-10-04 22:49:55.684383801 +0000 UTC m=+38.855085070 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
        logger.go:38: 2024-10-04 22:49:55.959703512 +0000 UTC m=+39.130404781 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
        logger.go:38: 2024-10-04 22:49:56.685775563 +0000 UTC m=+39.856476832 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
        logger.go:38: 2024-10-04 22:49:57.687191198 +0000 UTC m=+40.857892467 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
        logger.go:38: 2024-10-04 22:49:58.097950501 +0000 UTC m=+41.268651769 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
        logger.go:38: 2024-10-04 22:49:58.461378582 +0000 UTC m=+41.632079850 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
        logger.go:38: 2024-10-04 22:49:58.690084891 +0000 UTC m=+41.860786159 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
        logger.go:38: 2024-10-04 22:49:59.691514703 +0000 UTC m=+42.862215971 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
        logger.go:38: 2024-10-04 22:50:00.599023338 +0000 UTC m=+43.769724607 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
        logger.go:38: 2024-10-04 22:50:00.60165153 +0000 UTC m=+43.772352798 level warn partition 1 msg failed to fetch the last produced offset err broker closed the connection immediately after a request was issued, which happens when SASL is required but not provided: is SASL missing?
        logger.go:38: 2024-10-04 22:50:00.693131098 +0000 UTC m=+43.863832366 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
        logger.go:38: 2024-10-04 22:50:00.962362893 +0000 UTC m=+44.133064161 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:35227 broker 0 err EOF
        logger.go:38: 2024-10-04 22:50:00.962595951 +0000 UTC m=+44.133297229 level warn partition 1 msg failed to fetch the last produced offset err broker closed the connection immediately after a request was issued, which happens when SASL is required but not provided: is SASL missing?
        testing.go:1399: race detected during execution of test
    --- FAIL: TestPartitionReader_ConsumeAtStartup/should_not_wait_indefinitely_if_there_are_no_records_to_consume_from_Kafka_but_partition_start_offset_is_>_0_(e.g._all_previous_records_have_been_deleted_by_Kafka_retention) (0.00s)
        logger.go:38: 2024-10-04 22:49:56.198617457 +0000 UTC m=+39.369318725 partition 1 fetcher 1 caller log.go:168 level debug msg fetched records duration 6.633659ms start_offset 4 end_offset 6 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp  last_timestamp  hwm 0 lso 0 err <nil>
        logger.go:38: 2024-10-04 22:49:56.205584681 +0000 UTC m=+39.376285939 partition 1 fetcher 1 caller log.go:168 level debug msg merged fetch result with the next result
        --- FAIL: TestPartitionReader_ConsumeAtStartup/should_not_wait_indefinitely_if_there_are_no_records_to_consume_from_Kafka_but_partition_start_offset_is_>_0_(e.g._all_previous_records_have_been_deleted_by_Kafka_retention)/consume_from_position:_timestamp (1.03s)
            logging.go:33: level info component kafka_client msg immediate metadata update triggered why forced load because we are producing to a topic for the first time
            logging.go:33: level info component kafka_client msg producing to a new topic for the first time, fetching metadata to learn its partitions topic test
            logging.go:33: level info component kafka_client msg initializing producer id
            logging.go:33: level info component kafka_client msg done waiting for metadata for new topic topic test
            logging.go:33: level info component kafka_client msg producer id initialization success id 7329056796108552702 epoch 0
            reader_test.go:1398: produced 2 records
            reader_test.go:1407: fetched partition end offset: 2
            reader_test.go:1415: advanced partition start offset to: 2
            logger.go:38: 2024-10-04 22:49:56.210704378 +0000 UTC m=+39.381405646 level info partition 1 msg starting consumption from timestamp timestamp 1728082196204 last_consumed_offset 1 start_offset 2 consumer_group test-group
            logger.go:38: 2024-10-04 22:49:56.212379091 +0000 UTC m=+39.383080359 level info partition 1 component kafka_client msg immediate metadata update triggered why querying metadata for consumer initialization
            logger.go:38: 2024-10-04 22:49:56.213781242 +0000 UTC m=+39.384482510 level info partition 1 component kafka_client msg assigning partitions why new assignments from direct consumer how assigning everything new, keeping current assignment input test[1{2 e-1 ce0}]
            logger.go:38: 2024-10-04 22:49:57.217491014 +0000 UTC m=+40.388192292 level info partition 1 target_lag 1s max_lag 1s msg partition reader is starting to consume partition until target and max consumer lag is honored
            logger.go:38: 2024-10-04 22:49:57.21771803 +0000 UTC m=+40.388419298 level info partition 1 msg starting concurrent fetchers start_offset 2 concurrency 2 recordsPerFetch 2
            logger.go:38: 2024-10-04 22:49:57.220324061 +0000 UTC m=+40.391025329 level info partition 1 target_lag 1s max_lag 1s msg partition reader found no records to consume because partition is empty partition_start_offset 2 last_produced_offset 1
            logger.go:38: 2024-10-04 22:49:57.22038782 +0000 UTC m=+40.391089088 level info partition 1 target_lag 1s max_lag 1s msg partition reader consumed partition and current lag is lower than configured target consumer lag last_consumed_offset 1 current_lag 0s
            logger.go:38: 2024-10-04 22:49:57.220499199 +0000 UTC m=+40.391200457 level info partition 1 component kafka_client msg assigning partitions why remove of map[test:[1]] requested how unassigning any currently assigned matching partition that is in the input input test[1{0 e0 ce0}]
            logger.go:38: 2024-10-04 22:49:57.222397652 +0000 UTC m=+40.393098940 level info partition 1 component kafka_client msg immediate metadata update triggered why from AddConsumePartitions
            logger.go:38: 2024-10-04 22:49:57.223239291 +0000 UTC m=+40.393940559 level info partition 1 component kafka_client msg assigning partitions why new assignments from direct consumer how assigning everything new, keeping current assignment input test[1{2 e-1 ce0}]
            logger.go:38: 2024-10-04 22:49:57.215802466 +0000 UTC m=+40.386503734 level info partition 1 component kafka_client msg received OFFSET_OUT_OF_RANGE on the first fetch, resetting to the configured ConsumeResetOffset broker 0 topic test partition 1 prior_offset 2
            logger.go:38: 2024-10-04 22:49:57.228099361 +0000 UTC m=+40.398800630 partition 1 fetcher 1 caller log.go:168 level debug msg fetched records duration 10.127435ms start_offset 2 end_offset 4 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp  last_timestamp  hwm 0 lso 0 err <nil>
            logger.go:38: 2024-10-04 22:49:57.22821041 +0000 UTC m=+40.398911678 level info partition 1 msg stopping partition reader
            logger.go:38: 2024-10-04 22:49:57.228342077 +0000 UTC m=+40.399043345 partition 1 fetcher 0 caller log.go:168 level debug msg fetched records duration 9.77317ms start_offset 4 end_offset 6 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp  last_timestamp  hwm 0 lso 0 err <nil>
            testing.go:1399: race detected during execution of test
        --- FAIL: TestPartitionReader_ConsumeAtStartup/should_not_wait_indefinitely_if_there_are_no_records_to_consume_from_Kafka_but_partition_start_offset_is_>_0_(e.g._all_previous_records_have_been_deleted_by_Kafka_retention)/consume_from_position:_end (5.03s)
            logging.go:33: level info component kafka_client msg producing to a new topic for the first time, fetching metadata to learn its partitions topic test
            logging.go:33: level info component kafka_client msg immediate metadata update triggered why forced load because we are producing to a topic for the first time
            logging.go:33: level info component kafka_client msg done waiting for metadata for new topic topic test
            logging.go:33: level info component kafka_client msg initializing producer id
            logging.go:33: level info component kafka_client msg producer id initialization success id 1047195296243455181 epoch 0
            reader_test.go:1398: produced 2 records
            reader_test.go:1407: fetched partition end offset: 2
            reader_test.go:1415: advanced partition start offset to: 2
            logger.go:38: 2024-10-04 22:49:54.458762669 +0000 UTC m=+37.629463927 level info partition 1 msg starting consumption from timestamp timestamp 1728082194452 last_consumed_offset 1 start_offset 2 consumer_group test-group
            logger.go:38: 2024-10-04 22:49:54.46041991 +0000 UTC m=+37.631121168 level info partition 1 component kafka_client msg immediate metadata update triggered why querying metadata for consumer initialization
            logger.go:38: 2024-10-04 22:49:54.462035522 +0000 UTC m=+37.632736791 level info partition 1 component kafka_client msg assigning partitions why new assignments from direct consumer how assigning everything new, keeping current assignment input test[1{2 e-1 ce0}]
            logger.go:38: 2024-10-04 22:49:55.464401677 +0000 UTC m=+38.635102945 level info partition 1 component kafka_client msg received OFFSET_OUT_OF_RANGE on the first fetch, resetting to the configured ConsumeResetOffset broker 0 topic test partition 1 prior_offset 2
            logger.go:38: 2024-10-04 22:49:55.464589199 +0000 UTC m=+38.635290467 level info partition 1 component kafka_client msg immediate metadata update triggered why fetch had inner topic errors from broker 0: OFFSET_OUT_OF_RANGE{test[1]}
            logger.go:38: 2024-10-04 22:49:55.465498716 +0000 UTC m=+38.636199994 level info partition 1 msg starting concurrent fetchers start_offset 2 concurrency 2 recordsPerFetch 2
            logger.go:38: 2024-10-04 22:49:55.464902534 +0000 UTC m=+38.635603802 level info partition 1 target_lag 1s max_lag 1s msg partition reader is starting to consume partition until target and max consumer lag is honored
            logger.go:38: 2024-10-04 22:49:56.46672293 +0000 UTC m=+39.637424198 level info partition 1 component kafka_client msg received OFFSET_OUT_OF_RANGE on the first fetch, resetting to the configured ConsumeResetOffset broker 0 topic test partition 1 prior_offset 2
            logger.go:38: 2024-10-04 22:49:56.466899722 +0000 UTC m=+39.637601000 level info partition 1 component kafka_client msg immediate metadata update triggered why fetch had inner topic errors from broker 0: OFFSET_OUT_OF_RANGE{test[1]}
            logger.go:38: 2024-10-04 22:49:57.46782549 +0000 UTC m=+40.638526758 partition 1 fetcher 0 caller log.go:168 level debug msg received an error while fetching records; will retry after processing received records (if any) duration 2.000284732s start_offset 2 end_offset 4 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp  last_timestamp  hwm 2 lso 2 err OFFSET_OUT_OF_RANGE: The requested offset is not within the range of offsets maintained by the server.
            logger.go:38: 2024-10-04 22:49:57.467963869 +0000 UTC m=+40.638665137 partition 1 fetcher 0 method concurrentFetcher.fetch.attempt caller log.go:168 level debug log_start_offset 2 start_offset 2 end_offset 4 msg offset out of range; waiting for new records to be produced
            logger.go:38: 2024-10-04 22:49:58.468389136 +0000 UTC m=+41.639090404 partition 1 fetcher 1 caller log.go:168 level debug msg received an error while fetching records; will retry after processing received records (if any) duration 3.000746787s start_offset 4 end_offset 6 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp  last_timestamp  hwm 2 lso 2 err OFFSET_OUT_OF_RANGE: The requested offset is not within the range of offsets maintained by the server.
            logger.go:38: 2024-10-04 22:49:58.468541993 +0000 UTC m=+41.639243281 partition 1 fetcher 1 method concurrentFetcher.fetch.attempt caller log.go:168 level debug log_start_offset 2 start_offset 4 end_offset 6 msg offset out of range; waiting for new records to be produced
            logger.go:38: 2024-10-04 22:49:59.468667669 +0000 UTC m=+42.639368927 partition 1 fetcher 0 caller log.go:168 level debug msg received an error while fetching records; will retry after processing received records (if any) duration 2.000607529s start_offset 2 end_offset 4 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp  last_timestamp  hwm 2 lso 2 err OFFSET_OUT_OF_RANGE: The requested offset is not within the range of offsets maintained by the server.
            logger.go:38: 2024-10-04 22:49:59.468797102 +0000 UTC m=+42.639498370 partition 1 fetcher 0 caller log.go:168 level debug msg merged fetch result with the next result
            logger.go:38: 2024-10-04 22:49:59.468870121 +0000 UTC m=+42.639571389 level info partition 1 target_lag 1s max_lag 1s msg partition reader found no records to consume because partition is empty partition_start_offset 2 last_produced_offset 1
            logger.go:38: 2024-10-04 22:49:59.468940944 +0000 UTC m=+42.639642202 level info partition 1 target_lag 1s max_lag 1s msg partition reader consumed partition and current lag is lower than configured target consumer lag last_consumed_offset 1 current_lag 0s
            logger.go:38: 2024-10-04 22:49:59.468868476 +0000 UTC m=+42.639569744 partition 1 fetcher 0 method concurrentFetcher.fetch.attempt caller log.go:168 level debug log_start_offset 2 start_offset 2 end_offset 4 msg offset out of range; waiting for new records to be produced
            logger.go:38: 2024-10-04 22:49:59.469023797 +0000 UTC m=+42.639725055 level info partition 1 component kafka_client msg assigning partitions why remove of map[test:[1]] requested how unassigning any currently assigned matching partition that is in the input input test[1{0 e0 ce0}]
            logger.go:38: 2024-10-04 22:49:59.46941875 +0000 UTC m=+42.640120018 level info partition 1 component kafka_client msg immediate metadata update triggered why from AddConsumePartitions
            logger.go:38: 2024-10-04 22:49:59.473710903 +0000 UTC m=+42.644412171 partition 1 fetcher 0 caller log.go:168 level debug msg fetched records duration 4.652679ms start_offset 2 end_offset 4 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp  last_timestamp  hwm 0 lso 0 err <nil>
            logger.go:38: 2024-10-04 22:49:59.473716154 +0000 UTC m=+42.644417412 partition 1 fetcher 1 caller log.go:168 level debug msg fetched records duration 1.005101365s start_offset 4 end_offset 6 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp  last_timestamp  hwm 0 lso 0 err <nil>
            logger.go:38: 2024-10-04 22:49:59.473892556 +0000 UTC m=+42.644593824 partition 1 fetcher 1 caller log.go:168 level debug msg merged fetch result with the next result
            logger.go:38: 2024-10-04 22:49:59.473809357 +0000 UTC m=+42.644510626 partition 1 fetcher 0 caller log.go:168 level debug msg merged fetch result with the next result
            logger.go:38: 2024-10-04 22:49:59.476605896 +0000 UTC m=+42.647307163 level info partition 1 msg stopping partition reader
            logger.go:38: 2024-10-04 22:49:59.476730199 +0000 UTC m=+42.647431467 level warn partition 1 msg failed to fetch the last produced offset err the internal broker struct chosen to issue this request has died--either the broker id is migrating or no longer exists
            testing.go:1399: race detected during execution of test
==================
WARNING: DATA RACE
Read at 0x00c000035563 by goroutine 45240:
  testing.(*common).logDepth()
      /usr/local/go/src/testing/testing.go:1024 +0x504
  testing.(*common).log()
      /usr/local/go/src/testing/testing.go:1011 +0x7d
  testing.(*common).Log()
      /usr/local/go/src/testing/testing.go:1052 +0x55
  testing.(*T).Log()
      <autogenerated>:1 +0x4f
  github.com/grafana/mimir/pkg/util/test.(*TestingLogger).Log()
      /__w/mimir/mimir/pkg/util/test/logger.go:38 +0x177
  github.com/go-kit/log.(*context).Log()
      /__w/mimir/mimir/vendor/github.com/go-kit/log/log.go:168 +0x4ba
  github.com/grafana/dskit/spanlogger.(*SpanLogger).Log()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/spanlogger/spanlogger.go:111 +0x54
  github.com/go-kit/log.(*context).Log()
      /__w/mimir/mimir/vendor/github.com/go-kit/log/log.go:168 +0x4ba
  github.com/grafana/mimir/pkg/storage/ingest.(*fetchResult).logCompletedFetch()
      /__w/mimir/mimir/pkg/storage/ingest/fetcher.go:156 +0x123c
  github.com/grafana/mimir/pkg/storage/ingest.(*concurrentFetchers).fetchSingle.func1()
      /__w/mimir/mimir/pkg/storage/ingest/fetcher.go:363 +0x94
  github.com/grafana/mimir/pkg/storage/ingest.(*concurrentFetchers).fetchSingle.deferwrap1()
      /__w/mimir/mimir/pkg/storage/ingest/fetcher.go:364 +0x61
  runtime.deferreturn()
      /usr/local/go/src/runtime/panic.go:605 +0x5d
  github.com/grafana/mimir/pkg/storage/ingest.(*concurrentFetchers).run()
      /__w/mimir/mimir/pkg/storage/ingest/fetcher.go:497 +0x684
  github.com/grafana/mimir/pkg/storage/ingest.(*concurrentFetchers).start.gowrap2()
      /__w/mimir/mimir/pkg/storage/ingest/fetcher.go:568 +0x79

Previous write at 0x00c000035563 by goroutine 35663:
  testing.tRunner.func1()
      /usr/local/go/src/testing/testing.go:1677 +0x8fa
  runtime.deferreturn()
      /usr/local/go/src/runtime/panic.go:605 +0x5d
  testing.(*T).Run.gowrap1()
      /usr/local/go/src/testing/testing.go:1743 +0x44

Goroutine 45240 (running) created at:
  github.com/grafana/mimir/pkg/storage/ingest.(*concurrentFetchers).start()
      /__w/mimir/mimir/pkg/storage/ingest/fetcher.go:568 +0x4f0
  github.com/grafana/mimir/pkg/storage/ingest.newConcurrentFetchers.gowrap1()
      /__w/mimir/mimir/pkg/storage/ingest/fetcher.go:293 +0x79

Goroutine 35663 (finished) created at:
  testing.(*T).Run()
      /usr/local/go/src/testing/testing.go:1743 +0x825
  github.com/grafana/mimir/pkg/storage/ingest.TestPartitionReader_ConsumeAtStartup()
      /__w/mimir/mimir/pkg/storage/ingest/reader_test.go:1367 +0x627
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1690 +0x226
  testing.(*T).Run.gowrap1()
      /usr/local/go/src/testing/testing.go:1743 +0x44
==================
FAIL
FAIL	github.com/grafana/mimir/pkg/storage/ingest	87.471s
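
The last data race in this run is different from the kgo cursor one: a concurrentFetchers goroutine is still calling t.Log (through pkg/util/test.TestingLogger) after the subtest has finished, which races with the testing framework's own teardown. A common mitigation for that pattern — purely illustrative here, not the existing TestingLogger implementation — is to turn the test-bound logger into a no-op once the test's cleanup has run:

package testutil

import (
	"sync"
	"testing"
)

// safeTestLogger drops log lines emitted after the owning test has completed,
// avoiding the data race on testing.T internals seen in the trace above.
type safeTestLogger struct {
	mu   sync.Mutex
	t    *testing.T
	done bool
}

// NewSafeTestLogger registers a cleanup hook that marks the logger as done
// when the test finishes.
func NewSafeTestLogger(t *testing.T) *safeTestLogger {
	l := &safeTestLogger{t: t}
	t.Cleanup(func() {
		l.mu.Lock()
		l.done = true
		l.mu.Unlock()
	})
	return l
}

// Log implements the go-kit log.Logger signature.
func (l *safeTestLogger) Log(keyvals ...interface{}) error {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.done {
		return nil // the test is over; logging now would race with the framework
	}
	l.t.Log(keyvals...)
	return nil
}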

zenador (Contributor) commented Oct 10, 2024

https://github.com/grafana/mimir/actions/runs/11267325901/job/31332271953?pr=9527

==================
WARNING: DATA RACE
Write at 0x00c0007927e0 by goroutine 46254:
  github.com/twmb/franz-go/pkg/kgo.(*cursor).setOffset()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:284 +0x7a8
  github.com/twmb/franz-go/pkg/kgo.(*cursor).unset()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:259 +0x765
  github.com/twmb/franz-go/pkg/kgo.(*consumer).assignPartitions()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/consumer.go:1025 +0x757
  github.com/twmb/franz-go/pkg/kgo.(*Client).RemoveConsumePartitions()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/consumer.go:874 +0x544
  github.com/grafana/mimir/pkg/storage/ingest.(*PartitionReader).switchToOngoingFetcher()
      /__w/mimir/mimir/pkg/storage/ingest/reader.go:280 +0x3a4
  github.com/grafana/mimir/pkg/storage/ingest.(*PartitionReader).run()
      /__w/mimir/mimir/pkg/storage/ingest/reader.go:236 +0x3e
  github.com/grafana/mimir/pkg/storage/ingest.(*PartitionReader).run-fm()
      <autogenerated>:1 +0x47
  github.com/grafana/dskit/services.(*BasicService).main()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:193 +0x3b7
  github.com/grafana/dskit/services.(*BasicService).StartAsync.func1.gowrap1()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:122 +0x33

Previous read at 0x00c0007927e0 by goroutine 47610:
  github.com/twmb/franz-go/pkg/kgo.(*source).handleReqResp.func2()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:1229 +0x124
  github.com/twmb/franz-go/pkg/kgo.(*source).handleReqResp()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:1260 +0x1501
  github.com/twmb/franz-go/pkg/kgo.(*source).fetch.func5()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:934 +0x12c

Goroutine 46254 (running) created at:
  github.com/grafana/dskit/services.(*BasicService).StartAsync.func1()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:122 +0x1dc
  github.com/grafana/dskit/services.(*BasicService).switchState()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:142 +0x115
  github.com/grafana/dskit/services.(*BasicService).StartAsync()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/services/basic_service.go:119 +0xb1
  github.com/grafana/mimir/pkg/storage/ingest.(*PartitionReader).StartAsync()
      <autogenerated>:1 +0x63
  github.com/grafana/dskit/services.StartAndAwaitRunning()
      /__w/mimir/mimir/vendor/github.com/grafana/dskit/services/services.go:104 +0x49
  github.com/grafana/mimir/pkg/storage/ingest.TestPartitionReader_ConsumeAtStartup.func13.1()
      /__w/mimir/mimir/pkg/storage/ingest/reader_test.go:1425 +0x1018
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1690 +0x226
  testing.(*T).Run.gowrap1()
      /usr/local/go/src/testing/testing.go:1743 +0x44

Goroutine 47610 (finished) created at:
  github.com/twmb/franz-go/pkg/kgo.(*source).fetch()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:932 +0xc04
  github.com/twmb/franz-go/pkg/kgo.(*source).loopFetch()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:787 +0x4c6
  github.com/twmb/franz-go/pkg/kgo.(*source).maybeConsume.gowrap1()
      /__w/mimir/mimir/vendor/github.com/twmb/franz-go/pkg/kgo/source.go:717 +0x33
==================
--- FAIL: TestPartitionReader_ConsumeAtStartup (0.00s)
    --- FAIL: TestPartitionReader_ConsumeAtStartup/should_consume_partition_from_last_committed_offset_if_position=last-offset,_and_wait_until_max_lag_is_honored_if_can't_honor_target_lag (6.10s)
        logging.go:33: level info component kafka_client msg immediate metadata update triggered why forced load because we are producing to a topic for the first time
        logging.go:33: level info component kafka_client msg producing to a new topic for the first time, fetching metadata to learn its partitions topic test
        logging.go:33: level info component kafka_client msg initializing producer id
        logging.go:33: level info component kafka_client msg done waiting for metadata for new topic topic test
        logging.go:33: level info component kafka_client msg producer id initialization success id 4006748771009727914 epoch 0
        reader_test.go:1243: produced 1 record
        reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
        reader_test.go:1258: produced 1 record
        reader_test.go:1258: produced 1 record
        reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
        reader_test.go:1258: produced 1 record
        reader_test.go:1258: produced 1 record
        reader_test.go:1258: produced 1 record
        reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
        reader_test.go:1258: produced 1 record
        reader_test.go:1258: produced 1 record
        reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
        reader_test.go:1258: produced 1 record
        reader_test.go:1258: produced 1 record
        reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
        reader_test.go:1258: produced 1 record
        reader_test.go:1258: produced 1 record
        reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
        reader_test.go:1258: produced 1 record
        reader_test.go:1258: produced 1 record
        reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
        reader_test.go:1258: produced 1 record
        reader_test.go:1258: produced 1 record
        reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
        reader_test.go:1258: produced 1 record
        reader_test.go:1258: produced 1 record
        reader_test.go:1258: produced 1 record
        reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
        reader_test.go:1258: produced 1 record
        reader_test.go:1258: produced 1 record
        reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
        reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
        reader_test.go:1258: produced 1 record
        reader_test.go:1258: produced 1 record
        reader_test.go:1258: produced 1 record
        reader_test.go:1229: artificially slowing down OffsetFetch request by 550ms
        reader_test.go:1258: produced 1 record
        testing.go:1399: race detected during execution of test
    --- FAIL: TestPartitionReader_ConsumeAtStartup/should_not_wait_indefinitely_if_context_is_cancelled_while_fetching_last_produced_offset (8.93s)
        logger.go:38: 2024-10-10 04:16:53.606977776 +0000 UTC m=+34.053737144 level info partition 1 msg starting consumption from partition start because no offset has been found start_offset -2 consumer_group test-group
        logger.go:38: 2024-10-10 04:16:53.60852153 +0000 UTC m=+34.055280888 level info partition 1 component kafka_client msg immediate metadata update triggered why querying metadata for consumer initialization
        logger.go:38: 2024-10-10 04:16:53.60929543 +0000 UTC m=+34.056054798 level info partition 1 component kafka_client msg assigning partitions why new assignments from direct consumer how assigning everything new, keeping current assignment input test[1{-2 e-1 ce0}]
        logger.go:38: 2024-10-10 04:16:53.609802089 +0000 UTC m=+34.056561467 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err read tcp 127.0.0.1:58428->127.0.0.1:38907: read: connection reset by peer
        logger.go:38: 2024-10-10 04:16:53.855954002 +0000 UTC m=+34.302713360 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
        logger.go:38: 2024-10-10 04:16:53.886681183 +0000 UTC m=+34.333440551 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
        logger.go:38: 2024-10-10 04:16:54.292218019 +0000 UTC m=+34.738977377 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
        logger.go:38: 2024-10-10 04:16:54.421562839 +0000 UTC m=+34.868322227 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
        logger.go:38: 2024-10-10 04:16:54.610747307 +0000 UTC m=+35.057506675 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
        logger.go:38: 2024-10-10 04:16:55.181056708 +0000 UTC m=+35.627816076 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
        logger.go:38: 2024-10-10 04:16:55.36460113 +0000 UTC m=+35.811360488 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
        logger.go:38: 2024-10-10 04:16:55.611916391 +0000 UTC m=+36.058675759 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
        logger.go:38: 2024-10-10 04:16:56.613682244 +0000 UTC m=+37.060441612 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
        logger.go:38: 2024-10-10 04:16:57.094000418 +0000 UTC m=+37.540759776 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
        logger.go:38: 2024-10-10 04:16:57.396594373 +0000 UTC m=+37.843353741 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
        logger.go:38: 2024-10-10 04:16:57.615222887 +0000 UTC m=+38.061982255 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
        logger.go:38: 2024-10-10 04:16:58.616429473 +0000 UTC m=+39.063188831 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
        logger.go:38: 2024-10-10 04:16:59.595831082 +0000 UTC m=+40.042590450 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
        logger.go:38: 2024-10-10 04:16:59.61804281 +0000 UTC m=+40.064802189 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
        logger.go:38: 2024-10-10 04:16:59.897691713 +0000 UTC m=+40.344451071 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
        logger.go:38: 2024-10-10 04:17:00.619479618 +0000 UTC m=+41.066238986 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
        logger.go:38: 2024-10-10 04:17:01.621108936 +0000 UTC m=+42.067868323 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
        logger.go:38: 2024-10-10 04:17:02.097145895 +0000 UTC m=+42.543905253 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
        logger.go:38: 2024-10-10 04:17:02.097325772 +0000 UTC m=+42.544085140 level warn partition 1 msg failed to fetch the last produced offset err broker closed the connection immediately after a request was issued, which happens when SASL is required but not provided: is SASL missing?
        logger.go:38: 2024-10-10 04:17:02.399074166 +0000 UTC m=+42.845833524 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:38907 broker 0 err EOF
        logger.go:38: 2024-10-10 04:17:02.399246449 +0000 UTC m=+42.846005827 level warn partition 1 msg failed to fetch the last produced offset err broker closed the connection immediately after a request was issued, which happens when SASL is required but not provided: is SASL missing?
        testing.go:1399: race detected during execution of test
    --- FAIL: TestPartitionReader_ConsumeAtStartup/should_not_wait_indefinitely_if_there_are_no_records_to_consume_from_Kafka_but_partition_start_offset_is_>_0_(e.g._all_previous_records_have_been_deleted_by_Kafka_retention) (0.00s)
        --- FAIL: TestPartitionReader_ConsumeAtStartup/should_not_wait_indefinitely_if_there_are_no_records_to_consume_from_Kafka_but_partition_start_offset_is_>_0_(e.g._all_previous_records_have_been_deleted_by_Kafka_retention)/consume_from_position:_timestamp (4.02s)
            logging.go:33: level info component kafka_client msg producing to a new topic for the first time, fetching metadata to learn its partitions topic test
            logging.go:33: level info component kafka_client msg immediate metadata update triggered why forced load because we are producing to a topic for the first time
            logging.go:33: level info component kafka_client msg done waiting for metadata for new topic topic test
            logging.go:33: level info component kafka_client msg initializing producer id
            logging.go:33: level info component kafka_client msg producer id initialization success id 3033693887431518137 epoch 0
            reader_test.go:1398: produced 2 records
            reader_test.go:1407: fetched partition end offset: 2
            reader_test.go:1415: advanced partition start offset to: 2
            logger.go:38: 2024-10-10 04:16:55.275439755 +0000 UTC m=+35.722199124 level info partition 1 msg starting consumption from timestamp timestamp 1728533815272 last_consumed_offset 1 start_offset 2 consumer_group test-group
            logger.go:38: 2024-10-10 04:16:55.279723549 +0000 UTC m=+35.726482917 level info partition 1 component kafka_client msg immediate metadata update triggered why querying metadata for consumer initialization
            logger.go:38: 2024-10-10 04:16:55.280457384 +0000 UTC m=+35.727216752 level info partition 1 component kafka_client msg assigning partitions why new assignments from direct consumer how assigning everything new, keeping current assignment input test[1{2 e-1 ce0}]
            logger.go:38: 2024-10-10 04:16:56.282164427 +0000 UTC m=+36.728923795 level info partition 1 component kafka_client msg received OFFSET_OUT_OF_RANGE on the first fetch, resetting to the configured ConsumeResetOffset broker 0 topic test partition 1 prior_offset 2
            logger.go:38: 2024-10-10 04:16:56.2823 +0000 UTC m=+36.729059369 level info partition 1 component kafka_client msg immediate metadata update triggered why fetch had inner topic errors from broker 0: OFFSET_OUT_OF_RANGE{test[1]}
            logger.go:38: 2024-10-10 04:16:56.283082546 +0000 UTC m=+36.729841904 level info partition 1 target_lag 1s max_lag 1s msg partition reader is starting to consume partition until target and max consumer lag is honored
            logger.go:38: 2024-10-10 04:16:56.28320188 +0000 UTC m=+36.729961278 level info partition 1 msg starting concurrent fetchers start_offset 2 concurrency 2 recordsPerFetch 2
            logger.go:38: 2024-10-10 04:16:57.283788296 +0000 UTC m=+37.730547664 partition 1 fetcher 0 caller log.go:168 level debug msg received an error while fetching records; will retry after processing received records (if any) duration 1.000489484s start_offset 2 end_offset 4 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp  last_timestamp  hwm 2 lso 2 err OFFSET_OUT_OF_RANGE: The requested offset is not within the range of offsets maintained by the server.
            logger.go:38: 2024-10-10 04:16:57.283872724 +0000 UTC m=+37.730632092 partition 1 fetcher 0 method concurrentFetcher.fetch.attempt caller log.go:168 level debug log_start_offset 2 start_offset 2 end_offset 4 msg offset out of range; waiting for new records to be produced
            logger.go:38: 2024-10-10 04:16:58.284821546 +0000 UTC m=+38.731580924 partition 1 fetcher 1 caller log.go:168 level debug msg received an error while fetching records; will retry after processing received records (if any) duration 2.001451842s start_offset 4 end_offset 6 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp  last_timestamp  hwm 2 lso 2 err OFFSET_OUT_OF_RANGE: The requested offset is not within the range of offsets maintained by the server.
            logger.go:38: 2024-10-10 04:16:58.284972299 +0000 UTC m=+38.731731667 partition 1 fetcher 1 method concurrentFetcher.fetch.attempt caller log.go:168 level debug log_start_offset 2 start_offset 4 end_offset 6 msg offset out of range; waiting for new records to be produced
            logger.go:38: 2024-10-10 04:16:59.285415824 +0000 UTC m=+39.732175182 level info partition 1 target_lag 1s max_lag 1s msg partition reader found no records to consume because partition is empty partition_start_offset 2 last_produced_offset 1
            logger.go:38: 2024-10-10 04:16:59.28550985 +0000 UTC m=+39.732269208 level info partition 1 target_lag 1s max_lag 1s msg partition reader consumed partition and current lag is lower than configured target consumer lag last_consumed_offset 1 current_lag 0s
            logger.go:38: 2024-10-10 04:16:59.285598536 +0000 UTC m=+39.732357914 level info partition 1 component kafka_client msg assigning partitions why remove of map[test:[1]] requested how unassigning any currently assigned matching partition that is in the input input test[1{0 e0 ce0}]
            logger.go:38: 2024-10-10 04:16:59.28542915 +0000 UTC m=+39.732188518 level info partition 1 component kafka_client msg received OFFSET_OUT_OF_RANGE on the first fetch, resetting to the configured ConsumeResetOffset broker 0 topic test partition 1 prior_offset 2
            logger.go:38: 2024-10-10 04:16:59.287273636 +0000 UTC m=+39.734033003 level info partition 1 component kafka_client msg immediate metadata update triggered why from AddConsumePartitions
            logger.go:38: 2024-10-10 04:16:59.288731507 +0000 UTC m=+39.735490885 partition 1 fetcher 1 caller log.go:168 level debug msg fetched records duration 1.003667245s start_offset 4 end_offset 6 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp  last_timestamp  hwm 0 lso 0 err <nil>
            logger.go:38: 2024-10-10 04:16:59.288827636 +0000 UTC m=+39.735587005 partition 1 fetcher 1 caller log.go:168 level debug msg merged fetch result with the next result
            logger.go:38: 2024-10-10 04:16:59.28894696 +0000 UTC m=+39.735706318 level info partition 1 msg stopping partition reader
            logger.go:38: 2024-10-10 04:16:59.289101199 +0000 UTC m=+39.735860567 partition 1 fetcher 0 caller log.go:168 level debug msg fetched records duration 2.005147243s start_offset 2 end_offset 4 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp  last_timestamp  hwm 0 lso 0 err <nil>
            logger.go:38: 2024-10-10 04:16:59.289181058 +0000 UTC m=+39.735940427 partition 1 fetcher 0 caller log.go:168 level debug msg merged fetch result with the next result
            logger.go:38: 2024-10-10 04:16:59.289522999 +0000 UTC m=+39.736282397 level warn partition 1 msg failed to fetch the last produced offset err context canceled
            testing.go:1399: race detected during execution of test
        --- FAIL: TestPartitionReader_ConsumeAtStartup/should_not_wait_indefinitely_if_there_are_no_records_to_consume_from_Kafka_but_partition_start_offset_is_>_0_(e.g._all_previous_records_have_been_deleted_by_Kafka_retention)/consume_from_position:_end (4.02s)
            logging.go:33: level info component kafka_client msg immediate metadata update triggered why forced load because we are producing to a topic for the first time
            logging.go:33: level info component kafka_client msg producing to a new topic for the first time, fetching metadata to learn its partitions topic test
            logging.go:33: level info component kafka_client msg done waiting for metadata for new topic topic test
            logging.go:33: level info component kafka_client msg initializing producer id
            logging.go:33: level info component kafka_client msg producer id initialization success id 8262787052353573940 epoch 0
            reader_test.go:1398: produced 2 records
            reader_test.go:1407: fetched partition end offset: 2
            reader_test.go:1415: advanced partition start offset to: 2
            logger.go:38: 2024-10-10 04:16:57.297807347 +0000 UTC m=+37.744566716 level info partition 1 msg starting consumption from timestamp timestamp 1728533817294 last_consumed_offset 1 start_offset 2 consumer_group test-group
            logger.go:38: 2024-10-10 04:16:57.29944261 +0000 UTC m=+37.746201979 level info partition 1 component kafka_client msg immediate metadata update triggered why querying metadata for consumer initialization
            logger.go:38: 2024-10-10 04:16:57.30111883 +0000 UTC m=+37.747878209 level info partition 1 component kafka_client msg assigning partitions why new assignments from direct consumer how assigning everything new, keeping current assignment input test[1{2 e-1 ce0}]
            logger.go:38: 2024-10-10 04:16:58.303079619 +0000 UTC m=+38.749838987 level info partition 1 component kafka_client msg received OFFSET_OUT_OF_RANGE on the first fetch, resetting to the configured ConsumeResetOffset broker 0 topic test partition 1 prior_offset 2
            logger.go:38: 2024-10-10 04:16:58.303231764 +0000 UTC m=+38.749991132 level info partition 1 component kafka_client msg immediate metadata update triggered why fetch had inner topic errors from broker 0: OFFSET_OUT_OF_RANGE{test[1]}
            logger.go:38: 2024-10-10 04:16:58.304303873 +0000 UTC m=+38.751063231 level info partition 1 target_lag 1s max_lag 1s msg partition reader is starting to consume partition until target and max consumer lag is honored
            logger.go:38: 2024-10-10 04:16:58.304415171 +0000 UTC m=+38.751174539 level info partition 1 msg starting concurrent fetchers start_offset 2 concurrency 2 recordsPerFetch 2
            logger.go:38: 2024-10-10 04:16:59.305628259 +0000 UTC m=+39.752387617 partition 1 fetcher 0 caller log.go:168 level debug msg received an error while fetching records; will retry after processing received records (if any) duration 1.000967788s start_offset 4 end_offset 6 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp  last_timestamp  hwm 2 lso 2 err OFFSET_OUT_OF_RANGE: The requested offset is not within the range of offsets maintained by the server.
            logger.go:38: 2024-10-10 04:16:59.305724449 +0000 UTC m=+39.752483817 partition 1 fetcher 0 method concurrentFetcher.fetch.attempt caller log.go:168 level debug log_start_offset 2 start_offset 4 end_offset 6 msg offset out of range; waiting for new records to be produced
            logger.go:38: 2024-10-10 04:17:00.306515286 +0000 UTC m=+40.753274654 partition 1 fetcher 1 caller log.go:168 level debug msg received an error while fetching records; will retry after processing received records (if any) duration 2.001992864s start_offset 2 end_offset 4 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp  last_timestamp  hwm 2 lso 2 err OFFSET_OUT_OF_RANGE: The requested offset is not within the range of offsets maintained by the server.
            logger.go:38: 2024-10-10 04:17:00.306617558 +0000 UTC m=+40.753376936 partition 1 fetcher 1 method concurrentFetcher.fetch.attempt caller log.go:168 level debug log_start_offset 2 start_offset 2 end_offset 4 msg offset out of range; waiting for new records to be produced
            logger.go:38: 2024-10-10 04:17:01.307463277 +0000 UTC m=+41.754222655 level info partition 1 component kafka_client msg received OFFSET_OUT_OF_RANGE on the first fetch, resetting to the configured ConsumeResetOffset broker 0 topic test partition 1 prior_offset 2
            logger.go:38: 2024-10-10 04:17:01.307620412 +0000 UTC m=+41.754379780 level info partition 1 component kafka_client msg immediate metadata update triggered why fetch had inner topic errors from broker 0: OFFSET_OUT_OF_RANGE{test[1]}
            logger.go:38: 2024-10-10 04:17:01.308063101 +0000 UTC m=+41.754822469 level info partition 1 target_lag 1s max_lag 1s msg partition reader found no records to consume because partition is empty partition_start_offset 2 last_produced_offset 1
            logger.go:38: 2024-10-10 04:17:01.308121991 +0000 UTC m=+41.754881360 level info partition 1 target_lag 1s max_lag 1s msg partition reader consumed partition and current lag is lower than configured target consumer lag last_consumed_offset 1 current_lag 0s
            logger.go:38: 2024-10-10 04:17:01.308204095 +0000 UTC m=+41.754963463 level info partition 1 component kafka_client msg assigning partitions why remove of map[test:[1]] requested how unassigning any currently assigned matching partition that is in the input input test[1{0 e0 ce0}]
            logger.go:38: 2024-10-10 04:17:01.31006444 +0000 UTC m=+41.756823808 level info partition 1 msg stopping partition reader
            logger.go:38: 2024-10-10 04:17:01.311877194 +0000 UTC m=+41.758636563 partition 1 fetcher 0 caller log.go:168 level debug msg fetched records duration 2.006076292s start_offset 4 end_offset 6 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp  last_timestamp  hwm 0 lso 0 err <nil>
            logger.go:38: 2024-10-10 04:17:01.311967163 +0000 UTC m=+41.758726551 partition 1 fetcher 0 caller log.go:168 level debug msg merged fetch result with the next result
            logger.go:38: 2024-10-10 04:17:01.312043387 +0000 UTC m=+41.758802755 partition 1 fetcher 1 caller log.go:168 level debug msg fetched records duration 1.005334679s start_offset 2 end_offset 4 asked_records 2 got_records 0 diff_records 2 asked_bytes 1000000 got_bytes 0 diff_bytes 1000000 first_timestamp  last_timestamp  hwm 0 lso 0 err <nil>
            logger.go:38: 2024-10-10 04:17:01.312132574 +0000 UTC m=+41.758891942 partition 1 fetcher 1 caller log.go:168 level debug msg merged fetch result with the next result
            logger.go:38: 2024-10-10 04:17:01.312423439 +0000 UTC m=+41.759182797 level warn partition 1 msg failed to fetch the last produced offset err context canceled
            testing.go:1399: race detected during execution of test
FAIL

@dimitarvdimitrov
Contributor

These last occurrences include some races in franz-go and some races in the use of testing.T.Log. I opened a PR to fix the latter: #9579
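
For reference, the testing.T.Log race happens when background goroutines (such as the Kafka client's logger) keep calling t.Log after the test function has returned. Below is a minimal sketch of one way to avoid it; this is only an illustration under my own assumptions (the package name, testLogger type, and newTestLogger helper are made up for the example) and not necessarily what #9579 does:

package ingest_test

import (
	"sync"
	"testing"
)

// testLogger forwards log lines to testing.T, but becomes a no-op once the
// test has finished, so late log calls from background goroutines (e.g. the
// Kafka client) no longer race with the test framework tearing the test down.
type testLogger struct {
	mtx    sync.Mutex
	t      *testing.T
	closed bool
}

func newTestLogger(t *testing.T) *testLogger {
	l := &testLogger{t: t}
	// t.Cleanup runs before the test is marked complete, so the flag is set
	// before calling t.Log becomes unsafe.
	t.Cleanup(func() {
		l.mtx.Lock()
		defer l.mtx.Unlock()
		l.closed = true
	})
	return l
}

func (l *testLogger) Log(args ...any) {
	l.mtx.Lock()
	defer l.mtx.Unlock()
	if l.closed {
		return
	}
	l.t.Log(args...)
}

The idea is simply that the wrapper stops forwarding to t.Log once the test is done, instead of letting client goroutines keep writing to the finished test.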

@twmb

twmb commented Jan 22, 2025

Is this a race I should look at / does it occur with a newer franz-go?

@dimitarvdimitrov
Contributor

There haven't been any reports of this since October 2024. I think we can close this and reopen it if it happens again. Many thanks for checking on this issue, @twmb.
