Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ingest storage: fix flaky TestPartitionReader_ConsumeAtStartup #9495

Conversation

dimitarvdimitrov
Copy link
Contributor

franz-go has its own retries and sometimes the call to the library just takes longer than 10s

test with debug logs: see when partitionOffsetClient.fetchPartitionOffset done runs compared to partitionOffsetClient.fetchPartitionOffset (up to 10s after it)

    logger.go:38: 2024-10-02 11:59:42.429605 +0200 CEST m=+0.014903709 level info partition 1 msg starting consumption from partition start because no offset has been found start_offset -2 consumer_group test-group
    logger.go:38: 2024-10-02 11:59:42.430251 +0200 CEST m=+0.015549793 level info partition 1 component kafka_client msg immediate metadata update triggered why querying metadata for consumer initialization
partitionOffsetClient.fetchPartitionOffset 2024-10-02 11:59:42.430466 +0200 CEST m=+0.015764626
partitionOffsetClient.fetchPartitionOffset 2024-10-02 11:59:42.430476 +0200 CEST m=+0.015773959
    logger.go:38: 2024-10-02 11:59:42.43182 +0200 CEST m=+0.017117918 level info partition 1 component kafka_client msg assigning partitions why new assignments from direct consumer how assigning everything new, keeping current assignment input test[1{-2 e-1 ce0}]
    logger.go:38: 2024-10-02 11:59:42.432098 +0200 CEST m=+0.017395918 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:42.663898 +0200 CEST m=+0.249196668 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:42.721596 +0200 CEST m=+0.306895334 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:43.23766 +0200 CEST m=+0.822959751 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:43.266302 +0200 CEST m=+0.851602418 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:43.434717 +0200 CEST m=+1.020017376 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:44.352493 +0200 CEST m=+1.937795709 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:44.437143 +0200 CEST m=+2.022445501 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:45.439635 +0200 CEST m=+3.024939084 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:46.121036 +0200 CEST m=+3.706341834 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:46.441884 +0200 CEST m=+4.027190168 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:46.623693 +0200 CEST m=+4.208999584 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:47.444235 +0200 CEST m=+5.029544001 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:48.447111 +0200 CEST m=+6.032421459 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:48.623763 +0200 CEST m=+6.209073543 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:49.126325 +0200 CEST m=+6.711636501 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:49.450558 +0200 CEST m=+7.035870251 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:50.453715 +0200 CEST m=+8.039029376 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:51.126234 +0200 CEST m=+8.711549543 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
partitionOffsetClient.fetchPartitionOffset done 2024-10-02 11:59:51.126414 +0200 CEST m=+8.711729876
    logger.go:38: 2024-10-02 11:59:51.126436 +0200 CEST m=+8.711751793 level warn partition 1 msg failed to fetch the last produced offset err broker closed the connection immediately after a request was issued, which happens when SASL is required but not provided: is SASL missing?
    logger.go:38: 2024-10-02 11:59:51.456121 +0200 CEST m=+9.041438084 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:51.629127 +0200 CEST m=+9.214444168 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
partitionOffsetClient.fetchPartitionOffset done 2024-10-02 11:59:51.629274 +0200 CEST m=+9.214590918
    logger.go:38: 2024-10-02 11:59:51.629299 +0200 CEST m=+9.214616209 level warn partition 1 msg failed to fetch the last produced offset err broker closed the connection immediately after a request was issued, which happens when SASL is required but not provided: is SASL missing?
partitionOffsetClient.fetchPartitionOffset 2024-10-02 11:59:51.629432 +0200 CEST m=+9.214748751
partitionOffsetClient.fetchPartitionOffset done 2024-10-02 11:59:51.629704 +0200 CEST m=+9.215021293

What this PR does

Which issue(s) this PR fixes or relates to

Fixes #

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • about-versioning.md updated with experimental features.

franz-go has its own retries and sometimes the call to the library just takes longer than 10s

test with debug logs: see when `partitionOffsetClient.fetchPartitionOffset done` runs compared to `partitionOffsetClient.fetchPartitionOffset` (up to 10s after it)

```
    logger.go:38: 2024-10-02 11:59:42.429605 +0200 CEST m=+0.014903709 level info partition 1 msg starting consumption from partition start because no offset has been found start_offset -2 consumer_group test-group
    logger.go:38: 2024-10-02 11:59:42.430251 +0200 CEST m=+0.015549793 level info partition 1 component kafka_client msg immediate metadata update triggered why querying metadata for consumer initialization
partitionOffsetClient.fetchPartitionOffset 2024-10-02 11:59:42.430466 +0200 CEST m=+0.015764626
partitionOffsetClient.fetchPartitionOffset 2024-10-02 11:59:42.430476 +0200 CEST m=+0.015773959
    logger.go:38: 2024-10-02 11:59:42.43182 +0200 CEST m=+0.017117918 level info partition 1 component kafka_client msg assigning partitions why new assignments from direct consumer how assigning everything new, keeping current assignment input test[1{-2 e-1 ce0}]
    logger.go:38: 2024-10-02 11:59:42.432098 +0200 CEST m=+0.017395918 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:42.663898 +0200 CEST m=+0.249196668 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:42.721596 +0200 CEST m=+0.306895334 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:43.23766 +0200 CEST m=+0.822959751 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:43.266302 +0200 CEST m=+0.851602418 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:43.434717 +0200 CEST m=+1.020017376 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:44.352493 +0200 CEST m=+1.937795709 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:44.437143 +0200 CEST m=+2.022445501 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:45.439635 +0200 CEST m=+3.024939084 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:46.121036 +0200 CEST m=+3.706341834 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:46.441884 +0200 CEST m=+4.027190168 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:46.623693 +0200 CEST m=+4.208999584 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:47.444235 +0200 CEST m=+5.029544001 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:48.447111 +0200 CEST m=+6.032421459 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:48.623763 +0200 CEST m=+6.209073543 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:49.126325 +0200 CEST m=+6.711636501 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:49.450558 +0200 CEST m=+7.035870251 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:50.453715 +0200 CEST m=+8.039029376 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:51.126234 +0200 CEST m=+8.711549543 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
partitionOffsetClient.fetchPartitionOffset done 2024-10-02 11:59:51.126414 +0200 CEST m=+8.711729876
    logger.go:38: 2024-10-02 11:59:51.126436 +0200 CEST m=+8.711751793 level warn partition 1 msg failed to fetch the last produced offset err broker closed the connection immediately after a request was issued, which happens when SASL is required but not provided: is SASL missing?
    logger.go:38: 2024-10-02 11:59:51.456121 +0200 CEST m=+9.041438084 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
    logger.go:38: 2024-10-02 11:59:51.629127 +0200 CEST m=+9.214444168 level warn partition 1 component kafka_client msg read from broker errored, killing connection after 0 successful responses (is SASL missing?) req ListOffsets addr 127.0.0.1:49726 broker 0 err EOF
partitionOffsetClient.fetchPartitionOffset done 2024-10-02 11:59:51.629274 +0200 CEST m=+9.214590918
    logger.go:38: 2024-10-02 11:59:51.629299 +0200 CEST m=+9.214616209 level warn partition 1 msg failed to fetch the last produced offset err broker closed the connection immediately after a request was issued, which happens when SASL is required but not provided: is SASL missing?
partitionOffsetClient.fetchPartitionOffset 2024-10-02 11:59:51.629432 +0200 CEST m=+9.214748751
partitionOffsetClient.fetchPartitionOffset done 2024-10-02 11:59:51.629704 +0200 CEST m=+9.215021293
```

Signed-off-by: Dimitar Dimitrov <[email protected]>
@dimitarvdimitrov dimitarvdimitrov requested a review from a team as a code owner October 2, 2024 10:01
Copy link
Contributor

@narqo narqo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥

Seems that it fixes my comment here #8691 (comment)

@dimitarvdimitrov
Copy link
Contributor Author

i think that got introduced with #9436

@dimitarvdimitrov dimitarvdimitrov merged commit ac6efd0 into main Oct 2, 2024
29 checks passed
@dimitarvdimitrov dimitarvdimitrov deleted the dimitar/ingest/fix-flaky-TestPartitionReader_ConsumeAtStartup branch October 2, 2024 10:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants