Flaky TestConcurrentFetchers #9916

Closed
flxbk opened this issue Nov 15, 2024 · 2 comments · Fixed by #9926 or #10567

Comments

@flxbk
Contributor

flxbk commented Nov 15, 2024

This test seems to be flaky (CI run)

--- FAIL: TestConcurrentFetchers (2.11s)
    --- FAIL: TestConcurrentFetchers/update_concurrency_with_continuous_production (10.00s)
        logging.go:33: level info component kafka_client msg immediate metadata update triggered why from user ForceMetadataRefresh
        logging.go:33: level info component kafka_client msg immediate metadata update triggered why forced load because we are producing to a topic for the first time
        logging.go:33: level info component kafka_client msg producing to a new topic for the first time, fetching metadata to learn its partitions topic test-topic
        logging.go:33: level info component kafka_client msg done waiting for metadata for new topic topic test-topic
        logging.go:33: level info component kafka_client msg initializing producer id
        logging.go:33: level info component kafka_client msg producer id initialization success id 8363768319828142695 epoch 0
        reader_test.go:2332: 
            	Error Trace:	/__w/mimir/mimir/pkg/storage/ingest/reader_test.go:2332
            	            				/__w/mimir/mimir/pkg/storage/ingest/fetcher_test.go:650
            	            				/usr/local/go/src/runtime/asm_amd64.s:1700
            	Error:      	Received unexpected error:
            	            	context canceled
            	Test:       	TestConcurrentFetchers/update_concurrency_with_continuous_production
        fetcher_test.go:692: 
            	Error Trace:	/__w/mimir/mimir/pkg/storage/ingest/fetcher_test.go:692
            	Error:      	Not equal: 
            	            	expected: 646
            	            	actual  : 647
            	Test:       	TestConcurrentFetchers/update_concurrency_with_continuous_production
            	Messages:   	Should not fetch more records than produced
        fetcher_test.go:706: Total produced: 647, Total fetched: 646
        fetcher_test.go:707: Fetched with initial concurrency: 199
        fetcher_test.go:708: Fetched with high concurrency: 300
        fetcher_test.go:709: Fetched with low concurrency: 147
FAIL
FAIL	github.com/grafana/mimir/pkg/storage/ingest	117.983s
@zenador
Contributor

zenador commented Dec 18, 2024

Looks like it's still happening:
https://github.com/grafana/mimir/actions/runs/12400788936/job/34618765095?pr=10277

 --- FAIL: TestConcurrentFetchers_fetchSingle/should_return_an_error_response_if_the_Fetch_request_contains_an_error (0.00s)
    fetcher_test.go:1022: 
        	Error Trace:	/__w/mimir/mimir/pkg/storage/ingest/fetcher_test.go:1022
        	Error:      	An error is expected but got nil.
        	Test:       	TestConcurrentFetchers_fetchSingle/should_return_an_error_response_if_the_Fetch_request_contains_an_error
logger.go:38: 2024-12-18 20:29:44.759052986 +0000 UTC m=+18.131722032 level info msg stopped concurrent fetchers last_returned_offset -1

@zenador zenador reopened this Dec 18, 2024
@dimitarvdimitrov
Contributor

Happened again. I'll try to take a look.

--- FAIL: TestConcurrentFetchers_fetchSingle (0.04s)
    logger.go:38: 2025-02-03 09:58:38.246100548 +0000 UTC m=+17.633705268 level info component kafka_client msg immediate metadata update triggered why forced load because we are producing to a topic for the first time
    logger.go:38: 2025-02-03 09:58:38.246184553 +0000 UTC m=+17.633789263 level info component kafka_client msg producing to a new topic for the first time, fetching metadata to learn its partitions topic test-topic
    logger.go:38: 2025-02-03 09:58:38.246230027 +0000 UTC m=+17.633834747 level info msg starting concurrent fetchers start_offset 0 concurrency 1 bytes_per_fetch_request 2147483647
    logger.go:38: 2025-02-03 09:58:38.24642079 +0000 UTC m=+17.634025501 fetcher 0 level error msg received an error while fetching records; will retry after processing received records (if any) duration 16.57µs start_offset 0 end_offset 214748 asked_records 214748 got_records 0 diff_records 214748 asked_bytes 2147483647 got_bytes 0 diff_bytes 2147483647 first_timestamp  last_timestamp  hwm 0 lso 0 err unknown partition leader
    logger.go:38: 2025-02-03 09:58:38.247061878 +0000 UTC m=+17.634666598 level info component kafka_client msg immediate metadata update triggered why from user ForceMetadataRefresh
    logger.go:38: 2025-02-03 09:58:38.247113994 +0000 UTC m=+17.634718704 level info component kafka_client msg done waiting for metadata for new topic topic test-topic
    logger.go:38: 2025-02-03 09:58:38.247166321 +0000 UTC m=+17.634771041 level info component kafka_client msg initializing producer id
    logger.go:38: 2025-02-03 09:58:38.247393652 +0000 UTC m=+17.634998362 level info component kafka_client msg producer id initialization success id 1259831244926160855 epoch 0
    logger.go:38: 2025-02-03 09:58:38.25741013 +0000 UTC m=+17.645014850 fetcher 0 level error msg received an error while fetching records; will retry after processing received records (if any) duration 821.902µs start_offset 0 end_offset 214748 asked_records 214748 got_records 0 diff_records 214748 asked_bytes 2147483647 got_bytes 0 diff_bytes 2147483647 first_timestamp  last_timestamp  hwm 0 lso 0 err fetch request failed with error: UNKNOWN_SERVER_ERROR: The server experienced an unexpected error when processing the request.
    logger.go:38: 2025-02-03 09:58:38.257503373 +0000 UTC m=+17.645108093 fetcher 0 method concurrentFetcher.fetch.attempt level error msg received an error we're not prepared to handle; this shouldn't happen; please report this as a bug err fetch request failed with error: UNKNOWN_SERVER_ERROR: The server experienced an unexpected error when processing the request.
    --- FAIL: TestConcurrentFetchers_fetchSingle/should_return_an_error_response_if_the_Fetch_request_contains_an_error (0.00s)
        fetcher_test.go:1022: 
            	Error Trace:	/__w/mimir/mimir/pkg/storage/ingest/fetcher_test.go:1022
            	Error:      	An error is expected but got nil.
            	Test:       	TestConcurrentFetchers_fetchSingle/should_return_an_error_response_if_the_Fetch_request_contains_an_error
    logger.go:38: 2025-02-03 09:58:38.281458196 +0000 UTC m=+17.669062926 level info msg stopped concurrent fetchers last_returned_offset -1
FAIL

4 participants