[Failing Test]: dataflow runner worker project test stuck causing Java PreCommit time out #28957
Comments
The originally stuck test was fixed. Now it appears there is another stuck test:
It was added in #28835 |
Another occurrence: https://github.com/apache/beam/pull/29723/checks?check_run_id=19571889999. This happens at moderate frequency (>10%). |
Bumping to P1, as this is somewhat frequent and it is not yet clear whether it indicates a regression in the streaming worker. |
@Abacn Can the test be disabled or sickbayed before it is investigated? This code is not yet used in pipelines. |
If it is not used in actual pipelines, it's lower risk, and yes, it could be disabled or sickbayed. Update: downgraded to P2 for now. |
Yeah, let's disable it. This has caused weeks of delay for some major changes. |
The test is currently disabled. Leaving this bug open to track the TODO of fixing the test. |
StreamingEngineClientTest.testScheduledBudgetRefresh is also flaky: https://github.com/apache/beam/runs/19993891236. It was also added in #28835. |
@m-trieu Martin, can you look into fixing the test flakiness? |
testStreamsStartCorrectly (org.apache.beam.runners.dataflow.worker.windmill.client.grpc.StreamingEngineClientTest) failed
https://github.com/apache/beam/pull/30245/checks?check_run_id=21501494900 |
will take a look today |
have a potential fix #30322 |
Other flaky tests: testLatencyAttributionToQueuedState: https://github.com/apache/beam/runs/22270690743
testInvalidateStuckCommits: https://github.com/apache/beam/runs/22276370706
|
The flakiness is pretty close to perma-red. I think it is best to roll back while we fix it if #30322 does not work right away. |
testConsumedWorkItems_itemsSplitAcrossResponses (org.apache.beam.runners.dataflow.worker.windmill.client.grpc.GrpcDirectGetWorkStreamTest) failed
org.junit.runners.model.TestTimedOutException: test timed out after 600 seconds
testMultimapLazyIterateHugeEntriesResult (org.apache.beam.runners.dataflow.worker.windmill.state.WindmillStateInternalsTest) failed
java.lang.reflect.InaccessibleObjectException: Unable to make field private final java.lang.String java.lang.module.ModuleDescriptor.name accessible: module java.base does not "opens java.lang.module" to unnamed module @675d8c96
testConsumedWorkItems_itemsSplitAcrossResponses (org.apache.beam.runners.dataflow.worker.windmill.client.grpc.GrpcDirectGetWorkStreamTest) failed
org.junit.runners.model.TestTimedOutException: test timed out after 600 seconds
testUnboundedSourcesDrain[0: [streamingEngine=false]] (org.apache.beam.runners.dataflow.worker.StreamingDataflowWorkerTest) failed
java.lang.AssertionError: |
This issue was opened too long ago (October 12, 2023). We decided to track all failing tests in corresponding |
What happened?
It appears that PreCommit timeouts have become more frequent recently.
The successful job run after the timed-out run shows that the following tests were added, which suggests the stuck project is
runners:google-cloud-dataflow-java:worker
See https://ci-beam.apache.org/view/PostCommit/job/beam_PreCommit_Java_Cron/7455/testReport/
Issue Failure
Failure: Test is flaky
Issue Priority
Priority: 2 (backlog / disabled test but we think the product is healthy)
Issue Components