Flaky integration tests #4640
Comments
Hey @annasong20, I saw this issue. I'm learning how to write Go tests.
@rajatgupta24 Sure, go for it!
/triage accepted
After a flaky run, I found that all flaky tests failed on git checkout FETCH_HEAD in cloner. Given that we run tests concurrently, I believe the flaky tests fail when repos from the different tests are cloned concurrently. We can fix this either by changing the line
FYI @annasong20 I looked into it a little bit and I think you might be able to replace https://github.com/kubernetes-sigs/kustomize/blob/master/api/internal/git/cloner.go#L33-36 with
/assign
If there isn't a clean thread-safe way to write the git commands, we can also consider locking these critical lines of code with a mutex.
Per offline discussion, a concurrency issue is unlikely: all the remote tests are in the same package and therefore do not run in parallel.
My update after looking into this issue some more:
I'm not sure this fully explains it, but the sheer size of our repo now that it has a docs site again is likely a contributing factor. We should consider preventing submodule initialization and raising the timeout on tests where this isn't important (Kustomize supports this already). That said, @natasha41575, @annasong20, and I had a discussion about this suite, and I proposed that we go back to the coverage we want to have instead of necessarily fixing the tests in their current form. Here's the tentative plan for what we want:
@mightyguava if you're able to help us with this, we would greatly appreciate it.
/assign @mightyguava
Yes, 🤞 those fixed it. We'll need some time to see enough CI runs to feel completely confident (e.g. that we don't need to add retries to the protocol tests), but we can close this for now and reopen if we discover there's more to be done. /close
@KnVerey: Closing this issue.
Describe the bug
Many integration tests that kustomize build URLs in remoteload_test.go are flaky. They exhibit intended behavior on my machine, but sporadically fail when run for every PR on the server.
Files that can reproduce the issue
We have observed the following flaky tests:
Expected output
The expected output is written in the test cases.
Actual output
On my local machine, the output is as expected. On the server, the tests mostly pass, but occasionally fail. This logs the output of some of the flaky tests on a server run.
Kustomize version
I ran the tests on the master branch, where HEAD was at commit 22668ea.
Platform
I use macOS. The tests only fail for macOS (not Linux) on the server.
Additional context
Issue #4623 also mentions this issue.