-
Notifications
You must be signed in to change notification settings - Fork 30.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test: set test-fs-watch as flaky #50250
test: set test-fs-watch as flaky #50250
Conversation
Fast-track has been requested by @anonrig. Please 👍 to approve. |
The referenced issue was opened 2 hours ago. I think we should at least ping @nodejs/platform-s390 before marking the test flaky. |
s390 team can always follow up and unflake the issue. the other way around, a.k.a. waiting for someone to respond, will only frustrate the existing pull requests and increase our CI work pressure. I recommend setting them flaky first, and later follow up when/if someone from the team is available. We should act these flaky test declarations as a todo list. Also we should remember that there is always an open issue about the flakiness of this test. |
I disagree, was this discussed in a TSC meeting? It is too easy to mark a test flaky and forget about it the first time it fails. No one will ever going to look into it if there isn't a constant reminder/annoyance that the test keeps failing. The list of flaky tests will only grow over time instead of reducing. |
@lpinca It was not officially discussed. It was mostly couple of folks saying "Yeah let's do it". cc @nodejs/tsc for visibility |
67ed76e
to
3ed1053
Compare
Ok, fwiw I think it is not a good idea. |
I agree we should have a threshold for flakes versus a first time failure. I'm going to object to the fast track so that we have some time for the discussion. |
I've also added to the tsc-agenda so we have a discussion/feedback on what might be an appropriate threshold. |
@mhdawson Is there any reason for not-merging this pull request? Is tsc-agenda label related to this PR specifically or a general discussion about the conversation above? |
While it has had two reviews I think with people against this, and I'll include myself in that, it makes sense to pause the merging. Having said that if we are struggling to get a response from the team maintaining that then we may have a wider issue (I think I may technically be on the @nodejs/platform-s390 team but I'm not actively maintaining the port there) If we can't get a response in another week or so then I'd likely be in favour, so at this stage I'm +1 on having it as part of a TSC discussion before making the decision on merging. |
Although, I agree with where this is going this means that all pull requests in the next week will face this flaky test and potentially result in overwhelmed contributors, which is the main reason of fast tracking these tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
So it's failing 100% of the time at the moment? |
From the reliability reports the last time this failed was Oct 10th - nodejs/reliability#690, so it's not every run. The reliability report shows that it failed a number of times on both macos and s390 between the Oct 8th and Oct 10th. It's not failed since. I think this is a possibly example for not being too hasty about excluding a test as it's not so flaky that it affects all builds since we've not seen failures in the last 13 days. [EDIT searching on the reliability reports I think some may not show up as I can't always find an entry in the report when we have a report in a CI]. Since it was also failing on multiple platforms, I'm not sure its a good candidate for wanting a response from a particular platform team either. In one of the failing runs on osx (https://ci.nodejs.org/job/node-test-commit-osx-arm/13767/), it looked like this:
With multiple tests failing like that I think thats less often a flaky test and more often something that needs to be cleaned up on the machine. So unless I'm looking at the data wrong, I don't think we should disable this test at this point. It's not failed in 13 days and some of the failures looked more like machine issues. Separately I think it would be good to agree/document something around marking tests as flaky in terms of wether it's ok to mark flaky after seeing one failure, or if not what threshold we think makes sense. |
@mhdawson If you resume a build and make sure the flaky test succeeds before the daily reliability check being done, you can spoof the reliability report. The following test link from the original issue is an example of this: https://ci.nodejs.org/job/node-test-commit-linuxone/40398/ I think this is the fundamental issue of reliability reports. |
That's not how reliability reports work. The reports look back on the 100 most recent runs of node-test-pull-request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the last time this test has been reported as failing was on nodejs/reliability#690, so one month ago, and nothing shows up since. Are we sure this is still relevant?
Adding a request for changes so it doesn't land until we have confirmation it is still flaky.
Ref: #50249