Increased flakiness of Jest tests is getting out of hand #141477

Closed
spalger opened this issue Sep 22, 2022 · 8 comments
Labels
Feature:CI (Continuous integration), Team:Operations (Team label for Operations Team), test-unit

Comments

@spalger
Contributor

spalger commented Sep 22, 2022

We've been seeing an increase in the flakiness of many Jest tests, which has led to us skipping many tests and spending a lot of time debugging libraries like react-dom/test-utils (#139444). That library seems to be related to the flakiness, with errors like:

/var/lib/buildkite-agent/builds/kb-n2-4-spot-33c7d5a09cf722a3/elastic/kibana-pull-request/kibana/node_modules/react-dom/cjs/react-dom.development.js:3905
      var evt = document.createEvent('Event');
                         ^
TypeError: Cannot read properties of null (reading 'createEvent')
    at Object.invokeGuardedCallbackDev (/var/lib/buildkite-agent/builds/kb-n2-4-spot-33c7d5a09cf722a3/elastic/kibana-pull-request/kibana/node_modules/react-dom/cjs/react-dom.development.js:3905:26)
    at invokeGuardedCallback (/var/lib/buildkite-agent/builds/kb-n2-4-spot-33c7d5a09cf722a3/elastic/kibana-pull-request/kibana/node_modules/react-dom/cjs/react-dom.development.js:4056:31)
    at flushPassiveEffectsImpl (/var/lib/buildkite-agent/builds/kb-n2-4-spot-33c7d5a09cf722a3/elastic/kibana-pull-request/kibana/node_modules/react-dom/cjs/react-dom.development.js:23543:11)
    at unstable_runWithPriority (/var/lib/buildkite-agent/builds/kb-n2-4-spot-33c7d5a09cf722a3/elastic/kibana-pull-request/kibana/node_modules/scheduler/cjs/scheduler.development.js:468:12)
    at runWithPriority$1 (/var/lib/buildkite-agent/builds/kb-n2-4-spot-33c7d5a09cf722a3/elastic/kibana-pull-request/kibana/node_modules/react-dom/cjs/react-dom.development.js:11276:10)
    at flushPassiveEffects (/var/lib/buildkite-agent/builds/kb-n2-4-spot-33c7d5a09cf722a3/elastic/kibana-pull-request/kibana/node_modules/react-dom/cjs/react-dom.development.js:23447:14)
    at Object.<anonymous>.flushWork (/var/lib/buildkite-agent/builds/kb-n2-4-spot-33c7d5a09cf722a3/elastic/kibana-pull-request/kibana/node_modules/react-dom/cjs/react-dom-test-utils.development.js:992:10)
    at Immediate.<anonymous> (/var/lib/buildkite-agent/builds/kb-n2-4-spot-33c7d5a09cf722a3/elastic/kibana-pull-request/kibana/node_modules/react-dom/cjs/react-dom-test-utils.development.js:1003:11)
    at processImmediate (node:internal/timers:466:21)

All that said, it's not clear what is causing this problem, so I wanted to open this as an issue where we can collect information about it. I'm going to start by running our unit tests "in band" once again, which should help in the short term, but hopefully we can identify a better long-term solution.
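For anyone who wants to reproduce the "in band" setup against a single config, here is a minimal sketch of how the command could be assembled. The flags are the ones that appear in the CI command quoted later in this thread; `buildInBandArgs` is a hypothetical helper made up for illustration and is not part of Kibana's scripts.

```typescript
// Hypothetical helper: builds the argument list for running one Jest config serially.
// The flag set mirrors the CI command quoted later in this thread.
function buildInBandArgs(configPath: string): string[] {
  return [
    './scripts/jest',
    `--config=${configPath}`,
    '--runInBand', // run tests serially in the current process instead of a worker pool
    '--coverage=false',
    '--passWithNoTests',
  ];
}

// Example: print the command for the watcher config mentioned in this issue.
const args = buildInBandArgs('x-pack/plugins/watcher/jest.config.js');
console.log(['node', ...args].join(' '));
```

Running serially trades speed for isolation from worker scheduling, which is why it is only meant as a short-term measure here.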

@spalger added the Team:Operations, Feature:CI, and test-unit labels on Sep 22, 2022
@elasticmachine
Contributor

Pinging @elastic/kibana-operations (Team:Operations)

@spalger
Contributor Author

spalger commented Sep 22, 2022

The first failed-test issue, where I did the most research on this problem: #115307

@spalger
Contributor Author

spalger commented Sep 22, 2022

Bug being tracked by Jest: jestjs/jest#12670

@spalger
Contributor Author

spalger commented Sep 23, 2022

Interesting development: I set up the tests to run "in band" again, and now we've seen the x-pack/plugins/watcher/jest.config.js config fail before running any tests:

/var/lib/buildkite-agent/builds/kb-n2-4-spot-bbac38a6adef8454/elastic/kibana-on-merge/kibana/node_modules/react-dom/cjs/react-dom.development.js:3905
      var evt = document.createEvent('Event');
                         ^
TypeError: Cannot read properties of null (reading 'createEvent')
    at Object.invokeGuardedCallbackDev (/var/lib/buildkite-agent/builds/kb-n2-4-spot-bbac38a6adef8454/elastic/kibana-on-merge/kibana/node_modules/react-dom/cjs/react-dom.development.js:3905:26)
    at invokeGuardedCallback (/var/lib/buildkite-agent/builds/kb-n2-4-spot-bbac38a6adef8454/elastic/kibana-on-merge/kibana/node_modules/react-dom/cjs/react-dom.development.js:4056:31)
    ...

Failures on main:

I'm now looking to skip the entire watcher config in #141677

@spalger
Contributor Author

spalger commented Oct 19, 2022

Looks like x-pack/plugins/security_solution/public/timelines/jest.config.js might also be really flaky. Going to do a bit more research to figure out how flaky it is

@cee-chen
Contributor

cee-chen commented Nov 1, 2022

EUI's findings on consistent var evt = document.createEvent('Event'); "failures" (not sure that it was the failure exactly - more that it was what was getting logged by CI) in x-pack/plugins/triggers_actions_ui/jest.config.js:

Buildkite logs:

Code snippet in case the above links expire

actual full command is:
NODE_OPTIONS="--max-old-space-size=14336" node ./scripts/jest --config="x-pack/plugins/triggers_actions_ui/jest.config.js" --runInBand --coverage=false --passWithNoTests
/var/lib/buildkite-agent/builds/kb-n2-4-spot-bc40503a48a16364/elastic/kibana-pull-request/kibana/node_modules/react-dom/cjs/react-dom.development.js:3905
      var evt = document.createEvent('Event');
                         ^
TypeError: Cannot read properties of null (reading 'createEvent')
    at Object.invokeGuardedCallbackDev (/var/lib/buildkite-agent/builds/kb-n2-4-spot-bc40503a48a16364/elastic/kibana-pull-request/kibana/node_modules/react-dom/cjs/react-dom.development.js:3905:26)
    at invokeGuardedCallback (/var/lib/buildkite-agent/builds/kb-n2-4-spot-bc40503a48a16364/elastic/kibana-pull-request/kibana/node_modules/react-dom/cjs/react-dom.development.js:4056:31)
    at flushPassiveEffectsImpl (/var/lib/buildkite-agent/builds/kb-n2-4-spot-bc40503a48a16364/elastic/kibana-pull-request/kibana/node_modules/react-dom/cjs/react-dom.development.js:23574:9)
    at unstable_runWithPriority (/var/lib/buildkite-agent/builds/kb-n2-4-spot-bc40503a48a16364/elastic/kibana-pull-request/kibana/node_modules/scheduler/cjs/scheduler.development.js:468:12)
    at runWithPriority$1 (/var/lib/buildkite-agent/builds/kb-n2-4-spot-bc40503a48a16364/elastic/kibana-pull-request/kibana/node_modules/react-dom/cjs/react-dom.development.js:11276:10)
    at flushPassiveEffects (/var/lib/buildkite-agent/builds/kb-n2-4-spot-bc40503a48a16364/elastic/kibana-pull-request/kibana/node_modules/react-dom/cjs/react-dom.development.js:23447:14)
    at Object.<anonymous>.flushWork (/var/lib/buildkite-agent/builds/kb-n2-4-spot-bc40503a48a16364/elastic/kibana-pull-request/kibana/node_modules/react-dom/cjs/react-dom-test-utils.development.js:992:10)
    at Immediate.<anonymous> (/var/lib/buildkite-agent/builds/kb-n2-4-spot-bc40503a48a16364/elastic/kibana-pull-request/kibana/node_modules/react-dom/cjs/react-dom-test-utils.development.js:1003:11)
    at processImmediate (node:internal/timers:466:21)
Jest exited with code 1

Debug process:

  • I ran node scripts/jest --config x-pack/plugins/triggers_actions_ui/jest.config.js locally which did indeed report several (5+) failing tests, unlike CI which only reported the document.createEvent error
  • After I fixed all tests locally and x-pack/plugins/triggers_actions_ui/jest.config.js was also ostensibly passing locally, I pushed all test changes/updates upstream
  • CI still failed without a meaningful error message, so I went back to debugging locally
  • I noticed that the last test to run was x-pack/plugins/triggers_actions_ui/public/application/sections/rules_list/components/rules_list.test.tsx, and that it took a significant amount of time to run. I also noticed that, for some odd reason, failures were being reported in the Jest summary while tests were running, but after the final rules_list file ran and the entire suite finished, it somehow reported 0 failures.
  • I skipped every describe block within rules_list.test.tsx and then re-ran the entire suite, and interestingly enough that caused another completely random/unrelated test to fail with a timeout:
  ● rules › render a list of rules

    thrown: "Exceeded timeout of 5000 ms for a test.
    Use jest.setTimeout(newTimeout) to increase the timeout value, if this is a long-running test."

      at x-pack/plugins/triggers_actions_ui/public/application/sections/rule_details/components/rule.test.tsx:66:3
      at Object.<anonymous> (x-pack/plugins/triggers_actions_ui/public/application/sections/rule_details/components/rule.test.tsx:65:1)
  • Just to test, I pushed up skipping all tests inside rules_list.test.tsx to CI and CI stopped failing 🤷

My best uneducated guess is that there are tests in there with async issues tripping up Jest's failure reporting. I repeatedly saw many console errors/warnings about actions not wrapped in act while running the entire Jest suite. My other guess is that the EUI upgrade caused this to happen consistently because the tests within triggers_actions_ui were already close to or on the edge of timing out, and our various changes/Emotion conversions tipped at least one suite over into a timeout.
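To make the "async issues" guess concrete, here is a minimal sketch of how work that outlives a test could end up hitting a null document. It is only an illustration of the suspected race, not a confirmed root cause: `FakeEnvironment`, `flushEffects`, `runLeakyTest`, and `runAwaitedTest` are hypothetical stand-ins, and nothing here uses real Jest, jsdom, or React APIs.

```typescript
// Hypothetical stand-ins; none of these names are Jest, jsdom, or React APIs.
type FakeDocument = { createEvent(type: string): { type: string } };

class FakeEnvironment {
  document: FakeDocument | null = {
    createEvent: (type: string) => ({ type }),
  };

  // After teardown the `document` is gone, similar to what the CI stack traces suggest.
  teardown(): void {
    this.document = null;
  }
}

// Stands in for a deferred effect flush that reads `document` when it runs.
function flushEffects(env: FakeEnvironment): string {
  try {
    env.document!.createEvent('Event');
    return 'ok';
  } catch (err) {
    return err instanceof TypeError ? 'TypeError' : 'other error';
  }
}

// A test that schedules work and finishes without waiting for it.
function runLeakyTest(): string {
  const env = new FakeEnvironment();
  const pending: Array<() => string> = [() => flushEffects(env)];
  env.teardown(); // the test "ends" while work is still queued
  return pending.map((job) => job())[0];
}

// The same test, but the queued work is drained before teardown.
function runAwaitedTest(): string {
  const env = new FakeEnvironment();
  const pending: Array<() => string> = [() => flushEffects(env)];
  const result = pending.map((job) => job())[0];
  env.teardown();
  return result;
}

console.log(runLeakyTest(), runAwaitedTest());
```

In this sketch the leaky variant ends in a TypeError while the awaited variant completes normally. If the real failures follow the same pattern, the error would surface outside any individual test, which would be consistent with CI logging only the createEvent trace.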

@cee-chen
Contributor

cee-chen commented Nov 14, 2022

Update to this - x-pack/plugins/triggers_actions_ui/jest.config.js is failing consistently again on the next EUI upgrade (edit: and apparently on main today as well - #145186) - I left them a detailed comment and full set of logs here: #144445 (comment)

@jbudz
Member

jbudz commented Jul 13, 2023

The disabled suite mentioned above has been re-enabled and we haven't seen any recent occurrences of this issue.
