
Test cron removal. #15569

Merged: samwho merged 5 commits into master from automation-tests-9 on Feb 18, 2025
Conversation

@samwho (Collaborator) commented Feb 18, 2025

Description

In #15566 I was annoyed that I wasn't able to find a nice way to test that cron automations get removed correctly from Bull. In this PR, I think I've figured out a nice enough way to do it.

It boils down to capturing the cron message that goes into the queue and exposing a method to re-trigger that message in tests. With this, we ensure that when the job triggers it uses the correct cron message, which in turn correctly triggers removal from the queue when it fails 5 times.

qa-wolf bot commented Feb 18, 2025

QA Wolf here! As you write new code it's important that your test coverage is keeping up.

@github-actions github-actions bot added firestorm Data/Infra/Revenue Team size/m labels Feb 18, 2025
@@ -86,10 +87,13 @@ class InMemoryQueue implements Partial<Queue> {
*/
async process(concurrencyOrFunc: number | any, func?: any) {
func = typeof concurrencyOrFunc === "number" ? func : concurrencyOrFunc
-    this._emitter.on("message", async message => {
+    this._emitter.on("message", async msg => {
+      const message = cloneDeep(msg)
samwho (Collaborator, Author):

Hilariously this is what was causing those loop.spec.ts tests to pass without timing out. The orchestrator removes appId from the job message, which mutates the original message, causing all future invocations of that message to fail. Preserving the appId fixes these, but causes loop.spec.ts to then start timing out because it's doing a lot more work than before.
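The mutation described above is easy to reproduce. A minimal sketch, illustrative only (the `Message` shape and `deliver` helper are invented names, not Budibase code): a handler that deletes a field from the message it receives will, without a per-delivery clone, mutate the stored original and break every later re-delivery.

```typescript
import { EventEmitter } from "events"

interface Message { appId?: string; payload: string }

// deliver() re-emits a stored message, optionally handing out a shallow copy
function deliver(emitter: EventEmitter, stored: Message, clone: boolean) {
  emitter.emit("message", clone ? { ...stored } : stored)
}

const emitter = new EventEmitter()
const seen: (string | undefined)[] = []
emitter.on("message", (msg: Message) => {
  seen.push(msg.appId)
  delete msg.appId // the handler mutates the message it was given
})

const stored: Message = { appId: "app_123", payload: "run" }
deliver(emitter, stored, false) // no clone: the mutation hits the stored original
deliver(emitter, stored, false) // re-delivery now sees appId undefined
```

Passing `clone: true` instead hands each delivery its own copy, which is what the `cloneDeep` above achieves for nested message data.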

this._emitter.on("message", async msg => {
const message = cloneDeep(msg)

const isManualTrigger = (message as any).manualTrigger === true
samwho (Collaborator, Author):

This is part of the mechanism I'm using to trigger cron messages. On app publish they get queued without this manualTrigger property, and in that instance we ignore them. Then later when we trigger them manually, they have this property attached to them to distinguish them from the initial message. It's a bit dirty, but results in a much more realistic test.
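The filtering idea can be sketched as follows (a hedged sketch, not the actual Budibase handler; `CronMessage` and `onMessage` are assumed names): publish-time cron messages carry no `manualTrigger` flag and are skipped, while a re-emitted copy of the captured message carries the flag and runs.

```typescript
interface CronMessage {
  data: { event: string }
  manualTrigger?: boolean
}

const processed: string[] = []

// onMessage mimics the queue handler: skip cron messages queued on app
// publish, run only the copies the test re-triggers with manualTrigger set
function onMessage(message: CronMessage, isCron: boolean) {
  const isManualTrigger = (message as any).manualTrigger === true
  if (isCron && !isManualTrigger) {
    return // initial publish-time message: capture it, but don't run it
  }
  processed.push(message.data.event)
}

const captured: CronMessage = { data: { event: "cron" } }
onMessage(captured, true)                             // ignored
onMessage({ ...captured, manualTrigger: true }, true) // runs
```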

Comment on lines +140 to +145
async add(data: T | string, optsOrT?: JobOptions | T) {
if (typeof data === "string") {
throw new Error("doesn't support named jobs")
}

const opts = optsOrT as JobOptions
samwho (Collaborator, Author):

These changes were required to do the correct generic typing on the class (InMemoryQueue<T> as opposed to just InMemoryQueue).

@@ -176,7 +186,7 @@ class InMemoryQueue implements Partial<Queue> {

async removeRepeatableByKey(id: string) {
for (const [idx, message] of this._messages.entries()) {
-      if (message.opts?.jobId?.toString() === id) {
+      if (message.id === id) {
samwho (Collaborator, Author):

In getRepeatableJobs we transform each message to a JobInformation object. This sets .key to be the same as .id. We're relying on that fact here, as messages are not stored with a .key property.
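The relationship being relied on can be sketched like this (assumed shapes for illustration, not the real `InMemoryQueue` internals): `getRepeatableJobs` mirrors each message's `.id` into the `JobInformation.key`, so matching on `message.id` inside `removeRepeatableByKey` is equivalent to matching on the key.

```typescript
interface StoredMessage { id: string; data: unknown }
interface JobInformation { key: string; id: string }

// key is set to the same value as id, as the comment above describes
function getRepeatableJobs(messages: StoredMessage[]): JobInformation[] {
  return messages.map(m => ({ key: m.id, id: m.id }))
}

// so removing "by key" can safely compare against message.id
function removeRepeatableByKey(messages: StoredMessage[], key: string) {
  const idx = messages.findIndex(m => m.id === key)
  if (idx !== -1) messages.splice(idx, 1)
}

const msgs: StoredMessage[] = [{ id: "cron:1", data: {} }]
const [job] = getRepeatableJobs(msgs)
removeRepeatableByKey(msgs, job.key)
```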

Comment on lines +35 to +44
try {
await automations.logs.storeLog(automation, results)
} catch (e: any) {
if (e.status === 413 && e.request?.data) {
// if content is too large we shouldn't log it
delete e.request.data
e.request.data = { message: "removed due to large size" }
}
logging.logAlert("Error writing automation log", e)
}
samwho (Collaborator, Author):

This handling was moved here from the automation orchestrator because it seems to me that we should always handle it this way whenever writing a log fails due to its size.

Comment on lines +24 to +27
const { automations } = await config.api.automation.fetch()
for (const automation of automations) {
await config.api.automation.delete(automation)
}
samwho (Collaborator, Author):

Fix for the timeouts mentioned earlier in this review. It ensures there are fewer unrelated automations to trigger when working with each test's specific automation.

Comment on lines 83 to 93
let results: Job<AutomationData>[] = []
const removed = await captureAutomationRemovals(automation, async () => {
results = await captureAutomationResults(automation, async () => {
for (let i = 0; i < MAX_AUTOMATION_RECURRING_ERRORS; i++) {
triggerCron(message)
}
})
})

expect(removed).toHaveLength(1)
expect(removed[0].id).toEqual(message.id)
samwho (Collaborator, Author):

This is the meat of the PR: the test that makes sure jobs are removed from Bull when they fail more than 5 times. The nesting of capture* functions is a bit ugly, but this is the only place we have to do it, so I'm not super worried.
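A capture helper of the kind used above can be sketched in a few lines (a hypothetical shape; `captureEvents` and the `bus` emitter are invented for illustration, not the real `captureAutomationRemovals`/`captureAutomationResults`): subscribe, run the callback, unsubscribe, and return everything observed.

```typescript
import { EventEmitter } from "events"

const bus = new EventEmitter()

// Collect every payload emitted for `event` while fn() runs
async function captureEvents<T>(
  event: string,
  fn: () => Promise<void>
): Promise<T[]> {
  const seen: T[] = []
  const listener = (payload: T) => { seen.push(payload) }
  bus.on(event, listener)
  try {
    await fn()
  } finally {
    bus.off(event, listener) // always detach, even if fn() throws
  }
  return seen
}
```

Nesting two of these, as in the test above, just means capturing two different event streams over the same callback.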

Comment on lines +24 to +26
export function getTestQueue(): queue.InMemoryQueue<AutomationData> {
return getQueue() as unknown as queue.InMemoryQueue<AutomationData>
}
samwho (Collaborator, Author):

This is what I was trying to avoid in the previous PR, but I bit the bullet and did it. I wanted to just use methods on the Bull.Queue interface, but that was holding me back from being able to manually re-trigger cron messages.

@samwho samwho marked this pull request as ready for review February 18, 2025 11:17
@samwho samwho requested a review from a team as a code owner February 18, 2025 11:17
@samwho samwho requested review from mike12345567 and adrinr and removed request for a team February 18, 2025 11:17
return metadata.errorCount
} catch (error: any) {
err = error
await helpers.wait(1000 + Math.random() * 1000)
A collaborator commented:

Why the randomness?

samwho (Collaborator, Author) replied:

When multiple writers are fighting over the same document, a cheap way to reduce the chance they conflict is to have each one wait a random amount of time before retrying. It's not strictly necessary, but it makes repeated conflicts less likely.
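The retry-with-jitter pattern being discussed looks roughly like this (an assumed shape, not the exact helper under review; `withJitteredRetry` is an invented name): on failure, sleep a base delay plus a random extra so concurrent writers are unlikely to retry at the same instant.

```typescript
const wait = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms))

// Retry fn() up to `attempts` times, jittering the delay between attempts.
// baseMs of 1000 matches the 1000 + Math.random() * 1000 in the diff above.
async function withJitteredRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseMs = 1000
): Promise<T> {
  let err: unknown
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn()
    } catch (error) {
      err = error
      await wait(baseMs + Math.random() * baseMs) // desynchronises retries
    }
  }
  throw err
}
```

Each contender ends up retrying at a slightly different moment, so a document write conflict rarely repeats across all attempts.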

@samwho samwho enabled auto-merge February 18, 2025 14:03
@samwho samwho merged commit 670b409 into master Feb 18, 2025
20 checks passed
@samwho samwho deleted the automation-tests-9 branch February 18, 2025 14:10
@github-actions github-actions bot locked and limited conversation to collaborators Feb 18, 2025