-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Test cron removal. #15569
Test cron removal. #15569
Conversation
QA Wolf here! As you write new code it's important that your test coverage is keeping up. |
@@ -86,10 +87,13 @@ class InMemoryQueue implements Partial<Queue> { | |||
*/ | |||
async process(concurrencyOrFunc: number | any, func?: any) { | |||
func = typeof concurrencyOrFunc === "number" ? func : concurrencyOrFunc | |||
this._emitter.on("message", async message => { | |||
this._emitter.on("message", async msg => { | |||
const message = cloneDeep(msg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hilariously this is what was causing those loop.spec.ts
tests to pass without timing out. The orchestrator removes appId
from the job message, which mutates the original message, causing all future invocations of that message to fail. Preserving the appId
fixes these, but causes loop.spec.ts
to then start timing out because it's doing a lot more work than before.
this._emitter.on("message", async msg => { | ||
const message = cloneDeep(msg) | ||
|
||
const isManualTrigger = (message as any).manualTrigger === true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is part of the mechanism I'm using to trigger cron messages. On app publish they get queued without this manualTrigger
property, and in that instance we ignore them. Then later when we trigger them manually, they have this property attached to them to distinguish them from the initial message. It's a bit dirty, but results in a much more realistic test.
async add(data: T | string, optsOrT?: JobOptions | T) { | ||
if (typeof data === "string") { | ||
throw new Error("doesn't support named jobs") | ||
} | ||
|
||
const opts = optsOrT as JobOptions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes were required to do the correct generic typing on the class (InMemoryQueue<T>
as opposed to just InMemoryQueue
).
@@ -176,7 +186,7 @@ class InMemoryQueue implements Partial<Queue> { | |||
|
|||
async removeRepeatableByKey(id: string) { | |||
for (const [idx, message] of this._messages.entries()) { | |||
if (message.opts?.jobId?.toString() === id) { | |||
if (message.id === id) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In getRepeatableJobs
we transform each message to a JobInformation
object. This sets .key
to be the same as .id
. We're relying on that fact here, as messages are not stored with a .key
property.
try { | ||
await automations.logs.storeLog(automation, results) | ||
} catch (e: any) { | ||
if (e.status === 413 && e.request?.data) { | ||
// if content is too large we shouldn't log it | ||
delete e.request.data | ||
e.request.data = { message: "removed due to large size" } | ||
} | ||
logging.logAlert("Error writing automation log", e) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This handling was moved from the automation orchestrator to here because it seems to me that we should probably always log this when a log fails because it's too big.
const { automations } = await config.api.automation.fetch() | ||
for (const automation of automations) { | ||
await config.api.automation.delete(automation) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix for the timeouts mentioned earlier in this review. This makes it such that there are fewer other automations to trigger when working with each test's specific automation.
let results: Job<AutomationData>[] = [] | ||
const removed = await captureAutomationRemovals(automation, async () => { | ||
results = await captureAutomationResults(automation, async () => { | ||
for (let i = 0; i < MAX_AUTOMATION_RECURRING_ERRORS; i++) { | ||
triggerCron(message) | ||
} | ||
}) | ||
}) | ||
|
||
expect(removed).toHaveLength(1) | ||
expect(removed[0].id).toEqual(message.id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the meat of the PR, the test that makes sure jobs are removed from Bull when they fail >5 times. This nesting of capture*
functions is a bit ugly but this is the only place we have to do it, so I'm not super worried.
export function getTestQueue(): queue.InMemoryQueue<AutomationData> { | ||
return getQueue() as unknown as queue.InMemoryQueue<AutomationData> | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what I was trying to avoid in the previous PR, but I bit the bullet and did it. I wanted to just use methods on the Bull.Queue
interface but that was holding me back from being able to manually retrigger cron messages.
return metadata.errorCount | ||
} catch (error: any) { | ||
err = error | ||
await helpers.wait(1000 + Math.random() * 1000) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the randomness?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When multiple things are fighting to write to a document, a cheap way to make it less likely they'll conflict is to wait a random amount of time. It's not strictly necessary but makes conflicts less likely.
Description
In #15566 I was annoyed that I wasn't able to find a nice way to test that cron automations get removed correctly from Bull. In this PR, I think I've figured out a nice enough way to do it.
It boils down to capturing the cron message that goes into the queue and exposing a method to re-trigger that message in tests. With this, we ensure that when the job triggers, it's using the correct cron message, and this then correctly triggers the removal from the queue when it fails 5 times.