Fix post-reload trigger. #5104

hjoliver · 2022-08-26T06:21:23Z

Close #5102 - retriggering a failed task after reload should not cause weird job submission errors.

Check List

I have read CONTRIBUTING.md and added my name as a Code Contributor.
Contains logically grouped changes (else tidy your branch by rebase).
Does not contain off-topic changes (use other PRs for other changes).
Applied any dependency changes to both setup.cfg and conda-environment.yml.
Tests are included (or explain why tests are not needed).
CHANGES.md entry included if this is a change that can affect users
Cylc-Doc pull request opened if required at cylc/cylc-doc/pull/XXXX.
If this is a bug fix, PRs raised to both master and the relevant maintenance branch.

hjoliver · 2022-08-26T06:25:56Z

This works fine for the test case in both back-compat and normal mode. I'm not sure why the bug doesn't affect both modes on master.

oliver-sanders · 2022-08-26T14:28:02Z

I'm not too sure about this one; I definitely don't understand it yet.

Given this issue has cropped up at least twice I'm worried that this might not be the last we see of it.

MetRonnie

This does not currently fix the problem of the second submitted job having the old pre-reload config

cylc/flow/task_job_mgr.py

hjoliver · 2022-08-29T10:52:44Z

This does not currently fix the problem of the second submitted job having the old pre-reload config

Sorry, there was a second small change I'd neglected to commit and push to GH. It fixes what I think is probably the bug causing mysterious reload issues.

(Disclaimer: not extensively tested beyond the original example yet).

MetRonnie

Ok, have confirmed Dave's example now works as expected

cylc/flow/task_pool.py

hjoliver · 2022-09-07T07:07:51Z

@oliver-sanders - I think @MetRonnie is away - can you review or reassign, and consider getting this into 8.0.2?

tests/functional/reload/27-stall-retrigger/bin/stall-handler.sh

tests/functional/reload/27-stall-retrigger.t

oliver-sanders · 2022-09-12T10:58:36Z

I'm worried that this bug will keep resurfacing in different places since it's already cropped up a few times.

I took a look into re-implementing reload to update the tdef and modify the TaskProxy in place. This seems promising; however, it is risky (too risky for 8.0.2) and requires more thought.

I think this fix makes sense for now. From Ronnie's comment, though, it looks like the test is not capturing the issue?

hjoliver · 2022-09-14T05:26:03Z

I'm worried that this bug will keep resurfacing in different places since it's already cropped up a few times.

It's possible that other bugs will crop up that affect reload, of course. But this PR definitely fixes this particular bug.

oliver-sanders · 2022-09-15T12:41:14Z

It's possible that other bugs will crop up that affect reload, of course. But this PR definitely fixes this particular bug.

FYI the source of this bug is exactly the same as for previous reload fixes (i.e. two copies of the task, one pre and one post reload, one modified, one not) - #5102 (comment). This issue has already surfaced and been fixed in a couple of other places, my concern is that it could continue to crop up in more contexts.

For 8.0.x this is the right fix, anything else is time consuming and risky.

For 8.1.0 we should consider changing the reload implementation to modify the TaskProxy in place which would resolve the source of this issue.
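As a rough illustration of the "two copies" problem described above, here is a minimal sketch. It does not use Cylc's real classes: `ToyTaskDef`, `ToyTaskProxy`, `reload_by_replacement` and the `pool` dict are hypothetical stand-ins, and the assumption is only that reload builds a new proxy per task while some other component may still hold the old one.

```python
from dataclasses import dataclass


@dataclass
class ToyTaskDef:
    """Hypothetical stand-in for a task definition parsed from config."""
    name: str
    script: str


@dataclass
class ToyTaskProxy:
    """Hypothetical stand-in for a live task instance held in a pool."""
    tdef: ToyTaskDef
    state: str = "waiting"


def reload_by_replacement(pool: dict, new_defs: dict) -> None:
    """Replace every proxy in the pool with a new one built from the new definition."""
    for name, old in list(pool.items()):
        pool[name] = ToyTaskProxy(tdef=new_defs[name], state=old.state)


pool = {"foo": ToyTaskProxy(ToyTaskDef("foo", "echo old"), state="failed")}

# Assumption: some other component keeps its own reference to the proxy.
held_elsewhere = pool["foo"]

reload_by_replacement(pool, {"foo": ToyTaskDef("foo", "echo new")})

# A later change (for example a retrigger) touches only the pool's copy.
pool["foo"].state = "waiting"

print(pool["foo"].tdef.script)     # echo new
print(held_elsewhere.tdef.script)  # echo old
print(held_elsewhere.state)        # failed
```

Any code path that still uses `held_elsewhere` sees the pre-reload definition and the unmodified state, which is the general shape of the failure being discussed, not a reproduction of the actual bug.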

hjoliver · 2022-09-15T21:29:19Z

For 8.1.0 we should consider changing the reload implementation to modify the TaskProxy in place which would resolve the source of this issue.

Yes, point taken, but this PR fixes what is the fundamental bug in the current implementation: it sometimes caused the old task proxy not to be swapped out for the new one.

If we get a chance to reimplement for 8.1.0, good - otherwise this'll have to do.

I need to remind myself of why it is not done in place ...

oliver-sanders · 2022-09-20T15:26:25Z

From experimentation I can't see any reason for it not to be done in place; it's actually simpler.

The main thing to watch out for is cases where derived values are computed from the TaskDef: we need to ensure they are updated by the reload.
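A minimal sketch of the in-place alternative, including the derived-value caveat, might look like the following. Again these are hypothetical stand-ins rather than Cylc's real classes: `ToyTaskDef`, `ToyTaskProxy`, `reload_in_place`, `_derive` and the `max_tries` attribute are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class ToyTaskDef:
    """Hypothetical stand-in for a task definition parsed from config."""
    name: str
    retry_delays: list


class ToyTaskProxy:
    """Hypothetical stand-in for a live task instance."""

    def __init__(self, tdef: ToyTaskDef) -> None:
        self.tdef = tdef
        self.state = "waiting"
        self._derive()

    def _derive(self) -> None:
        # Value computed from the definition and cached on the proxy.
        self.max_tries = 1 + len(self.tdef.retry_delays)

    def reload_in_place(self, new_tdef: ToyTaskDef) -> None:
        # Same object, new definition: existing references stay valid.
        self.tdef = new_tdef
        # Without this call, max_tries would keep its pre-reload value.
        self._derive()


proxy = ToyTaskProxy(ToyTaskDef("foo", [10]))
alias = proxy  # a reference held by some other component
proxy.state = "failed"

proxy.reload_in_place(ToyTaskDef("foo", [10, 20, 30]))

print(alias is proxy)    # True
print(alias.max_tries)   # 4
print(alias.state)       # failed
```

Because only one object exists, there is no stale copy to go out of sync, but every cached value derived from the definition has to be recomputed explicitly at reload time.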

…l-failure.bk

* upstream/8.0.x:
  remote: ensure all remote commands use a platform config (cylc#5152)
  Db store force triggered (cylc#5023)
  Run GH Actions tests on push to `8.*.x` branches
  Auto bump dev version on release
  remote-install: add "ana/" to the default install list (cylc#5137)
  A no-flow task should not merge and retrigger incomplete children (cylc#5146)
  `log_vc_info`: Redirect diff straight to file to avoid blocking pipe (cylc#5139)
  fix reversed data-store edge source-target (cylc#5156)
  Fix Jinja2 support if HOME undefined.
  Assume Jinja2 might be used in global-tests.cylc.
  Fix post-reload trigger. (cylc#5104)
  Bump dev version
  Update changelog
  workflow_state xtrigger: infer run num
  Type annotations
  Prepare release 8.0.2
  Remove HOME Env Variable from get_remote_workflow_run_dir (cylc#5115)

hjoliver added the bug label Aug 26, 2022

hjoliver added this to the 8.0.2 milestone Aug 26, 2022

hjoliver requested a review from dpmatthews August 26, 2022 06:21

hjoliver self-assigned this Aug 26, 2022

MetRonnie reviewed Aug 26, 2022

cylc/flow/task_job_mgr.py

MetRonnie reviewed Aug 30, 2022

cylc/flow/task_pool.py (outdated)

MetRonnie linked an issue Aug 30, 2022 that may be closed by this pull request

Reload broken in compatibility mode #5102

Closed

hjoliver added 4 commits September 7, 2022 16:48

Fix post-reload trigger.

ac71f0e

Fix reload taskdef swap-out.

47f70c5

Address review comments.

736293e

Update change log.

8c6d2a1

hjoliver force-pushed the fix-5102 branch from e72905c to 0f3660d September 7, 2022 05:52

hjoliver marked this pull request as ready for review September 7, 2022 05:53

Add func test.

28a593b

hjoliver force-pushed the fix-5102 branch from 0f3660d to 28a593b September 7, 2022 06:00

hjoliver requested a review from oliver-sanders September 7, 2022 07:08

MetRonnie reviewed Sep 12, 2022

tests/functional/reload/27-stall-retrigger/bin/stall-handler.sh (outdated)

tests/functional/reload/27-stall-retrigger.t

oliver-sanders modified the milestones: cylc-8.0.2, cylc-8.0.3 Sep 12, 2022

Rename test file, add stdout message.

42febb4

Merge branch '8.0.x' into fix-5102

3a36cd2

hjoliver mentioned this pull request Sep 14, 2022

2022 Cylc Meetings cylc/cylc-admin#143

Closed

MetRonnie approved these changes Sep 14, 2022

oliver-sanders approved these changes Sep 15, 2022

oliver-sanders merged commit 3554401 into cylc:8.0.x Sep 15, 2022

hjoliver deleted the fix-5102 branch September 15, 2022 21:29
