reload: fix submission errors for jobs awaiting preparation (#4984)
* job: increment the submission number at preparation time
  * Addresses #4974
  * The job submission number used to be incremented *after* submission (i.e. only once there is a "submission" of which to speak).
  * However, we also incremented the submission number if submission (or preparation) failed (in which cases there isn't really a "submission", but we need one for internal purposes).
  * Now the submission number is incremented when tasks enter the "preparing" state.
  * This resolves an issue where jobs which were going through the submission pipeline during a reload got badly broken in the scheduler (until restarted).
* scheduler: re-compute pre_prep_tasks for each iteration
  * Addresses #4974
  * Tasks which are awaiting job preparation used to be stored in `Scheduler.pre_prep_tasks`. However, this effectively created an intermediate "task pool" which had nasty interactions with reload.
  * This commit removes the pre_prep_tasks list by merging the listing of these tasks in with `TaskPool.release_queued_tasks` (to avoid unnecessary task pool iteration).
  * `waiting_on_job_prep` now defaults to `False` rather than `True`.
* platforms: don't re-try, re-attempt submission
  * Previously, if submission on a host failed with return code 255 (SSH error), we put a submission retry on the task to allow it to retry on another host, and decremented the submission number to make it look like the same attempt.
  * Now we set the flag which sends the task back through the submission pipeline, allowing it to retry without intermediate state changes.
* changelog [skip ci]

Co-authored-by: Tim Pillinger <[email protected]>
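
The behaviour described in the commit message can be illustrated with a minimal sketch. This is not the actual Cylc code: the `Task` class, the state strings, the `submit` helper and the list-based pool are all hypothetical stand-ins, and only the names `release_queued_tasks` and `waiting_on_job_prep` are borrowed from the message. It assumes that the submission number is bumped on entry to "preparing", that the pre-prep list is recomputed on every call rather than stored, and that a 255 failure simply re-sets the flag.

```python
from dataclasses import dataclass
from typing import List

SSH_ERROR = 255  # return code the commit message associates with an SSH error


@dataclass
class Task:
    """Hypothetical stand-in for a task; not Cylc's real class."""
    name: str
    state: str = "waiting"
    submit_num: int = 0
    waiting_on_job_prep: bool = False  # defaults to False, as in the commit


def release_queued_tasks(pool: List[Task]) -> List[Task]:
    """Release waiting tasks and return every task awaiting job preparation.

    The returned list is rebuilt on each call, so no separate
    "pre_prep_tasks" list has to survive between iterations.
    """
    for task in pool:
        if task.state == "waiting":
            task.state = "preparing"
            task.submit_num += 1  # incremented at preparation time
            task.waiting_on_job_prep = True
    return [task for task in pool if task.waiting_on_job_prep]


def submit(task: Task, return_code: int) -> None:
    """Apply the (simulated) outcome of one submission attempt."""
    task.waiting_on_job_prep = False
    if return_code == SSH_ERROR:
        # Re-attempt: send the task back through the pipeline with the
        # same state and the same submission number.
        task.waiting_on_job_prep = True
    elif return_code == 0:
        task.state = "submitted"
    else:
        task.state = "submit-failed"


if __name__ == "__main__":
    pool = [Task("foo")]
    for code in (SSH_ERROR, 0):  # first host fails over SSH, second succeeds
        for task in release_queued_tasks(pool):
            submit(task, code)
    print(pool[0])
```

In this sketch the task keeps submission number 1 across the failed and the successful attempt, which is the effect the old code achieved by decrementing the number after adding a retry.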
1 parent ce2f4cb, commit e085c22
Showing 7 changed files with 76 additions and 68 deletions.