Only trigger job failed to start once
Trigger the "job failed to start" state only when the
first process that fails to start reports. This avoids a
"bounce" effect that causes the job object to be released
multiple times.

Signed-off-by: Ralph Castain <[email protected]>
(cherry picked from commit a386514)
rhc54 committed Feb 25, 2024
1 parent f1a4222 commit 7b80594
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion src/mca/errmgr/dvm/errmgr_dvm.c
@@ -488,14 +488,14 @@ static void proc_errors(int fd, short args, void *cbdata)
             PRTE_FLAG_SET(jdata, PRTE_JOB_FLAG_ABORTED);
             /* kill the job */
             _terminate_job(jdata->nspace);
+            PRTE_ACTIVATE_JOB_STATE(jdata, PRTE_JOB_STATE_FAILED_TO_START);
         }
         /* if this was a daemon, report it */
         if (PMIX_CHECK_NSPACE(jdata->nspace, PRTE_PROC_MY_NAME->nspace)) {
             /* output a message indicating we failed to launch a daemon */
             pmix_show_help("help-errmgr-base.txt", "failed-daemon-launch",
                            true, prte_tool_basename);
         }
-        PRTE_ACTIVATE_JOB_STATE(jdata, PRTE_JOB_STATE_FAILED_TO_START);
         break;
 
     case PRTE_PROC_STATE_CALLED_ABORT:
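
The "first reporter wins" guard described in the commit message can be illustrated with a small standalone sketch. This is not PRRTE code: `demo_job_t`, `demo_activate_failed_to_start`, `demo_proc_failed_to_start`, and `demo_run` are hypothetical names, and the reference count is a simplified stand-in for the job object's lifetime. The sketch assumes that each activation of the failed-to-start state eventually releases the job object once, which is what makes repeated activation harmful.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a job object; NOT the PRRTE job type. */
typedef struct {
    bool aborted;     /* plays the role of PRTE_JOB_FLAG_ABORTED */
    int  refcount;    /* dropped once per state activation (assumption) */
    int  activations; /* how many times the failed-to-start state fired */
} demo_job_t;

/* Hypothetical stand-in for activating the failed-to-start job state.
 * Assumption: every activation leads to exactly one release of the job. */
static void demo_activate_failed_to_start(demo_job_t *job)
{
    job->activations++;
    job->refcount--;
}

/* Called once for every process that reports a failure to start. */
static void demo_proc_failed_to_start(demo_job_t *job)
{
    if (!job->aborted) {
        /* first reporter: mark the job and activate the state once */
        job->aborted = true;
        demo_activate_failed_to_start(job);
    }
    /* later reporters fall through without activating the state again */
}

/* Simulate nprocs failure reports against one job and return the
 * final reference count. A negative value would mean over-release. */
static int demo_run(int nprocs)
{
    demo_job_t job = { .aborted = false, .refcount = 1, .activations = 0 };
    for (int i = 0; i < nprocs; i++) {
        demo_proc_failed_to_start(&job);
    }
    assert(job.activations <= 1);
    return job.refcount;
}
```

With the activation inside the guard, `demo_run(4)` returns 0: the object is released exactly once no matter how many processes report. Moving the call to `demo_activate_failed_to_start` outside the `if` would drop the count once per report, which is the over-release the commit message calls a "bounce".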
