UI does not support ability to Start/Restart failed Allocation and tasks #9881
Comments
I am definitely seeing what you're seeing, but I think this is by design. If you attempt this same workflow from the CLI, you'll get an error message. I suspect the original design is that if a task isn't running, it shouldn't be resurrected like this. Once a task is terminal, it is always terminal, and the scheduler is free to use those resources elsewhere. Side-stepping this axiom has implications for rescheduling behavior, preemption behavior, and scheduling in general. To avoid this, you have two options:
This isn't exactly my area of expertise, so I want to verify that this is indeed the intended design before closing.
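For context, the CLI path referenced above looks roughly like the sketch below; the allocation ID is hypothetical, and the exact error text for a terminal allocation is not reproduced here.

```shell
# Confirm the allocation is in the "failed" (terminal) state.
nomad alloc status 8a3b2f1c

# Attempt the same restart from the CLI; per the comment above,
# this is rejected once the allocation is terminal.
nomad alloc restart 8a3b2f1c
```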
Alright, after chatting with @cgbaker, this is indeed not a supported use case. I hope these two alternative options help you out. If it still feels like something is missing, please feel free to describe the workflow you're looking for here.
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
Nomad version
Output from `nomad version`:
Nomad v1.0.1
Operating system and Environment details
CentOS 8
Issue
While an allocation and its tasks are in the running state, the UI offers a way to stop/restart the allocation or restart a task. However, when an allocation is in the failed state (the maximum number of restart attempts has been reached), there is no way to start/restart the failed allocation or its tasks from the UI. The only workarounds are to mark the Nomad client as ineligible and then toggle it back to eligible to restart the allocation, or to stop/start the whole job (sketched below).
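A minimal sketch of the workarounds described above, assuming a hypothetical node ID and the job/file names used later in this report; both the eligibility toggle and the stop/re-run path use standard Nomad CLI commands.

```shell
# Workaround 1: toggle the client's scheduling eligibility off and back on
# (node ID is hypothetical; look it up with `nomad node status`).
nomad node eligibility -disable 9d7f2c4a
nomad node eligibility -enable 9d7f2c4a

# Workaround 2: stop and re-submit the whole job.
nomad job stop myJob
nomad job run myJob.nomad
```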
Reproduction steps
Run a job with a failing task and let the allocation fail after the maximum number of restart attempts has been reached. The allocation will be in the failed state, with no start/restart options available in the UI.
Our intention is to use Nomad to manage our services with the raw_exec driver, as a replacement for systemd.
Job file (if appropriate)
Here is an example of the job file we are using, with some generic names:
`job "myJob" {
datacenters = ["dc1"]
type = "system"
group "myGroup" {
constraint {
attribute = "${meta.nodeId}"
value = "node1"
}
}
}`
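The snippet above omits the task body, so here is a hedged sketch of what it might contain, assuming the raw_exec driver mentioned earlier; the task name, command path, and restart values are placeholders rather than values from the original report.

```hcl
# Goes inside group "myGroup" above; all values are illustrative.
task "myTask" {
  driver = "raw_exec"

  config {
    command = "/opt/myService/bin/run.sh" # placeholder path
  }

  # With mode = "fail", the allocation ends up in the "failed" state
  # described above once these restart attempts are exhausted.
  restart {
    attempts = 2
    interval = "5m"
    delay    = "15s"
    mode     = "fail"
  }
}
```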
Screenshot of running alloc / failing alloc (image omitted)
Screenshot of running task / failing task (image omitted)