Merge pull request #2149 from flyhighzy/preempt-stable-time
add cooldown protection plugin
Showing 10 changed files with 496 additions and 7 deletions.
@@ -0,0 +1,201 @@
# Cooldown Protection Plugin User Guide

## Background
When elastic training or serving is enabled, the pods of a preemptible job can be preempted and returned to running repeatedly. Without cooldown protection, these pods can be preempted again shortly after they start, which may reduce service stability.
The `cdp` plugin ensures that the pods of a preemptible job can run for at least a user-specified period before they can be preempted again.

## Environment setup

### Install volcano

Refer to [Install Guide](../../installer/README.md) to install volcano.

### Update scheduler configmap

After installation, update the scheduler configuration:

```shell
kubectl edit configmap -n volcano-system volcano-scheduler-configmap
```

Register the `cdp` plugin in the configmap and enable the `preempt` action:

```yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: volcano-scheduler-configmap
  namespace: volcano-system
data:
  volcano-scheduler.conf: |
    actions: "enqueue, allocate, preempt, backfill"
    tiers:
    - plugins:
      - name: priority
      - name: gang
      - name: conformance
      - name: cdp
    - plugins:
      - name: drf
      - name: predicates
      - name: task-topology
        arguments:
          task-topology.weight: 10
      - name: proportion
      - name: nodeorder
      - name: binpack
```
### Running Jobs
Take a simple volcano job as an example.
The original job YAML is shown below; it has a "ps" task and a "worker" task.
```yaml
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: test-job
spec:
  minAvailable: 3
  schedulerName: volcano
  priorityClassName: high-priority
  plugins:
    ssh: []
    env: []
    svc: []
  maxRetry: 5
  queue: default
  volumes:
    - mountPath: "/myinput"
    - mountPath: "/myoutput"
      volumeClaimName: "testvolumeclaimname"
      volumeClaim:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: "my-storage-class"
        resources:
          requests:
            storage: 1Gi
  tasks:
    - replicas: 6
      name: "worker"
      template:
        metadata:
          name: worker
        spec:
          containers:
            - image: nginx
              imagePullPolicy: IfNotPresent
              name: nginx
              resources:
                requests:
                  cpu: "1"
          restartPolicy: OnFailure
    - replicas: 2
      name: "ps"
      template:
        metadata:
          name: ps
        spec:
          containers:
            - image: nginx
              imagePullPolicy: IfNotPresent
              name: nginx
              resources:
                requests:
                  cpu: "1"
          restartPolicy: OnFailure
```

#### Edit yaml of vcjob

1. Add annotations to the volcano job in the format below.
   1. The `volcano.sh/preemptable` annotation indicates that the job or task is preemptable.
   2. The `volcano.sh/cooldown-time` annotation sets the cooldown time for the entire job or for a dedicated task. The value is a duration; valid time units are "ns", "us" (or "µs"), "ms", "s", "m", "h".

```yaml
volcano.sh/preemptable: "true"
volcano.sh/cooldown-time: "600s"
```
**Example 1**
Add the annotations to the entire job. Both the "ps" and "worker" tasks can then be preempted, and both get cooldown time protection.
```yaml
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: test-job
  annotations:
    volcano.sh/preemptable: "true"
    volcano.sh/cooldown-time: "600s"
spec:
  ... # the rest stays the same
```
**Example 2**
Add the annotations to a dedicated task. As shown below, only the "worker" task can be preempted and gets cooldown time protection.
```yaml
apiVersion: batch.volcano.sh/v1alpha1
kind: Job
metadata:
  name: test-job
spec:
  minAvailable: 3
  schedulerName: volcano
  priorityClassName: high-priority
  plugins:
    ssh: []
    env: []
    svc: []
  maxRetry: 5
  queue: default
  volumes:
    - mountPath: "/myinput"
    - mountPath: "/myoutput"
      volumeClaimName: "testvolumeclaimname"
      volumeClaim:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: "my-storage-class"
        resources:
          requests:
            storage: 1Gi
  tasks:
    - replicas: 6
      name: "worker"
      template:
        metadata:
          name: worker
          annotations: # add annotations to the task
            volcano.sh/preemptable: "true"
            volcano.sh/cooldown-time: "600s"
        spec:
          containers:
            - image: nginx
              imagePullPolicy: IfNotPresent
              name: nginx
              resources:
                requests:
                  cpu: "1"
          restartPolicy: OnFailure
    - replicas: 2
      name: "ps"
      template:
        metadata:
          name: ps
        spec:
          containers:
            - image: nginx
              imagePullPolicy: IfNotPresent
              name: nginx
              resources:
                requests:
                  cpu: "1"
          restartPolicy: OnFailure
```
@@ -0,0 +1,112 @@
/*
Copyright 2022 The Volcano Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
    http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package cdp

import (
	"time"

	v1 "k8s.io/api/core/v1"
	"k8s.io/klog"

	"volcano.sh/apis/pkg/apis/scheduling/v1beta1"
	"volcano.sh/volcano/pkg/scheduler/api"
	"volcano.sh/volcano/pkg/scheduler/framework"
	"volcano.sh/volcano/pkg/scheduler/plugins/util"
)

const (
	// PluginName is the name of the cooldown protection (cdp) plugin.
	// See https://github.com/volcano-sh/volcano/issues/2075.
	// The plugin is related to the elastic scheduler: when elastic training or
	// serving is enabled, the pods of a preemptible job can be preempted and
	// returned to running repeatedly. Without cooldown protection, these pods
	// can be preempted again shortly after they start, which may reduce
	// service stability.
	// The cdp plugin ensures that a vcjob's pods cannot be preempted while a
	// cooldown protection condition holds.
	// Currently the cdp plugin only supports cooldown time protection.
	PluginName = "cdp"
)

// CooldownProtectionPlugin keeps preemptible pods from being preempted
// during their cooldown time.
type CooldownProtectionPlugin struct {
}

// New returns a CooldownProtectionPlugin.
func New(arguments framework.Arguments) framework.Plugin {
	return &CooldownProtectionPlugin{}
}

// Name implements framework.Plugin
func (*CooldownProtectionPlugin) Name() string {
	return PluginName
}

// podCooldownTime returns the cooldown time configured on the pod. Labels are
// checked first, then annotations. enabled is false when no value is set or
// the value is not a valid duration.
func (sp *CooldownProtectionPlugin) podCooldownTime(pod *v1.Pod) (value time.Duration, enabled bool) {
	// check labels and annotations
	v, ok := pod.Labels[v1beta1.CooldownTime]
	if !ok {
		v, ok = pod.Annotations[v1beta1.CooldownTime]
		if !ok {
			return 0, false
		}
	}
	vi, err := time.ParseDuration(v)
	if err != nil {
		klog.Warningf("invalid time duration %s=%s", v1beta1.CooldownTime, v)
		return 0, false
	}
	return vi, true
}

// OnSessionOpen implements framework.Plugin
func (sp *CooldownProtectionPlugin) OnSessionOpen(ssn *framework.Session) {
	preemptableFn := func(preemptor *api.TaskInfo, preemptees []*api.TaskInfo) ([]*api.TaskInfo, int) {
		var victims []*api.TaskInfo
		for _, preemptee := range preemptees {
			cooldownTime, enabled := sp.podCooldownTime(preemptee.Pod)
			if !enabled {
				// no cooldown time configured: the pod is always a candidate
				victims = append(victims, preemptee)
				continue
			}
			pod := preemptee.Pod
			// Only running pods are checked against the cooldown time;
			// pods in any other phase are always added to victims.
			stableFiltered := false
			if pod.Status.Phase == v1.PodRunning {
				// Use the last transition time of the PodScheduled condition
				// as the start of the cooldown window.
				for _, c := range pod.Status.Conditions {
					if c.Type == v1.PodScheduled && c.Status == v1.ConditionTrue {
						if c.LastTransitionTime.Add(cooldownTime).After(time.Now()) {
							stableFiltered = true
						}
						break
					}
				}
			}
			if !stableFiltered {
				victims = append(victims, preemptee)
			}
		}

		klog.V(4).Infof("Victims from cdp plugins are %+v", victims)
		return victims, util.Permit
	}

	klog.V(4).Info("plugin cdp session open")
	ssn.AddPreemptableFn(sp.Name(), preemptableFn)
}

// OnSessionClose implements framework.Plugin
func (*CooldownProtectionPlugin) OnSessionClose(ssn *framework.Session) {}