Volcano is a batch system built on Kubernetes. It provides a suite of mechanisms that Kubernetes currently lacks and that many classes of batch and elastic workloads commonly require. With the Volcano integration, the Flink job manager and task managers can be scheduled simultaneously as a group, which is particularly useful in clusters that are short on resources.
- Install from provided demo
Run the following:
$ kubectl apply -f https://raw.githubusercontent.com/volcano-sh/volcano/master/installer/volcano-development.yaml
- Install with advanced settings
Please refer to the Volcano Official Guide.
$ kubectl get pod -n volcano-system
pod/volcano-admission-75688c79bf-b8fmj 1/1 Running 0 52s
pod/volcano-admission-init-d684j 0/1 Completed 0 53s
pod/volcano-controllers-d87bdbd7c-q6ds6 1/1 Running 0 52s
pod/volcano-scheduler-5476779fd9-8rslv 1/1 Running 0 52s
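The readiness check above can also be scripted. The following is a minimal sketch, not part of Volcano or the operator: `all_pods_ready` is a hypothetical helper that reads `kubectl get pod` rows on stdin and succeeds only if every pod is either `Completed` or `Running` with all of its containers ready.

```shell
#!/bin/sh
# Hypothetical helper (not part of Volcano): reads `kubectl get pod` rows on
# stdin (columns: NAME READY STATUS ...) and returns 0 only if every pod is
# Completed, or Running with all of its containers ready.
all_pods_ready() {
  awk '
    $1 == "NAME" { next }            # skip the header row if present
    NF < 3       { next }            # skip blank or malformed lines
    {
      seen = 1
      split($2, r, "/")              # READY column looks like "1/1"
      if ($3 == "Completed") next    # init jobs finish with 0/1 Completed
      if ($3 != "Running" || r[1] != r[2]) bad = 1
    }
    END {
      if (seen && !bad) exit 0       # at least one pod, none unhealthy
      exit 1
    }
  '
}

# Usage against a live cluster:
#   kubectl get pod -n volcano-system --no-headers | all_pods_ready \
#     && echo "Volcano is up"
```

An empty pod list is treated as a failure, so the check does not pass before the Volcano pods have been created.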
Please refer to "Deploy the operator to a Kubernetes cluster".
Create a sample Flink job cluster with:
$ kubectl apply -f config/samples/flinkoperator_v1beta1_flinkjobcluster_volcano.yaml --validate=false
and verify that the pods are up and running with:
$ kubectl get pod,svc |grep flinkjobcluster
pod/flinkjobcluster-sample-job-xt4k7 1/1 Running 0 34s
pod/flinkjobcluster-sample-jobmanager-6c955f9b4-6mfvk 1/1 Running 0 65s
pod/flinkjobcluster-sample-taskmanager-77c7bb8778-hmvzm 1/1 Running 0 65s
pod/flinkjobcluster-sample-taskmanager-77c7bb8778-rp9m4 1/1 Running 0 65s
service/flinkjobcluster-sample-jobmanager ClusterIP <none> 6123/TCP,6124/TCP,6125/TCP,8081/TCP 65s
Verify that the job manager and task managers are scheduled by Volcano:
$ kubectl get podgroup flink-flinkjobcluster-sample -oyaml
apiVersion: scheduling.volcano.sh/v1beta1
kind: PodGroup
metadata:
  creationTimestamp: "2020-06-29T03:39:48Z"
  generation: 5
  name: flink-flinkjobcluster-sample
  namespace: default
  ownerReferences:
  - apiVersion: flinkoperator.k8s.io/v1beta1
    blockOwnerDeletion: false
    controller: true
    kind: FlinkCluster
    name: flinkjobcluster-sample
    uid: dfb78a1b-6eeb-4bc8-89c5-73a8de4f53e8
  resourceVersion: "70799"
  selfLink: /apis/scheduling.volcano.sh/v1beta1/namespaces/default/podgroups/flink-flinkjobcluster-sample
  uid: a87d9b05-7d33-4529-8e18-f7d6eb42e7aa
spec:
  minMember: 3
  minResources:
    cpu: 600m
    memory: 3Gi
status:
  phase: Running
  running: 3
As shown above, the podgroup has three pods in the running phase (one job manager and two task managers), and the minimum required number is 3. This means that if the cluster does not have enough resources to run the job manager and all task managers together, none of them is scheduled.
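The condition described above is a comparison between the podgroup's running count and its `minMember`. The following is a minimal sketch of that check, assuming the two values live under `spec.minMember` and `status.running` as in the Volcano PodGroup API; `gang_satisfied` is a hypothetical helper name, not something the operator provides.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the number of running pods has reached
# the PodGroup's minMember. Arguments: <minMember> [running].
gang_satisfied() {
  min=$1
  running=${2:-0}   # treat a missing running count as 0
  [ "$running" -ge "$min" ]
}

# Usage against a live cluster (field paths are an assumption, see above):
#   set -- $(kubectl get podgroup flink-flinkjobcluster-sample \
#            -o jsonpath='{.spec.minMember} {.status.running}')
#   gang_satisfied "$@" && echo "job manager and task managers are all running"
```

For the sample cluster, `gang_satisfied 3 3` succeeds, while `gang_satisfied 3 2` fails because one of the three required pods is not yet running.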
You can also check that the scheduler name of the job manager and task manager pods is now set to volcano:
$ kubectl get pod flinkjobcluster-sample-jobmanager-6c955f9b4-6mfvk -o jsonpath='{.spec.schedulerName}'
$ kubectl get pod flinkjobcluster-sample-taskmanager-77c7bb8778-hmvzm -o jsonpath='{.spec.schedulerName}'
Note: the job's pod is not included in the podgroup.
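Rather than querying each pod by name, the scheduler name can be listed for all pods at once and filtered. The following is a minimal sketch under that assumption; `pods_not_scheduled_by` is a hypothetical helper that reads `<pod-name> <schedulerName>` rows on stdin and prints the pods that use a different scheduler than the expected one.

```shell
#!/bin/sh
# Hypothetical helper: reads "<pod-name> <schedulerName>" rows on stdin and
# prints the names of pods whose scheduler differs from the expected one
# (first argument, default "volcano").
pods_not_scheduled_by() {
  awk -v want="${1:-volcano}" 'NF >= 2 && $2 != want { print $1 }'
}

# Usage against a live cluster:
#   kubectl get pod --no-headers \
#     -o custom-columns=NAME:.metadata.name,SCHEDULER:.spec.schedulerName \
#     | grep flinkjobcluster | pods_not_scheduled_by volcano
```

Empty output means every listed pod uses the expected scheduler.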
You can also create a sample session cluster with:
$ kubectl apply -f config/samples/flinkoperator_v1beta1_flinksessioncluster_volcano.yaml --validate=false