
Add GPU Numbers Predicates #1692

Merged
merged 1 commit into from
Aug 15, 2022

Conversation

peiniliu
Contributor

Support specifying GPU numbers for pod resource requests (issue #1440)

Currently, Volcano only supports specifying GPU shared memory; requesting a specific number of GPUs is not supported. This PR adds support for defining GPU numbers in pod resource requests. See the design doc https://github.com/peiniliu/volcano/blob/dev/docs/user-guide/how_to_use_gpu_number.md for more details.

@volcano-sh-bot
Contributor

Welcome @peiniliu!

It looks like this is your first PR to volcano-sh/volcano 🎉.

Thank you, and welcome to Volcano. 😃

@volcano-sh-bot volcano-sh-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Aug 23, 2021
@jasonliu747
Member

Hi @peiniliu , thanks for your contribution. Please fix the DCO check.

@Thor-wl Thor-wl requested review from Thor-wl, jasonliu747 and william-wang and removed request for hudson741 and hzxuzhonghu August 24, 2021 02:49
@william-wang
Member

@jasonliu747 I really hope this feature can be included in v1.4. I don't have a GPU environment; do you have one for verification?

@jasonliu747
Member

@william-wang let me double check and get back to you!

@volcano-sh-bot volcano-sh-bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 27, 2021
@peiniliu
Contributor Author

fix the DCO check.

Hi Jason,

Thanks! I've fixed the DCO.

Please let me know if anything else is needed.

Best,

Peini

Member

@shinytang6 shinytang6 left a comment

Please rebase onto master and push again; this PR contains too many previous commits.

@william-wang
Member

@peiniliu Rebase and submit the pr again.

@volcano-sh-bot volcano-sh-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Aug 30, 2021
@peiniliu
Contributor Author

@jasonliu747 @william-wang
Hi,
the pr has been updated!
Best,
Peini

@peiniliu peiniliu force-pushed the dev branch 3 times, most recently from edf8719 to 6db3361 Compare August 31, 2021 10:05
@zamog

zamog commented Sep 9, 2021

Will this PR fix #1686?

@Thor-wl
Contributor

Thor-wl commented Oct 15, 2021

Will this PR fix #1686?

@peiniliu

@peiniliu
Contributor Author

This PR lets users define GPU number predicates via container-level resource requests using 'volcano.sh/gpu-numbers'. For the issue you mentioned, take a look at the queue and the preempt or reclaim actions.
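As a concrete illustration of the comment above, a pod could request whole GPU cards at the container level. This is a minimal sketch, not a tested manifest; the resource key spelling follows this comment, and the pod name and container image are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-number-demo        # placeholder name
spec:
  schedulerName: volcano
  containers:
    - name: cuda
      image: nvidia/cuda:11.0-base   # placeholder image
      resources:
        limits:
          volcano.sh/gpu-numbers: 2  # request two whole GPU cards
```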

@Thor-wl Thor-wl requested a review from hwdef November 27, 2021 03:53

The main architecture is similar to the previous one, but the gpu-index result for each pod will be a list of GPU card indices.

![gpu_number](../images/gpu-number.png)
Member

Kebe-API-Server => Kube-API-Server

@peiniliu
Contributor Author

I have addressed those review comments.

@@ -46,6 +46,8 @@ Same as above, after installed, update the scheduler configuration in `volcano-s

Please refer to [volcano device plugin](https://github.com/volcano-sh/devices/blob/master/README.md#quick-start)

* By default the volcano device plugin supports shared GPUs, so users do not need to configure it; the default setting is the same as setting --gpu-strategy=number. For more information, see [volcano device plugin configuration](https://github.com/volcano-sh/devices/blob/dev/doc/config.md)

Member

the link is 404

Member

@shinytang6 shinytang6 left a comment

Generally LGTM.
/cc @Thor-wl @william-wang Please take another look

@volcano-sh-bot volcano-sh-bot requested a review from Thor-wl July 27, 2022 08:22
@volcano-sh-bot
Contributor

@shinytang6: GitHub didn't allow me to request PR reviews from the following users: another, look, Please, take.

Note that only volcano-sh members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

Generally LGTM.
/cc @Thor-wl @william-wang Please take another look

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Contributor

@Thor-wl Thor-wl left a comment

BTW, is the comment about initContainer resolved?

@peiniliu
Contributor Author

BTW, is the comment about initContainer resolved?

As explained, the device plugin currently does not handle multiple containers per pod, and an initContainer is designed for startup scripts that may not request GPUs. The important part is the device plugin: once it supports this, GPU usage for initContainers can be added.

@Thor-wl
Contributor

Thor-wl commented Jul 28, 2022

BTW, is the comment about initContainer resolved?

As explained, the device plugin currently does not handle multiple containers per pod, and an initContainer is designed for startup scripts that may not request GPUs. The important part is the device plugin: once it supports this, GPU usage for initContainers can be added.

IC, that's OK for me.

@Thor-wl
Contributor

Thor-wl commented Jul 28, 2022

/lgtm

@volcano-sh-bot volcano-sh-bot added the lgtm Indicates that a PR is ready to be merged. label Jul 28, 2022
@volcano-sh-bot volcano-sh-bot removed the lgtm Indicates that a PR is ready to be merged. label Jul 29, 2022
@peiniliu
Contributor Author

I removed some unrelated info in the doc after the presentation.

@Thor-wl
Contributor

Thor-wl commented Jul 30, 2022

/lgtm

@volcano-sh-bot volcano-sh-bot added the lgtm Indicates that a PR is ready to be merged. label Jul 30, 2022
@volcano-sh-bot volcano-sh-bot removed the lgtm Indicates that a PR is ready to be merged. label Aug 12, 2022
@volcano-sh-bot volcano-sh-bot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Aug 15, 2022
Member

@shinytang6 shinytang6 left a comment

/lgtm

@volcano-sh-bot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: shinytang6, Thor-wl

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@volcano-sh-bot volcano-sh-bot merged commit aa84a7d into volcano-sh:master Aug 15, 2022

id := predicateGPU(pod, nodeInfo)
if id < 0 {
ids := predicateGPUbyMemory(pod, nodeInfo)


A shared-GPU pod only needs one GPU id, but predicateGPUbyMemory returns all GPU ids that are suitable for the pod.

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.