Add GPU Numbers Predicates #1692
Welcome @peiniliu!
Hi @peiniliu, thanks for your contribution. Please fix the DCO check.
@jasonliu747 I really hope this feature can be included in v1.4. I don't have a GPU environment; do you have one for verification?
@william-wang let me double check and get back to you!
Hi Jason, thanks! I've fixed the DCO. Please let me know if anything further is needed. Best, Peini
Please rebase onto master and push again; this PR contains too many previous commits.
@peiniliu Please rebase and submit the PR again.
@jasonliu747 @william-wang
Force-pushed from edf8719 to 6db3361.
Will this PR fix #1686?
This PR lets users define GPU number predicates through container-level resource requirements, using 'volcano.sh/gpu-numbers'. For the issue you mention, you may want to look at the queue and at the preempt or reclaim actions.
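To make the container-level request concrete, here is a minimal Go sketch of how a pod's total GPU-number request could be derived from its containers. The `Container` type and `requestedGPUNumber` helper are hypothetical stand-ins written for illustration, not Volcano's actual types or code; only the resource name is taken from the comment above.

```go
package main

import "fmt"

// gpuNumberResource is the resource name quoted in the comment above.
const gpuNumberResource = "volcano.sh/gpu-numbers"

// Container is a hypothetical, simplified container spec. Only the
// resource limits needed for this sketch are modelled.
type Container struct {
	Name   string
	Limits map[string]int
}

// requestedGPUNumber sums the GPU-number limits over all containers.
// Containers that do not set the resource contribute zero.
func requestedGPUNumber(containers []Container) int {
	total := 0
	for _, c := range containers {
		total += c.Limits[gpuNumberResource] // a missing key (or nil map) reads as 0
	}
	return total
}

func main() {
	pod := []Container{
		{Name: "trainer", Limits: map[string]int{gpuNumberResource: 2}},
		{Name: "sidecar"}, // requests no GPUs
	}
	fmt.Println("GPUs requested by pod:", requestedGPUNumber(pod))
}
```

Summing across containers is an assumption of this sketch; the predicate in the PR may aggregate requests differently.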
The main architecture is similar as the previous, but the gpu-index results of each pod will be a list of gpu cards index.

![gpu_number](../images/gpu-number.png)
Kebe-API-Server => Kube-API-Server
I fixed those comments.
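The doc line reviewed in this thread says the gpu-index result for a pod becomes a list of card indexes. A minimal Go sketch of that idea follows, assuming a simple first-fit choice over idle cards. The `GPUDevice` type and `pickGPUs` function are hypothetical and are not taken from the PR.

```go
package main

import "fmt"

// GPUDevice is a hypothetical view of one GPU card on a node.
type GPUDevice struct {
	ID    int
	InUse bool
}

// pickGPUs returns the IDs of the first n idle cards in device order.
// It returns nil when n is not positive or fewer than n cards are idle,
// which a predicate could treat as "node does not fit".
func pickGPUs(devices []GPUDevice, n int) []int {
	if n <= 0 {
		return nil
	}
	ids := make([]int, 0, n)
	for _, d := range devices {
		if d.InUse {
			continue
		}
		ids = append(ids, d.ID)
		if len(ids) == n {
			return ids
		}
	}
	return nil
}

func main() {
	node := []GPUDevice{
		{ID: 0, InUse: true},
		{ID: 1},
		{ID: 2},
		{ID: 3},
	}
	fmt.Println(pickGPUs(node, 2)) // card indexes chosen for a 2-GPU pod
	fmt.Println(pickGPUs(node, 4)) // not enough idle cards
}
```

First-fit is only one possible policy; the actual selection strategy is whatever the PR's predicate implements.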
@@ -46,6 +46,8 @@ Same as above, after installed, update the scheduler configuration in `volcano-s

Please refer to [volcano device plugin](https://github.com/volcano-sh/devices/blob/master/README.md#quick-start)

* By default volcano device plugin supports shared GPUs, users do not need to config volcano device plugin. Default setting is the same as setting --gpu-strategy=number. For more information [volcano device plugin configuration](https://github.com/volcano-sh/devices/blob/dev/doc/config.md)
The link returns a 404.
Generally LGTM.
/cc @Thor-wl @william-wang Please take another look
@shinytang6: GitHub didn't allow me to request PR reviews from the following users: another, look, Please, take. Note that only volcano-sh members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
BTW, is the comment about initContainer resolved?
As explained, the device plugin currently does not support multiple containers. Also, initContainers are designed for startup scripts, which may not request GPUs. I think the important part is the device plugin: once it supports this, GPU usage for initContainers can be added.
IC, that's OK for me.
/lgtm
I removed some unrelated info from the doc after the presentation.
/lgtm
Signed-off-by: peiniliu <[email protected]>
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: shinytang6, Thor-wl
The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
id := predicateGPU(pod, nodeInfo)
if id < 0 {
ids := predicateGPUbyMemory(pod, nodeInfo)
A shared-GPU pod only needs one GPU ID, but predicateGPUbyMemory returns all GPU IDs that are suitable for the pod.
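The difference raised in this review comment can be illustrated with a small Go sketch: one helper returns every card with enough free memory, the other stops at the first match and returns a single ID. The `GPUDevice` type and both helpers are hypothetical and simplified; they are not the signatures of `predicateGPU` or `predicateGPUbyMemory`. The `-1` "not found" value mirrors the `id < 0` check in the snippet above.

```go
package main

import "fmt"

// GPUDevice is a hypothetical view of one GPU card and its free memory.
type GPUDevice struct {
	ID         int
	FreeMemory uint // free GPU memory, in an arbitrary unit
}

// allFittingGPUs returns the IDs of every card that can hold the request.
func allFittingGPUs(devices []GPUDevice, request uint) []int {
	var ids []int
	for _, d := range devices {
		if d.FreeMemory >= request {
			ids = append(ids, d.ID)
		}
	}
	return ids
}

// firstFittingGPU returns the ID of the first card that can hold the
// request, or -1 if none can. A shared-GPU pod needs only this one ID.
func firstFittingGPU(devices []GPUDevice, request uint) int {
	for _, d := range devices {
		if d.FreeMemory >= request {
			return d.ID
		}
	}
	return -1
}

func main() {
	node := []GPUDevice{
		{ID: 0, FreeMemory: 512},
		{ID: 1, FreeMemory: 4096},
		{ID: 2, FreeMemory: 8192},
	}
	fmt.Println("all candidates:", allFittingGPUs(node, 1024))
	fmt.Println("single choice: ", firstFittingGPU(node, 1024))
}
```

If the memory-based predicate does return a list, a shared-GPU caller would need to use only one entry of it, for example the first.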
Support specifying GPU numbers for pod resource requests (issue #1440)
Currently, Volcano only supports specifying shared GPU memory; specifying a number of GPUs is not supported. This PR adds support for defining GPU numbers in pod resource requests. See the design doc https://github.com/peiniliu/volcano/blob/dev/docs/user-guide/how_to_use_gpu_number.md for more details.