
Configure ray pod runtime class based on custom pod specs #6199

Merged (1 commit, Jan 29, 2025)
5 changes: 4 additions & 1 deletion flyteplugins/go/tasks/plugins/k8s/ray/ray.go
Expand Up @@ -540,12 +540,15 @@ func mergeCustomPodSpec(primaryContainer *v1.Container, podSpec *v1.PodSpec, k8s
			continue
		}

		// Just handle resources for now
		if len(container.Resources.Requests) > 0 || len(container.Resources.Limits) > 0 {
			primaryContainer.Resources = container.Resources
		}
	}

+	if customPodSpec.RuntimeClassName != nil {
+		podSpec.RuntimeClassName = customPodSpec.RuntimeClassName
+	}
Comment on lines +548 to +550 (Collaborator):
Consider validating RuntimeClassName before assignment

Consider adding validation for RuntimeClassName before assignment. The value should be checked for compliance with Kubernetes naming conventions.

Code suggestion (check the AI-generated fix before applying):

Suggested change:
-	if customPodSpec.RuntimeClassName != nil {
-		podSpec.RuntimeClassName = customPodSpec.RuntimeClassName
-	}
+	if customPodSpec.RuntimeClassName != nil {
+		if validateRuntimeClassName(*customPodSpec.RuntimeClassName) {
+			podSpec.RuntimeClassName = customPodSpec.RuntimeClassName
+		}
+	}

Code Review Run #97d248


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged


	return podSpec, nil
}

58 changes: 45 additions & 13 deletions flyteplugins/go/tasks/plugins/k8s/ray/ray_test.go
Expand Up @@ -454,15 +454,17 @@ func TestBuildResourceRayCustomK8SPod(t *testing.T) {
	expectedWorkerResources, err := flytek8s.ToK8sResourceRequirements(workerResources)
	require.NoError(t, err)

-	headPodSpec := &corev1.PodSpec{
+	nvidiaRuntimeClassName := "nvidia-cdi"
+
+	headPodSpecCustomResources := &corev1.PodSpec{
Comment on lines +457 to +459 (Collaborator):
Consider moving runtime class name constant

Consider moving the nvidiaRuntimeClassName constant to a package-level constant since it appears to be a reusable configuration value.

Code suggestion (check the AI-generated fix before applying):
  	serviceAccount = "ray_sa"
 +	nvidiaRuntimeClassName = "nvidia-cdi"
  )
 @@ -457,3 +457,1 @@
 -	nvidiaRuntimeClassName := "nvidia-cdi"
 -
 -	headPodSpecCustomResources := &corev1.PodSpec{

Code Review Run #97d248


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

		Containers: []corev1.Container{
			{
				Name:      "ray-head",
				Resources: *expectedHeadResources,
			},
		},
	}
-	workerPodSpec := &corev1.PodSpec{
+	workerPodSpecCustomResources := &corev1.PodSpec{
		Containers: []corev1.Container{
			{
				Name: "ray-worker",

@@ -471,14 +473,24 @@ func TestBuildResourceRayCustomK8SPod(t *testing.T) {
			},
		},
	}

+	headPodSpecCustomRuntimeClass := &corev1.PodSpec{
+		RuntimeClassName: &nvidiaRuntimeClassName,
+	}
+	workerPodSpecCustomRuntimeClass := &corev1.PodSpec{
+		RuntimeClassName: &nvidiaRuntimeClassName,
+	}

	params := []struct {
-		name                       string
-		taskResources              *corev1.ResourceRequirements
-		headK8SPod                 *core.K8SPod
-		workerK8SPod               *core.K8SPod
-		expectedSubmitterResources *corev1.ResourceRequirements
-		expectedHeadResources      *corev1.ResourceRequirements
-		expectedWorkerResources    *corev1.ResourceRequirements
+		name                              string
+		taskResources                     *corev1.ResourceRequirements
+		headK8SPod                        *core.K8SPod
+		workerK8SPod                      *core.K8SPod
+		expectedSubmitterResources        *corev1.ResourceRequirements
+		expectedHeadResources             *corev1.ResourceRequirements
+		expectedWorkerResources           *corev1.ResourceRequirements
+		expectedSubmitterRuntimeClassName *string
+		expectedHeadRuntimeClassName      *string
+		expectedWorkerRuntimeClassName    *string
	}{
		{
			name: "task resources",

@@ -491,15 +503,30 @@ func TestBuildResourceRayCustomK8SPod(t *testing.T) {
			name:          "custom worker and head resources",
			taskResources: resourceRequirements,
			headK8SPod: &core.K8SPod{
-				PodSpec: transformStructToStructPB(t, headPodSpec),
+				PodSpec: transformStructToStructPB(t, headPodSpecCustomResources),
			},
			workerK8SPod: &core.K8SPod{
-				PodSpec: transformStructToStructPB(t, workerPodSpec),
+				PodSpec: transformStructToStructPB(t, workerPodSpecCustomResources),
			},
			expectedSubmitterResources: resourceRequirements,
			expectedHeadResources:      expectedHeadResources,
			expectedWorkerResources:    expectedWorkerResources,
		},
+		{
+			name:                       "custom runtime class name",
+			taskResources:              resourceRequirements,
+			expectedSubmitterResources: resourceRequirements,
+			expectedHeadResources:      resourceRequirements,
+			expectedWorkerResources:    resourceRequirements,
+			headK8SPod: &core.K8SPod{
+				PodSpec: transformStructToStructPB(t, headPodSpecCustomRuntimeClass),
+			},
+			workerK8SPod: &core.K8SPod{
+				PodSpec: transformStructToStructPB(t, workerPodSpecCustomRuntimeClass),
+			},
+			expectedHeadRuntimeClassName:   &nvidiaRuntimeClassName,
+			expectedWorkerRuntimeClassName: &nvidiaRuntimeClassName,
+		},
	}

	for _, p := range params {

@@ -531,18 +558,23 @@ func TestBuildResourceRayCustomK8SPod(t *testing.T) {
				&submitterPodResources,
			)

-			headPodResources := rayJob.Spec.RayClusterSpec.HeadGroupSpec.Template.Spec.Containers[0].Resources
+			headPodSpec := rayJob.Spec.RayClusterSpec.HeadGroupSpec.Template.Spec
+			headPodResources := headPodSpec.Containers[0].Resources
Comment on lines +561 to +562 (Collaborator):

Consider extracting pod spec access pattern

Consider extracting the pod spec and resources access into a helper function to avoid code duplication. Similar pattern is repeated for both head and worker pods.

Code suggestion (check the AI-generated fix before applying):
 -			headPodSpec := rayJob.Spec.RayClusterSpec.HeadGroupSpec.Template.Spec
 -			headPodResources := headPodSpec.Containers[0].Resources
 +			headPodSpec, headPodResources := getPodSpecAndResources(rayJob.Spec.RayClusterSpec.HeadGroupSpec.Template)
 @@ -571,2 +571,1 @@
 -				workerPodSpec := workerGroupSpec.Template.Spec
 -				workerPodResources := workerPodSpec.Containers[0].Resources
 +				workerPodSpec, workerPodResources := getPodSpecAndResources(workerGroupSpec.Template)

Code Review Run #97d248


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged
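The reviewer also marked this suggestion as incorrectly flagged, so the helper was never added. For illustration only, the proposed extraction could look like the sketch below; the type definitions are minimal stand-ins for the corev1 types, and `getPodSpecAndResources` is hypothetical rather than code from the test file:

```go
package main

import "fmt"

// Minimal stand-ins for the corev1 types used in the test; hypothetical,
// defined here only so the sketch is self-contained.
type ResourceRequirements struct{ CPU string }
type Container struct {
	Name      string
	Resources ResourceRequirements
}
type PodSpec struct {
	RuntimeClassName *string
	Containers       []Container
}
type PodTemplateSpec struct{ Spec PodSpec }

// Hypothetical helper from the review suggestion: return the pod spec and
// its primary container's resources in one call, so the head and worker
// assertions share a single access pattern.
func getPodSpecAndResources(tmpl PodTemplateSpec) (PodSpec, ResourceRequirements) {
	return tmpl.Spec, tmpl.Spec.Containers[0].Resources
}

func main() {
	tmpl := PodTemplateSpec{Spec: PodSpec{Containers: []Container{
		{Name: "ray-head", Resources: ResourceRequirements{CPU: "2"}},
	}}}
	spec, res := getPodSpecAndResources(tmpl)
	fmt.Println(len(spec.Containers), res.CPU) // prints: 1 2
}
```

With only two call sites, keeping the direct field access (as the PR does) is a defensible choice; the helper mainly pays off if more pod-spec assertions accumulate.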

			assert.EqualValues(t,
				p.expectedHeadResources,
				&headPodResources,
			)

+			assert.EqualValues(t, p.expectedHeadRuntimeClassName, headPodSpec.RuntimeClassName)

			for _, workerGroupSpec := range rayJob.Spec.RayClusterSpec.WorkerGroupSpecs {
-				workerPodResources := workerGroupSpec.Template.Spec.Containers[0].Resources
+				workerPodSpec := workerGroupSpec.Template.Spec
+				workerPodResources := workerPodSpec.Containers[0].Resources
				assert.EqualValues(t,
					p.expectedWorkerResources,
					&workerPodResources,
				)
+				assert.EqualValues(t, p.expectedWorkerRuntimeClassName, workerPodSpec.RuntimeClassName)
			}
		})
	}