Adds shared memory into extended resources #6193

Open: wants to merge 2 commits into base: master

thomasjpfan (Member) commented Jan 25, 2025

Tracking issue

Towards #6142

Why are the changes needed?

This PR adds shared memory as an extended resource, made available through @task(shared_memory=...). In the simple case, @task(shared_memory=True) means: "memory-backed volumes are sized to node allocatable memory". Otherwise, set shared_memory="2Gi" to specify the size explicitly.
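The two forms of the argument can be pictured with a small sketch. It is only an illustration: `SharedMemoryConfig`, `to_shared_memory`, and the mount name `flyte-shared-memory` are hypothetical names chosen here, and the `/dev/shm` default is assumed from the test task below rather than taken from the actual flytekit implementation.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class SharedMemoryConfig:
    """Hypothetical stand-in for the SharedMemory IDL message."""

    mount_path: str
    mount_name: str
    size_limit: Optional[str] = None  # None means no explicit size limit


def to_shared_memory(value: Union[bool, str, None]) -> Optional[SharedMemoryConfig]:
    """Translate a task-level shared_memory argument into a config, or None if disabled."""
    if value is None or value is False:
        return None
    if value is True:
        # Simple case: no explicit limit is recorded.
        return SharedMemoryConfig("/dev/shm", "flyte-shared-memory")
    if isinstance(value, str) and value:
        # Explicit case: keep the quantity string (for example "2Gi") as given.
        return SharedMemoryConfig("/dev/shm", "flyte-shared-memory", size_limit=value)
    raise ValueError(f"shared_memory must be True or a size string, got {value!r}")
```

Under these assumptions, `to_shared_memory(True)` carries no size limit, while `to_shared_memory("2Gi")` carries the string `"2Gi"` through unchanged.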

What changes were proposed in this pull request?

This PR adds shared_memory to the IDL and the implementation to get it to work.

How was this patch tested?

Unit tests were added to this PR and tested with flytekit changes:

import os
from flytekit import task, ImageSpec

image = ImageSpec(
    name="flytekit",
    apt_packages=["git"],
    registry="localhost:30000",
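    commands=[
        # Each install is a separate list entry; without the comma, Python
        # would concatenate the two adjacent string literals into one command.
        "uv pip install git+https://github.com/thomasjpfan/flyte.git@65dda339b0088d9e568877577fa78fc88b223582#subdirectory=flytekit",
        "uv pip install git+https://github.com/thomasjpfan/flyte.git@d2c76ff330077875f7826c278f660add7f2c50a9#subdirectory=flyteidl",
    ],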
)


@task(container_image=image, shared_memory=True)
def check_shm2() -> bool:
    return os.path.exists("/dev/shm")

Then I used kubectl to make sure that the pod spec was correct.
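A check of this kind could be scripted roughly as follows. This is a sketch, not the procedure used for this PR: it assumes the pod was dumped with something like `kubectl get pod <pod-name> -o json`, and the helper name `find_memory_mounts` and the example pod are made up for illustration. It relies only on the standard Kubernetes pod fields `spec.volumes[].emptyDir.medium` and `spec.containers[].volumeMounts[]`.

```python
from typing import Any, Dict, List


def find_memory_mounts(pod: Dict[str, Any], mount_path: str = "/dev/shm") -> List[str]:
    """Return names of containers that mount a memory-backed emptyDir at mount_path."""
    spec = pod.get("spec", {})
    # Collect the names of all emptyDir volumes backed by memory.
    memory_volumes = {
        volume["name"]
        for volume in spec.get("volumes", [])
        if (volume.get("emptyDir") or {}).get("medium") == "Memory"
    }
    matches = []
    for container in spec.get("containers", []):
        for mount in container.get("volumeMounts", []):
            if mount.get("name") in memory_volumes and mount.get("mountPath") == mount_path:
                matches.append(container["name"])
    return matches


# Hypothetical pod, shaped like `kubectl get pod ... -o json` output.
example_pod = {
    "spec": {
        "volumes": [{"name": "shm", "emptyDir": {"medium": "Memory"}}],
        "containers": [
            {"name": "primary", "volumeMounts": [{"name": "shm", "mountPath": "/dev/shm"}]}
        ],
    }
}
print(find_memory_mounts(example_pod))
```

With real output, the dictionary would come from `json.load` on the saved kubectl JSON instead of the inline example.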

Summary by Bito

This PR implements shared memory support in Flyte's extended resources by introducing a new SharedMemory message type in the IDL with configurable mount options. The implementation includes protobuf definitions and pod helper functionality for Kubernetes pods, utilizing getter methods for shared volume mount properties. The changes enable tasks to request and configure memory-backed volumes with either default node allocatable memory or custom size specifications across multiple language bindings including Go, JavaScript/TypeScript, and Python.

Unit tests added: True

Estimated effort to review (1-5, lower is better): 5

@thomasjpfan added the "added" label (Merged changes that add new functionality) Jan 25, 2025
flyte-bot (Collaborator) commented Jan 25, 2025

Code Review Agent Run #6a4c77

Actionable Suggestions - 6
  • flyteplugins/go/tasks/pluginmachinery/flytek8s/pod_helper_test.go - 2
  • flyteidl/protos/flyteidl/core/tasks.proto - 1
    • Consider using resource.Quantity for size_limit · Line 73-73
  • flyteplugins/go/tasks/pluginmachinery/flytek8s/pod_helper.go - 3
    • Consider validating SharedMemory fields · Line 494-495
    • Consider validating primaryContainerName before use · Line 494-497
    • Consider explicit error handling for ApplySharedMemory · Line 495-495
Additional Suggestions - 3
  • flyteidl/gen/pb_python/flyteidl/core/tasks_pb2.pyi - 1
    • Consider adding type hints for field · Line 70-71
  • flyteplugins/go/tasks/pluginmachinery/flytek8s/pod_helper.go - 2
    • Consider using camelCase for variable names · Line 166-166
    • Consider renaming parameter to follow conventions · Line 140-140
Review Details
  • Files reviewed - 10 · Commit Range: d2c76ff..d2c76ff
    • flyteidl/gen/pb-es/flyteidl/core/tasks_pb.ts
    • flyteidl/gen/pb-go/flyteidl/core/tasks.pb.go
    • flyteidl/gen/pb-js/flyteidl.d.ts
    • flyteidl/gen/pb-js/flyteidl.js
    • flyteidl/gen/pb_python/flyteidl/core/tasks_pb2.py
    • flyteidl/gen/pb_python/flyteidl/core/tasks_pb2.pyi
    • flyteidl/gen/pb_rust/flyteidl.core.rs
    • flyteidl/protos/flyteidl/core/tasks.proto
    • flyteplugins/go/tasks/pluginmachinery/flytek8s/pod_helper.go
    • flyteplugins/go/tasks/pluginmachinery/flytek8s/pod_helper_test.go
  • Files skipped - 4
    • flyteidl/clients/go/assets/admin.swagger.json - Reason: Filter setting
    • flyteidl/gen/pb-go/gateway/flyteidl/service/admin.swagger.json - Reason: Filter setting
    • flyteidl/gen/pb-go/gateway/flyteidl/service/agent.swagger.json - Reason: Filter setting
    • flyteidl/gen/pb-go/gateway/flyteidl/service/external_plugin_service.swagger.json - Reason: Filter setting
  • Tools
    • Golangci-lint (Linter) - ✖︎ Failed
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful

AI Code Review powered by Bito

codecov bot commented Jan 25, 2025

Codecov Report

Attention: Patch coverage is 32.07547% with 108 lines in your changes missing coverage. Please review.

Project coverage is 37.08%. Comparing base (45ce4c0) to head (04206f8).

Files with missing lines | Patch % | Lines
flyteidl/gen/pb-go/flyteidl/core/tasks.pb.go | 2.83% | 103 Missing ⚠️
...ns/go/tasks/pluginmachinery/flytek8s/pod_helper.go | 90.56% | 4 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##           master    #6193    +/-   ##
========================================
  Coverage   37.08%   37.08%            
========================================
  Files        1318     1318            
  Lines      132707   132811   +104     
========================================
+ Hits        49208    49256    +48     
- Misses      79244    79299    +55     
- Partials     4255     4256     +1     
Flag | Coverage Δ
unittests-datacatalog | 51.58% <ø> (ø)
unittests-flyteadmin | 54.34% <ø> (ø)
unittests-flytecopilot | 30.99% <ø> (ø)
unittests-flytectl | 62.29% <ø> (ø)
unittests-flyteidl | 7.22% <2.83%> (-0.01%) ⬇️
unittests-flyteplugins | 53.97% <90.56%> (+0.10%) ⬆️
unittests-flytepropeller | 42.73% <ø> (ø)
unittests-flytestdlib | 55.35% <ø> (ø)

Signed-off-by: Thomas J. Fan <[email protected]>
flyte-bot (Collaborator) commented Jan 25, 2025

Changelist by Bito

This pull request implements the following key changes.

Key Change: New Feature - Shared Memory Support in Extended Resources

Files Impacted:
  • tasks_pb.ts - Added SharedMemory class with mount path, name and size limit configuration
  • tasks.pb.go - Implemented SharedMemory protobuf message type and updated ExtendedResources to include shared memory support
  • flyteidl.d.ts - Added TypeScript definitions for SharedMemory configuration and ExtendedResources integration
  • flyteidl.js - Implemented JavaScript bindings for SharedMemory and ExtendedResources functionality
  • tasks_pb2.py - Updated Python protobuf descriptors to include SharedMemory support
  • tasks_pb2.pyi - Added Python type hints for SharedMemory configuration, including the shared memory field and initialization parameters
  • flyteidl.core.rs - Implemented SharedMemory struct and added to ExtendedResources
  • tasks.proto - Defined SharedMemory message type with mount path, name and size limit fields
  • pod_helper.go - Added shared memory volume configuration logic for Kubernetes pods
  • pod_helper_test.go - Added comprehensive test coverage for shared memory functionality
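The pod-side change can be pictured as adding a memory-backed emptyDir volume and mounting it in the primary container. The following sketch models that idea on a plain dictionary. It is not the Go code in pod_helper.go: the function name `apply_shared_memory`, its signature, and the error conditions other than the "already mounted" case quoted in the tests are assumptions made for illustration.

```python
from typing import Any, Dict, Optional


def apply_shared_memory(
    pod_spec: Dict[str, Any],
    primary_container_name: str,
    mount_path: str,
    mount_name: str,
    size_limit: Optional[str] = None,
) -> None:
    """Add a memory-backed emptyDir volume and mount it in the primary container."""
    containers = pod_spec.get("containers", [])
    primary = next((c for c in containers if c.get("name") == primary_container_name), None)
    if primary is None:
        raise ValueError(f"unable to find primary container {primary_container_name!r}")

    # Refuse to clobber an existing volume or an existing mount at the same path.
    if any(v.get("name") == mount_name for v in pod_spec.get("volumes", [])):
        raise ValueError(f"a volume named {mount_name!r} already exists in pod spec")
    if any(m.get("mountPath") == mount_path for m in primary.get("volumeMounts", [])):
        raise ValueError(f"{mount_path} is already mounted in container")

    empty_dir: Dict[str, Any] = {"medium": "Memory"}
    if size_limit:
        # Only set sizeLimit when one was requested.
        empty_dir["sizeLimit"] = size_limit

    pod_spec.setdefault("volumes", []).append({"name": mount_name, "emptyDir": empty_dir})
    primary.setdefault("volumeMounts", []).append({"name": mount_name, "mountPath": mount_path})
```

Checking for conflicts before mutating anything keeps the pod spec unchanged when an error is raised, which makes the failure cases easy to test.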

errorMsg: "/dev/shm is already mounted in container",
},
{
name: "Mount path already in container",

Consider more descriptive test case name

The test case name Mount path already in container appears to be duplicated at line 2483. Consider using a more descriptive name that reflects the actual test scenario of invalid size limit parsing.

Code suggestion
Check the AI-generated fix before applying
Suggested change
- name: "Mount path already in container",
+ name: "Invalid size limit format",

Code Review Run #6a4c77


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

Comment on lines +2552 to +2556
var quantity resource.Quantity
if test.sharedVolume.GetSizeLimit() != "" {
quantity, err = resource.ParseQuantity(test.sharedVolume.GetSizeLimit())
assert.NoError(t, err)
}

Consider initializing quantity variable before use

Consider initializing quantity to a zero value before the conditional block. Currently if GetSizeLimit() returns an empty string, quantity remains uninitialized when used in the assertion.

Code suggestion
Check the AI-generated fix before applying
Suggested change
- var quantity resource.Quantity
- if test.sharedVolume.GetSizeLimit() != "" {
-     quantity, err = resource.ParseQuantity(test.sharedVolume.GetSizeLimit())
-     assert.NoError(t, err)
- }
+ var quantity resource.Quantity
+ quantity = resource.Quantity{}
+ if test.sharedVolume.GetSizeLimit() != "" {
+     quantity, err = resource.ParseQuantity(test.sharedVolume.GetSizeLimit())
+     assert.NoError(t, err)
+ }

Code Review Run #6a4c77


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

// Size limit for shared memory. If not set, then the shared memory is equal
// to the allocated memory.
// +optional
string size_limit = 3;

Consider using resource.Quantity for size_limit

Consider using a more specific type like k8s.io/apimachinery/pkg/api/resource.Quantity for size_limit instead of string to ensure proper validation of memory size values.

Code suggestion
Check the AI-generated fix before applying
Suggested change
- string size_limit = 3;
+ k8s.io.apimachinery.pkg.api.resource.Quantity size_limit = 3;

Code Review Run #6a4c77


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged
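For context on what a string-typed size_limit such as "2Gi" looks like when validated, here is a small sketch. It handles only a subset of Kubernetes quantity syntax (whole numbers with an optional binary suffix), and the helper name `parse_size_limit` is hypothetical. The Go test quoted above uses `resource.ParseQuantity`, which accepts a much wider grammar.

```python
import re

# Binary suffixes only; real Kubernetes quantities also allow decimal
# suffixes (k, M, G, ...), fractions, and exponents, which this sketch rejects.
_BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}
_PATTERN = re.compile(r"(\d+)(Ki|Mi|Gi|Ti)?")


def parse_size_limit(text: str) -> int:
    """Return the size in bytes for strings like '2Gi', '512Mi', or '1024'."""
    match = _PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"unsupported size limit: {text!r}")
    number, suffix = match.groups()
    # A missing suffix means the number is already in bytes.
    return int(number) * _BINARY_SUFFIXES.get(suffix, 1)


print(parse_size_limit("2Gi"))  # 2147483648
```

Keeping the field as a string in the IDL leaves this kind of parsing to the consumer, so invalid values surface only where the string is parsed.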

Comment on lines +494 to +495
if extendedResources.GetSharedMemory() != nil {
err = ApplySharedMemory(podSpec, primaryContainerName, extendedResources.GetSharedMemory())

Consider validating SharedMemory fields

Consider adding validation for SharedMemory fields before applying the override. The MountPath and MountName fields should be validated to ensure they contain valid values.

Code suggestion
Check the AI-generated fix before applying
Suggested change
- if extendedResources.GetSharedMemory() != nil {
-     err = ApplySharedMemory(podSpec, primaryContainerName, extendedResources.GetSharedMemory())
+ if extendedResources.GetSharedMemory() != nil {
+     shm := extendedResources.GetSharedMemory()
+     if shm.MountPath == "" || shm.MountName == "" {
+         return nil, nil, fmt.Errorf("shared memory mount path and name must be specified")
+     }
+     if !strings.HasPrefix(shm.MountPath, "/") {
+         return nil, nil, fmt.Errorf("shared memory mount path must be absolute")
+     }
+     err = ApplySharedMemory(podSpec, primaryContainerName, extendedResources.GetSharedMemory())

Code Review Run #6a4c77


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

Comment on lines +494 to +497
if extendedResources.GetSharedMemory() != nil {
err = ApplySharedMemory(podSpec, primaryContainerName, extendedResources.GetSharedMemory())
if err != nil {
return nil, nil, err

Consider validating primaryContainerName before use

Consider checking if primaryContainerName is empty before using it in ApplySharedMemory(). An empty container name could cause issues with shared memory configuration.

Code suggestion
Check the AI-generated fix before applying
Suggested change
- if extendedResources.GetSharedMemory() != nil {
-     err = ApplySharedMemory(podSpec, primaryContainerName, extendedResources.GetSharedMemory())
-     if err != nil {
-         return nil, nil, err
+ if extendedResources.GetSharedMemory() != nil {
+     if primaryContainerName == "" {
+         return nil, nil, fmt.Errorf("primary container name cannot be empty when configuring shared memory")
+     }
+     err = ApplySharedMemory(podSpec, primaryContainerName, extendedResources.GetSharedMemory())
+     if err != nil {
+         return nil, nil, err

Code Review Run #6a4c77


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

@@ -429,6 +490,14 @@
ApplyGPUNodeSelectors(podSpec, extendedResources.GetGpuAccelerator())
}

// Shared memory volume
if extendedResources.GetSharedMemory() != nil {
err = ApplySharedMemory(podSpec, primaryContainerName, extendedResources.GetSharedMemory())

Consider explicit error handling for ApplySharedMemory

Consider checking the return value from ApplySharedMemory() before proceeding. The error handling could be more explicit.

Code suggestion
Check the AI-generated fix before applying
Suggested change
- err = ApplySharedMemory(podSpec, primaryContainerName, extendedResources.GetSharedMemory())
+ if err := ApplySharedMemory(podSpec, primaryContainerName, extendedResources.GetSharedMemory()); err != nil {
+     return nil, nil, fmt.Errorf("failed to apply shared memory: %w", err)
+ }

Code Review Run #6a4c77


Is this a valid issue, or was it incorrectly flagged by the Agent?

  • it was incorrectly flagged

flyte-bot (Collaborator) commented Jan 25, 2025

Code Review Agent Run #5bb8be

Actionable Suggestions - 0
Review Details
  • Files reviewed - 1 · Commit Range: d2c76ff..04206f8
    • flyteplugins/go/tasks/pluginmachinery/flytek8s/pod_helper_test.go
  • Files skipped - 0
  • Tools
    • Golangci-lint (Linter) - ✖︎ Failed
    • Whispers (Secret Scanner) - ✔︎ Successful
    • Detect-secrets (Secret Scanner) - ✔︎ Successful
