Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MVP] add resourcequota plugin in scheduler-estimator: add resourcequota plugin #4566

Merged
merged 1 commit into from
Jan 28, 2024
Merged

[MVP] add resourcequota plugin in scheduler-estimator: add resourcequota plugin #4566

merged 1 commit into from
Jan 28, 2024

Conversation

wengyao04
Copy link
Contributor

@wengyao04 wengyao04 commented Jan 19, 2024

What type of PR is this?
This is the followup PR of #4534

The resourceQuotaEstimator plugin will include

  • Add resourceQuotaEstimator plugin
    • It is an alpha feature, default to false in the featureGates
    • The resourceQuotaEstimator only supports
      • compute resources (cpu/memory), extended resources (like gpu) and storage resource (ephemeral-storage)
      • priorityClass scope selector
    • add two optional fields Namespace and PriorityClassName in pkg/estimator/pb/types.go
          // ReplicaRequirements represents the requirements required by each replica.
      type ReplicaRequirements struct {
          // NodeClaim represents the NodeAffinity, NodeSelector and Tolerations required by each replica.
          // +optional
          NodeClaim *NodeClaim `json:"nodeClaim,omitempty" protobuf:"bytes,1,opt,name=nodeClaim"`
          // ResourceRequest represents the resources required by each replica.
          // +optional
          ResourceRequest corev1.ResourceList `json:"resourceRequest,omitempty" protobuf:"bytes,2,rep,name=resourceRequest,casttype=k8s.io/api/core/v1.ResourceList,castkey=k8s.io/api/core/v1.ResourceName"`
          // +optional
          Namespace string `json:"namespace,omitempty" protobuf:"bytes,3,opt,name=namespace"`
          // +optional
          PriorityClassName string `json:"priorityClassName,omitempty" protobuf:"bytes,4,opt,name=priorityClassName"`
      }
      
    • add two optional fields Namespace and PriorityClassName in pkg/apis/work/v1alpha2/binding_types.go
          // ReplicaRequirements represents the requirements required by each replica.
      type ReplicaRequirements struct {
          // NodeClaim represents the node claim HardNodeAffinity, NodeSelector and Tolerations required by each replica.
          // +optional
          NodeClaim *NodeClaim `json:"nodeClaim,omitempty"`
      
          // ResourceRequest represents the resources required by each replica.
          // +optional
          ResourceRequest corev1.ResourceList `json:"resourceRequest,omitempty"`
      
          // Namespace represents the resources namespaces
          // +optional
          Namespace string `json:"namespace,omitempty"`
      
          // PriorityClassName represents the resources priorityClassName
          // +optional
          PriorityClassName string `json:"priorityClassName,omitempty"`
      }
      

We discuss if we need to add duplicate namespace filed workv1alpha2.ReplicaRequirements. I add it because we also need that in the resource interpreter https://github.com/karmada-io/karmada/blob/master/pkg/resourceinterpreter/interpreter.go#L46-L47
if we don't include the namespace in workv1alpha2.ReplicaRequirements, I have to make this interface change to return namespace.

// GetReplicas returns the desired replicas of the object as well as the requirements of each replica.
GetReplicas(object *unstructured.Unstructured) (replica int32, replicaRequires *workv1alpha2.ReplicaRequirements, err error)

What this PR does / why we need it:

Which issue(s) this PR fixes:
Fixes #
Fixes ##4369

Special notes for your reviewer:
@RainbowMango, @Garrybest, @chaosi-zju could you help check ? Thx

Does this PR introduce a user-facing change?:


@karmada-bot karmada-bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Jan 19, 2024
@codecov-commenter
Copy link

codecov-commenter commented Jan 19, 2024

Codecov Report

Attention: 49 lines in your changes are missing coverage. Please review.

Comparison is base (3cf27cc) 51.71% compared to head (e0b3e83) 51.94%.

Files Patch % Lines
...r/framework/plugins/resourcequota/resourcequota.go 79.42% 34 Missing and 9 partials ⚠️
pkg/util/helper/binding.go 25.00% 5 Missing and 1 partial ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4566      +/-   ##
==========================================
+ Coverage   51.71%   51.94%   +0.22%     
==========================================
  Files         247      248       +1     
  Lines       24419    24634     +215     
==========================================
+ Hits        12629    12795     +166     
- Misses      11103    11142      +39     
- Partials      687      697      +10     
Flag Coverage Δ
unittests 51.94% <77.41%> (+0.22%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@Garrybest Garrybest left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, I am not so familiar with resource quota, kindly invite @RainbowMango and @XiShanYongYe-Chang to take a look as well.


// Namespace represents the resources namespaces
// +optional
Namespace string `json:"namespace,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicated with ObjectReference.Namespace, why not share? Did I miss something?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is because the ReplicaEstimator interface shares the same ReplicaRequirements struct with ResourceBinding:

MaxAvailableReplicas(ctx context.Context, clusters []*clusterv1alpha1.Cluster, replicaRequirements *workv1alpha2.ReplicaRequirements) ([]workv1alpha2.TargetCluster, error)

I also noticed this issue at #4534 (comment).

Another reason just as mentioned in PR description:

We discuss if we need to add duplicate namespace filed workv1alpha2.ReplicaRequirements. I add it because we also need that in the resource interpreter https://github.com/karmada-io/karmada/blob/master/pkg/resourceinterpreter/interpreter.go#L46-L47
if we don't include the namespace in workv1alpha2.ReplicaRequirements, I have to make this interface change to return namespace.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This issue is now tracked by #4578.

NodeClaim: nodeClaim,
ResourceRequest: resourceRequest,
Namespace: podTemplate.Namespace,
PriorityClassName: podTemplate.Spec.PriorityClassName,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember priority class name can be left empty in pod template and K8s pod admission hook will set the default priority class name for pod. So this name could be empty.

Do we handle this default value in the plugin?

Copy link
Contributor Author

@wengyao04 wengyao04 Jan 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this MVP, we probably not, otherwise I need to listWatch PriorityClass in the karmada controller. Besides that, I think client should explicitly specify the priorityClass because the default global priorityClass might be different among managed clusters

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

priorityClassName will be empty

Then the plugin will be skipped for those without priorityClassName, right?
That's kind of a limitation, we can do that for this MVP.

Copy link
Contributor Author

@wengyao04 wengyao04 Jan 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides that, I think client should explicitly specify the priorityClass because the default global priorityClass might be different among managed clusters

But we can't determine the action of users. This field is filled by users.

You could try to watch PriorityClass in estimator instead. If we have a request of empty PriorityClass field, the estimator tries to get default value of its cluster and run this plugin.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Garrybest I think client should explicitly specify the priorityClass because the default global priorityClass might be different among managed clusters.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you mean that the administrator might set different default priority classes for each cluster, such as by PriorityClass.

So, the default priority in Karmada might not be accurate.

pkg/estimator/pb/types.go Outdated Show resolved Hide resolved
Comment on lines +26 to +28
registry := runtime.Registry{
resourcequota.Name: resourcequota.New,
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
registry := runtime.Registry{
resourcequota.Name: resourcequota.New,
}
registry := runtime.Registry{}
if features.FeatureGate.Enabled(features.ResourceQuotaEstimate) {
registry.Register(resourcequota.Name, resourcequota.New) // TODO: we might need to deal with the unhandled error
}

Is it better? So that we can skip the effort to check the feature gate in multiple places when implementing the plugin.
In addition, a feature gate will be removed eventually once it moves to a stable, we don't need to touch the plugin logic when doing so.

Copy link
Contributor Author

@wengyao04 wengyao04 Jan 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of controlling in the registry, do you think if we can have feature-gate check in each plugin.New function like https://github.com/karmada-io/karmada/pull/4566/files#diff-e489afa02e13c3ea92d007b4a8575045b8695f34cd31539769f2e28c3c0cb322R72-R77 ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm saying that just because I see there are two places we need to check the feature-gate in the plugin. :)
I'm not insisting on it, both will work. No big deal.

@wengyao04
Copy link
Contributor Author

I create an umbrella issue #4578

Copy link
Member

@RainbowMango RainbowMango left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks good to me.

@@ -399,6 +399,10 @@ func GenerateReplicaRequirements(podTemplate *corev1.PodTemplateSpec) *workv1alp
return &workv1alpha2.ReplicaRequirements{
NodeClaim: nodeClaim,
ResourceRequest: resourceRequest,
Namespace: podTemplate.Namespace,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we add a feature gate check here? The two fields are unnecessary for users that do not enable this feature?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, we are sure that we have the opportunity to set the PriorityClassName in the if nodeClaim != nil || resourceRequest != nil, because the resourceRequest can not be nil, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, I am not quite sure if I understand this question correct, I think we handle the nil replicaRequirement in estimator here

The default pb.ReplicaRequirements is empty, it will be populated only if ReplicaRequirements is not nil

	req := &pb.MaxAvailableReplicasRequest{
		Cluster:             cluster,
		ReplicaRequirements: pb.ReplicaRequirements{},
	}
	if replicaRequirements != nil {
		req.ReplicaRequirements.ResourceRequest = replicaRequirements.ResourceRequest
		if replicaRequirements.NodeClaim != nil {
			req.ReplicaRequirements.NodeClaim = &pb.NodeClaim{
				NodeAffinity: replicaRequirements.NodeClaim.HardNodeAffinity,
				NodeSelector: replicaRequirements.NodeClaim.NodeSelector,
				Tolerations:  replicaRequirements.NodeClaim.Tolerations,
			}
		}
	}

If the pb.ReplicaRequirements is empty, which means empty ResourceList, the current logic still calculate the replica

func (es *AccurateSchedulerEstimatorServer) nodeMaxAvailableReplica(node *framework.NodeInfo, rl corev1.ResourceList) int32 {
	rest := node.Allocatable.Clone().SubResource(node.Requested)
	// The number of pods in a node is a kind of resource in node allocatable resources.
	// However, total requested resources of all pods on this node, i.e. `node.Requested`,
	// do not contain pod resources. So after subtraction, we should cope with allowed pod
	// number manually which is the upper bound of this node available replicas.
	rest.AllowedPodNumber = util.MaxInt64(rest.AllowedPodNumber-int64(len(node.Pods)), 0)
	return int32(rest.MaxDivided(rl))
}

@Garrybest
Copy link
Member

/lgtm

@karmada-bot karmada-bot added lgtm Indicates that a PR is ready to be merged. and removed lgtm Indicates that a PR is ready to be merged. labels Jan 26, 2024
@RainbowMango
Copy link
Member

Just helped rebase the code.
Looks good to me now.

Copy link
Member

@RainbowMango RainbowMango left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@karmada-bot karmada-bot added the lgtm Indicates that a PR is ready to be merged. label Jan 28, 2024
@karmada-bot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: RainbowMango

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@karmada-bot karmada-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 28, 2024
@karmada-bot karmada-bot merged commit fa4d6d3 into karmada-io:master Jan 28, 2024
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants