-
Notifications
You must be signed in to change notification settings - Fork 919
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MVP] add resourcequota plugin in scheduler-estimator: add resourcequota plugin #4566
[MVP] add resourcequota plugin in scheduler-estimator: add resourcequota plugin #4566
Conversation
Codecov ReportAttention:
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## master #4566 +/- ##
==========================================
+ Coverage 51.71% 51.94% +0.22%
==========================================
Files 247 248 +1
Lines 24419 24634 +215
==========================================
+ Hits 12629 12795 +166
- Misses 11103 11142 +39
- Partials 687 697 +10
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, I am not so familiar with resource quota, kindly invite @RainbowMango and @XiShanYongYe-Chang to take a look as well.
|
||
// Namespace represents the resources namespaces | ||
// +optional | ||
Namespace string `json:"namespace,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicated with ObjectReference.Namespace
, why not share? Did I miss something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is because the ReplicaEstimator interface shares the same ReplicaRequirements
struct with ResourceBinding:
karmada/pkg/estimator/client/interface.go
Line 39 in f054313
MaxAvailableReplicas(ctx context.Context, clusters []*clusterv1alpha1.Cluster, replicaRequirements *workv1alpha2.ReplicaRequirements) ([]workv1alpha2.TargetCluster, error) |
I also noticed this issue at #4534 (comment).
Another reason just as mentioned in PR description:
We discuss if we need to add duplicate namespace filed workv1alpha2.ReplicaRequirements. I add it because we also need that in the resource interpreter https://github.com/karmada-io/karmada/blob/master/pkg/resourceinterpreter/interpreter.go#L46-L47
if we don't include the namespace in workv1alpha2.ReplicaRequirements, I have to make this interface change to return namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This issue is now tracked by #4578.
pkg/util/helper/binding.go
Outdated
NodeClaim: nodeClaim, | ||
ResourceRequest: resourceRequest, | ||
Namespace: podTemplate.Namespace, | ||
PriorityClassName: podTemplate.Spec.PriorityClassName, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember priority class name can be left empty in pod template and K8s pod admission hook will set the default priority class name for pod. So this name could be empty.
Do we handle this default value in the plugin?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this MVP, we probably not, otherwise I need to listWatch PriorityClass in the karmada controller. Besides that, I think client should explicitly specify the priorityClass because the default global priorityClass might be different among managed clusters
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
priorityClassName will be empty
Then the plugin will be skipped for those without priorityClassName, right?
That's kind of a limitation, we can do that for this MVP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be handled by the matchScope
function https://github.com/karmada-io/karmada/pull/4566/files#diff-e489afa02e13c3ea92d007b4a8575045b8695f34cd31539769f2e28c3c0cb322R240-R261
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides that, I think client should explicitly specify the priorityClass because the default global priorityClass might be different among managed clusters
But we can't determine the action of users. This field is filled by users.
You could try to watch PriorityClass in estimator instead. If we have a request of empty PriorityClass field, the estimator tries to get default value of its cluster and run this plugin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Garrybest I think client should explicitly specify the priorityClass because the default global priorityClass might be different among managed clusters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you mean that the administrator might set different default priority classes for each cluster, such as by PriorityClass.
So, the default priority in Karmada might not be accurate.
pkg/estimator/server/framework/plugins/resourcequota/resourcequota.go
Outdated
Show resolved
Hide resolved
pkg/estimator/server/framework/plugins/resourcequota/resourcequota.go
Outdated
Show resolved
Hide resolved
pkg/estimator/server/framework/plugins/resourcequota/resourcequota.go
Outdated
Show resolved
Hide resolved
registry := runtime.Registry{ | ||
resourcequota.Name: resourcequota.New, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
registry := runtime.Registry{ | |
resourcequota.Name: resourcequota.New, | |
} | |
registry := runtime.Registry{} | |
if features.FeatureGate.Enabled(features.ResourceQuotaEstimate) { | |
registry.Register(resourcequota.Name, resourcequota.New) // TODO: we might need to deal with the unhandled error | |
} |
Is it better? So that we can skip the effort to check the feature gate in multiple places when implementing the plugin.
In addition, a feature gate will be removed eventually once it moves to a stable, we don't need to touch the plugin logic when doing so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of controlling in the registry, do you think if we can have feature-gate check in each plugin.New function like https://github.com/karmada-io/karmada/pull/4566/files#diff-e489afa02e13c3ea92d007b4a8575045b8695f34cd31539769f2e28c3c0cb322R72-R77 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm saying that just because I see there are two places we need to check the feature-gate in the plugin. :)
I'm not insisting on it, both will work. No big deal.
pkg/estimator/server/framework/plugins/resourcequota/resourcequota.go
Outdated
Show resolved
Hide resolved
I create an umbrella issue #4578 |
pkg/estimator/server/framework/plugins/resourcequota/resourcequota.go
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks good to me.
pkg/util/helper/binding.go
Outdated
@@ -399,6 +399,10 @@ func GenerateReplicaRequirements(podTemplate *corev1.PodTemplateSpec) *workv1alp | |||
return &workv1alpha2.ReplicaRequirements{ | |||
NodeClaim: nodeClaim, | |||
ResourceRequest: resourceRequest, | |||
Namespace: podTemplate.Namespace, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we add a feature gate check here? The two fields are unnecessary for users that do not enable this feature?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, we are sure that we have the opportunity to set the PriorityClassName
in the if nodeClaim != nil || resourceRequest != nil
, because the resourceRequest
can not be nil, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, I am not quite sure if I understand this question correct, I think we handle the nil replicaRequirement in estimator here
The default pb.ReplicaRequirements
is empty, it will be populated only if ReplicaRequirements is not nil
req := &pb.MaxAvailableReplicasRequest{
Cluster: cluster,
ReplicaRequirements: pb.ReplicaRequirements{},
}
if replicaRequirements != nil {
req.ReplicaRequirements.ResourceRequest = replicaRequirements.ResourceRequest
if replicaRequirements.NodeClaim != nil {
req.ReplicaRequirements.NodeClaim = &pb.NodeClaim{
NodeAffinity: replicaRequirements.NodeClaim.HardNodeAffinity,
NodeSelector: replicaRequirements.NodeClaim.NodeSelector,
Tolerations: replicaRequirements.NodeClaim.Tolerations,
}
}
}
If the pb.ReplicaRequirements is empty, which means empty ResourceList, the current logic still calculate the replica
func (es *AccurateSchedulerEstimatorServer) nodeMaxAvailableReplica(node *framework.NodeInfo, rl corev1.ResourceList) int32 {
rest := node.Allocatable.Clone().SubResource(node.Requested)
// The number of pods in a node is a kind of resource in node allocatable resources.
// However, total requested resources of all pods on this node, i.e. `node.Requested`,
// do not contain pod resources. So after subtraction, we should cope with allowed pod
// number manually which is the upper bound of this node available replicas.
rest.AllowedPodNumber = util.MaxInt64(rest.AllowedPodNumber-int64(len(node.Pods)), 0)
return int32(rest.MaxDivided(rl))
}
/lgtm |
…ota plugin Signed-off-by: yweng14 <[email protected]>
Just helped rebase the code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: RainbowMango The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What type of PR is this?
This is the followup PR of #4534
The resourceQuotaEstimator plugin will include
Namespace
andPriorityClassName
inpkg/estimator/pb/types.go
Namespace
andPriorityClassName
inpkg/apis/work/v1alpha2/binding_types.go
We discuss if we need to add duplicate
namespace
filedworkv1alpha2.ReplicaRequirements
. I add it because we also need that in the resource interpreter https://github.com/karmada-io/karmada/blob/master/pkg/resourceinterpreter/interpreter.go#L46-L47if we don't include the namespace in
workv1alpha2.ReplicaRequirements
, I have to make this interface change to return namespace.What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #
Fixes ##4369
Special notes for your reviewer:
@RainbowMango, @Garrybest, @chaosi-zju could you help check ? Thx
Does this PR introduce a user-facing change?: