CA: prepare for DRA integration #7529
Conversation
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: towca. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
Force-pushed from 7723f8e to ed63ac6.
/hold
Force-pushed from ed63ac6 to 705cb01.
Force-pushed from 705cb01 to 2ecfad9.
This flag will be used to guard any behavior-changing logic needed for DRA, to make it clear that existing behavior for non-DRA use-cases is preserved.
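For illustration, a minimal sketch of what such a flag guard could look like; the flag name and wiring below are assumptions for the example, not the actual CA option names:

// Hypothetical sketch only: the real Cluster Autoscaler flag/option names may differ.
package main

import (
    "flag"
    "fmt"
)

// enableDRA guards every DRA-specific, behavior-changing code path, so that the
// pre-existing behavior is untouched when the flag is left at its default (false).
var enableDRA = flag.Bool("enable-dynamic-resource-allocation", false,
    "Whether DRA objects are taken into account during scheduling simulations.")

func main() {
    flag.Parse()
    if *enableDRA {
        fmt.Println("running DRA-aware logic")
        return
    }
    fmt.Println("existing non-DRA behavior preserved")
}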
Force-pushed from 2ecfad9 to de210cb.
Generally LGTM, with some naming/clarification/typo comments and some reservations, as I am not an expert on DRA and might miss some DRA-specific gaps. For a full LGTM it would be nice to get feedback from more DRA-knowledgeable folks.
    },
    "delta": func() (clustersnapshot.ClusterSnapshot, error) {
        fwHandle, err := framework.NewTestFrameworkHandle()
        if err != nil {
            return nil, err
        }
-       return NewPredicateSnapshot(base.NewDeltaSnapshotBase(), fwHandle), nil
+       return NewPredicateSnapshot(base.NewDeltaSnapshotBase(), fwHandle, false), nil
nit: any particular reason why some tests are initialized with true and some with false?
No, I missed these, thanks. Changed all to true; it seems better to test more code paths in unit tests.
    schedulerframeworkruntime.WithSnapshotSharedLister(sharedLister),
}

if draEnabled {
nit: It would be a bit more readable to place schedProfile := before if draEnabled.
Done
clusterSnapshot := store.NewBasicSnapshotStore()
err := clusterSnapshot.AddNodeInfo(framework.NewTestNodeInfo(node1))
// with default predicate checker
defaultPluginnRunner, clusterSnapshot, err := newTestPluginRunnerAndSnapshot(nil)
typo: defaultPluginnRunner
Done
@@ -305,25 +302,29 @@ func TestDebugInfo(t *testing.T) {

    customConfig, err := scheduler.ConfigFromPath(customConfigFile)
    assert.NoError(t, err)
-   customPluginRunner, err := newTestPluginRunner(clusterSnapshot, customConfig)
+   customPluginnRunner, clusterSnapshot, err := newTestPluginRunnerAndSnapshot(customConfig)
typo: customPluginnRunner
Done
    draEnabled: draEnabled,
}
snapshot.pluginRunner = NewSchedulerPluginRunner(fwHandle, snapshot)
That hurts, but okay.
Agreed, any better ideas 😅?
If golang is a flat earth, I feel like we've fallen off the edge
This one isn't actually a Golang issue, more of a code structure one. I added a comment explaining why it's there and left a TODO for improving it (I think a simple refactor would remove the PluginRunner dependency on PredicateSnapshot). In the interest of time and the 1.32 release I'd prefer not to tackle this one in this PR.
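A possible shape for that refactor, sketched with placeholder types (the real PluginRunner and PredicateSnapshot have much larger APIs); the idea is that the runner depends only on a narrow interface rather than on the concrete snapshot:

// Sketch under assumptions: type and method names are placeholders, not the actual CA code.
package main

// snapshotView is the narrow view the plugin runner actually needs; depending on this
// interface instead of the concrete PredicateSnapshot would loosen the cyclic dependency.
type snapshotView interface {
    NodeNames() []string
}

type SchedulerPluginRunner struct {
    snapshot snapshotView
}

func NewSchedulerPluginRunner(snapshot snapshotView) *SchedulerPluginRunner {
    return &SchedulerPluginRunner{snapshot: snapshot}
}

type PredicateSnapshot struct {
    nodes        []string
    pluginRunner *SchedulerPluginRunner
}

func NewPredicateSnapshot() *PredicateSnapshot {
    s := &PredicateSnapshot{}
    // The runner is still constructed with the snapshot, but it only sees the narrow
    // interface, which keeps the coupling explicit and easier to untangle later.
    s.pluginRunner = NewSchedulerPluginRunner(s)
    return s
}

func (s *PredicateSnapshot) NodeNames() []string { return s.nodes }

func main() {
    _ = NewPredicateSnapshot()
}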
)

// Snapshot contains a snapshot of all DRA objects taken at a ~single point in time.
type Snapshot struct {
I have some reservations about the naming: we will have ClusterSnapshot, which has its ClusterSnapshotStore, which in turn has its drasnapshot.Snapshot. Do we want to use Snapshot here? Maybe SnapshotStore would be better, to make it consistent with our agreed indication of storing data? Maybe even the full DraSnapshotStore, although this name is ugly and I don't like it (it's probably also not consistent with best practices).
Hmm, on the other hand, in the context of the dynamicresources package the name Snapshot seems the most accurate to me (since we don't have the 2 layers or any interfaces here - why "Store"?), and it's very simple. So I guess it's a tradeoff between readability in clustersnapshot and dynamicresources? I'm leaning more towards keeping the simple dynamicresources.Snapshot name, but it's not a strong opinion. @jackfrancis WDYT?
Just to note, I don't have a very hard preference here, I am reasonably okay with merging it as is.
In that case I'd prefer to keep the current name, if only to reduce the rebasing because of time constraints.
I'd lean towards Snapshot as well (w/ the namespacing of the dynamicresources package providing some context to disambiguate between other snapshot things in the source surface area).
Force-pushed from de210cb to 45efec6.
@@ -30,6 +31,19 @@ import (
type ClusterSnapshot interface {
    ClusterSnapshotStore

    // AddNodeInfo adds the given NodeInfo to the snapshot without checking scheduler predicates. The Node and the Pods are added,
Why are these additions here? They're already in the master branch.
lol nevermind I didn't read far enough :) I see they have moved
func profileWithDraPlugin(profile *schedulerconfig.KubeSchedulerProfile) *schedulerconfig.KubeSchedulerProfile {
    result := profile.DeepCopy()
    addPluginIfNotPresent(result.Plugins.PreFilter, draplugin.Name)
    addPluginIfNotPresent(result.Plugins.Filter, draplugin.Name)
    addPluginIfNotPresent(result.Plugins.Reserve, draplugin.Name)
    return result
}
It occurred to me casually: "hmm, could we refactor this for more slice-y expressiveness here, or make addPluginIfNotPresent accept a slice of schedulerconfig.PluginSets?" And then it occurred to me that I wouldn't dare do that without a UT. So here's a basic suggestion:
Current:
func profileWithDraPlugin(profile *schedulerconfig.KubeSchedulerProfile) *schedulerconfig.KubeSchedulerProfile {
    result := profile.DeepCopy()
    addPluginIfNotPresent(result.Plugins.PreFilter, draplugin.Name)
    addPluginIfNotPresent(result.Plugins.Filter, draplugin.Name)
    addPluginIfNotPresent(result.Plugins.Reserve, draplugin.Name)
    return result
}

Suggested:
func profileWithDraPlugin(profile *schedulerconfig.KubeSchedulerProfile) *schedulerconfig.KubeSchedulerProfile {
    result := profile.DeepCopy()
    if result != nil && result.Plugins != nil {
        for _, pluginSet := range []schedulerconfig.PluginSet{
            result.Plugins.PreFilter,
            result.Plugins.Filter,
            result.Plugins.Reserve,
        } {
            addPluginIfNotPresent(pluginSet, draplugin.Name)
        }
    }
    return result
}
And here's a quick and dirty handle_test.go that will protect the existing implementation (this includes my nil-handling additions), so in case someone else thinks to refactor later, the UT will ensure that the outcomes don't change in unexpected ways.
/*
Copyright 2024 The Kubernetes Authors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package framework
import (
    "testing"

    "github.com/google/go-cmp/cmp"
    schedulerconfig "k8s.io/kubernetes/pkg/scheduler/apis/config"
    schedulerconfiglatest "k8s.io/kubernetes/pkg/scheduler/apis/config/latest"
    draplugin "k8s.io/kubernetes/pkg/scheduler/framework/plugins/dynamicresources"
)

// TestProfileWithDraPlugin tests that the profileWithDraPlugin function correctly adds the DRA plugin to the profile.
func TestProfileWithDraPlugin(t *testing.T) {
    schedConfig, err := schedulerconfiglatest.Default()
    if err != nil {
        t.Fatalf("failed to obtain default scheduler config: %v", err)
    }
    if len(schedConfig.Profiles) == 0 {
        t.Fatal("default scheduler config has no profiles")
    }
    schedProfile := &schedConfig.Profiles[0]
    tests := []struct {
        name               string
        profile            *schedulerconfig.KubeSchedulerProfile
        expectedDRAPlugins bool
    }{
        {
            name:               "* to struct literal input",
            profile:            &schedulerconfig.KubeSchedulerProfile{},
            expectedDRAPlugins: false,
        },
        {
            name:               "nil input",
            profile:            nil,
            expectedDRAPlugins: false,
        },
        {
            name:               "default",
            profile:            schedProfile,
            expectedDRAPlugins: true,
        },
    }
    for _, test := range tests {
        test := test
        t.Run(test.name, func(t *testing.T) {
            actual := profileWithDraPlugin(test.profile)
            found := map[string]bool{
                "PreFilter": false,
                "Filter":    false,
                "Reserve":   false,
            }
            if test.expectedDRAPlugins {
                for _, enabledPreFilterPlugin := range actual.Plugins.PreFilter.Enabled {
                    if enabledPreFilterPlugin.Name == draplugin.Name {
                        found["PreFilter"] = true
                    }
                }
                for _, enabledFilterPlugin := range actual.Plugins.Filter.Enabled {
                    if enabledFilterPlugin.Name == draplugin.Name {
                        found["Filter"] = true
                    }
                }
                for _, enabledReservePlugin := range actual.Plugins.Reserve.Enabled {
                    if enabledReservePlugin.Name == draplugin.Name {
                        found["Reserve"] = true
                    }
                }
                want := map[string]bool{
                    "PreFilter": true,
                    "Filter":    true,
                    "Reserve":   true,
                }
                if diff := cmp.Diff(want, found); diff != "" {
                    t.Errorf("unexpected DRA plugin enablement (-want +got):\n%s", diff)
                }
            }
        })
    }
}
Oof, thanks for keeping me honest with the unit testing here. I tried your test, and it turns out the implementation doesn't actually add the DRA plugin to the config at all 🤦 I assumed it did because the integration tests pass, but I checked and the DRA plugin is actually just present in the default config. So no modifications to the plugin sets should be needed here; I removed that part.
I'm pretty sure this was needed before 1.32, but maybe it was a red herring all along...
"DRA plugin is actually just present in the default config"
This discovery isn't problematic, is it?
It shouldn't be; the plugin only does anything if the relevant DRA feature gate is enabled in the cluster. This is what I expected to happen all along - we shouldn't need to hack the plugin sets, the plugins should be enabled/disabled via feature gates without CA having to get involved.
Make SharedDRAManager a part of the ClusterSnapshotStore interface, and implement dummy methods to satisfy the interface. The actual implementation will come in later commits. This is needed so that ClusterSnapshot can feed DRA objects to the DRA scheduler plugin, and obtain ResourceClaim modifications back from it. The integration is behind the DRA flag guard, so this should be a no-op if the flag is disabled.
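As a rough illustration of the "dummy methods first" approach this commit describes, here is a minimal Go sketch; the interface and method names below are placeholders, not the real SharedDRAManager API from the scheduler framework:

// Placeholder sketch: the real SharedDRAManager interface has a different, larger method set.
package main

// draManagerLike stands in for the scheduler framework's SharedDRAManager.
type draManagerLike interface {
    ResourceClaims() []string
}

type basicSnapshotStore struct{}

// ResourceClaims is a stub so that basicSnapshotStore satisfies the interface;
// the actual implementation would come in later commits.
func (s *basicSnapshotStore) ResourceClaims() []string {
    return nil
}

// Compile-time check that the store satisfies the interface.
var _ draManagerLike = &basicSnapshotStore{}

func main() {}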
This should be a no-op, as no DRA objects are passed yet.
…fying DRA objects: A new DRA Snapshot type is introduced, for now with just dummy methods to be implemented in later commits. The new type is intended to hold all DRA objects in the cluster. ClusterSnapshotStore.SetClusterState() is extended to take the new DRA Snapshot in addition to the existing parameters. ClusterSnapshotStore.DraSnapshot() is added to retrieve the DRA snapshot set by SetClusterState() back. This will be used by PredicateSnapshot to implement DRA logic later. This should be a no-op, as DraSnapshot() is never called, and no DRA snapshot is passed to SetClusterState() yet.
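A compressed sketch of the interface changes this commit describes, with placeholder types (the real SetClusterState also takes nodes, pods, and other existing parameters):

// Sketch only: field and parameter lists are simplified placeholders.
package main

// draSnapshot stands in for the new dynamicresources Snapshot type holding all DRA objects.
type draSnapshot struct{}

type clusterSnapshotStore struct {
    dra draSnapshot
}

// SetClusterState now also receives the DRA snapshot alongside the existing parameters.
func (s *clusterSnapshotStore) SetClusterState(dra draSnapshot) {
    s.dra = dra
}

// DraSnapshot returns the DRA snapshot previously passed to SetClusterState; it will be
// used by PredicateSnapshot to implement DRA-aware logic.
func (s *clusterSnapshotStore) DraSnapshot() draSnapshot {
    return s.dra
}

func main() {
    store := &clusterSnapshotStore{}
    store.SetClusterState(draSnapshot{})
    _ = store.DraSnapshot()
}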
Store the DRA snapshot inside the current internal data in SetClusterState(). Retrieve the DRA snapshot from the current internal data in DraSnapshot(). Clone the DRA snapshot whenever the internal data is cloned during Fork(). This matches the forking logic that BasicSnapshotStore uses, ensuring that the DRA object state is correctly forked/committed/reverted during the corresponding ClusterSnapshot operations. This should be a no-op, as DraSnapshot() isn't called anywhere yet, and no DRA snapshot is passed to SetClusterState() yet.
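To make the fork/commit/revert behavior concrete, a small self-contained sketch of the cloning pattern follows; the real BasicSnapshotStore layers much more data than shown, and all names here are illustrative:

// Illustrative sketch; not the actual snapshot store implementation.
package main

import "fmt"

// draSnapshot is a stand-in for the DRA snapshot; a map plays the role of the DRA objects.
type draSnapshot struct {
    claims map[string]string
}

func (d draSnapshot) clone() draSnapshot {
    out := draSnapshot{claims: make(map[string]string, len(d.claims))}
    for k, v := range d.claims {
        out.claims[k] = v
    }
    return out
}

// internalData mirrors the "current internal data" the commit refers to.
type internalData struct {
    dra draSnapshot
}

type snapshotStore struct {
    layers []internalData // the last element is the current state
}

// Fork pushes a cloned layer, so later DRA changes can be committed or reverted.
func (s *snapshotStore) Fork() {
    cur := s.layers[len(s.layers)-1]
    s.layers = append(s.layers, internalData{dra: cur.dra.clone()})
}

// Revert drops the forked layer, restoring the previous DRA state.
func (s *snapshotStore) Revert() {
    s.layers = s.layers[:len(s.layers)-1]
}

// Commit collapses the forked layer into its parent.
func (s *snapshotStore) Commit() {
    top := s.layers[len(s.layers)-1]
    s.layers = s.layers[:len(s.layers)-1]
    s.layers[len(s.layers)-1] = top
}

func main() {
    s := &snapshotStore{layers: []internalData{{dra: draSnapshot{claims: map[string]string{}}}}}
    s.Fork()
    s.layers[len(s.layers)-1].dra.claims["claim-1"] = "allocated"
    s.Revert()
    fmt.Println(len(s.layers[0].dra.claims)) // 0: the change was discarded with the fork
}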
Force-pushed from 45efec6 to ce440ac.
All the NodeInfo methods have to take DRA into account, and the logic for that will be the same for different ClusterSnapshotStore implementations. Instead of duplicating the new logic in Basic and Delta, the methods are moved to ClusterSnapshot and the logic will be implemented once in PredicateSnapshot. PredicateSnapshot will use the DRA Snapshot exposed by its ClusterSnapshotStore to implement these methods. The DRA Snapshot has to be stored in the ClusterSnapshotStore layer, as we need to be able to fork/commit/revert it. Lower-level methods for adding/removing just the schedulerframework.NodeInfo parts are added to ClusterSnapshotStore. PredicateSnapshot utilizes these methods to implement AddNodeInfo and RemoveNodeInfo. This should be a no-op, it's just a refactor.
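A rough sketch of the layering this commit describes, with simplified placeholder signatures (the real clustersnapshot types differ): PredicateSnapshot implements AddNodeInfo once, on top of lower-level store methods plus the DRA snapshot.

// Sketch only: signatures are simplified placeholders for the real clustersnapshot types.
package main

import "fmt"

type nodeInfo struct{ name string }

// draSnapshot stands in for the DRA snapshot exposed by the store.
type draSnapshot struct{ objects []string }

func (d *draSnapshot) addNodeObjects(node string) { d.objects = append(d.objects, node) }

// snapshotStore models the lower-level ClusterSnapshotStore methods the commit mentions.
type snapshotStore interface {
    AddSchedulerNodeInfo(n nodeInfo) error // only the schedulerframework parts
    DraSnapshot() *draSnapshot
}

type predicateSnapshot struct {
    store snapshotStore
}

// AddNodeInfo adds the scheduler-visible parts via the store and the node's DRA objects via
// the DRA snapshot, so the DRA-aware logic lives in one place for both Basic and Delta stores.
func (s *predicateSnapshot) AddNodeInfo(n nodeInfo) error {
    if err := s.store.AddSchedulerNodeInfo(n); err != nil {
        return err
    }
    s.store.DraSnapshot().addNodeObjects(n.name)
    return nil
}

type fakeStore struct{ dra draSnapshot }

func (f *fakeStore) AddSchedulerNodeInfo(nodeInfo) error { return nil }
func (f *fakeStore) DraSnapshot() *draSnapshot           { return &f.dra }

func main() {
    ps := &predicateSnapshot{store: &fakeStore{}}
    fmt.Println(ps.AddNodeInfo(nodeInfo{name: "node-1"}))
}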
RunReserveOnNode runs the Reserve phase of schedulerframework, which is necessary to obtain ResourceClaim allocations computed by the DRA scheduler plugin. RunReserveOnNode isn't used anywhere yet, so this should be a no-op.
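For orientation, a minimal sketch of how a RunReserveOnNode-style helper could delegate to the Reserve phase; the real code goes through the scheduler framework handle, so the tiny interface and names below are assumptions that keep the example self-contained:

// Sketch only: the framework is stubbed with a placeholder interface for illustration.
package main

import (
    "context"
    "fmt"
)

type reserveRunner interface {
    // Mirrors the shape of the scheduler framework's Reserve phase: plugins such as the
    // DRA plugin get a chance to reserve resources (e.g. allocate ResourceClaims) on the node.
    RunReserve(ctx context.Context, podName, nodeName string) error
}

type fakeFramework struct{}

func (fakeFramework) RunReserve(_ context.Context, podName, nodeName string) error {
    fmt.Printf("reserved resources for %s on %s\n", podName, nodeName)
    return nil
}

type SchedulerPluginRunner struct {
    fw reserveRunner
}

// RunReserveOnNode runs the Reserve phase for the pod on the given node, which is how
// ResourceClaim allocations computed by the DRA scheduler plugin would be obtained.
func (r *SchedulerPluginRunner) RunReserveOnNode(ctx context.Context, podName, nodeName string) error {
    return r.fw.RunReserve(ctx, podName, nodeName)
}

func main() {
    r := &SchedulerPluginRunner{fw: fakeFramework{}}
    _ = r.RunReserveOnNode(context.Background(), "pod-1", "node-1")
}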
Force-pushed from ce440ac to 0691512.
/lgtm
/unhold
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
This is a part of Dynamic Resource Allocation (DRA) support in Cluster Autoscaler.
This PR implements the first bits of the DRA integration in CA:
- CA uses the SharedDRAManager interface introduced in "DRA: allow Cluster Autoscaler to integrate with DRA scheduler plugin" (kubernetes#127904).
- The ClusterSnapshot interface is extended to cover DRA objects.
- SchedulerPluginRunner is extended so that it can run the Reserve phase of the framework.
Which issue(s) this PR fixes:
The CA/DRA integration is tracked in kubernetes/kubernetes#118612, this is just part of the implementation.
Special notes for your reviewer:
This is mostly a refactor and some preparations for the actual DRA logic in the next PR. All new logic is behind a DRA flag guard, the intention is to have no change in behavior if the flag is disabled.
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: