-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
node-group-auto-discovery support for oci #7403
Changes from 6 commits
e4ff4ad
5ae6fa4
ee70dce
02c1e04
e79bffe
cf061f3
6396e63
1bf5f72
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -9,6 +9,7 @@ import ( | |
"fmt" | ||
"math" | ||
"os" | ||
"regexp" | ||
"strconv" | ||
"strings" | ||
"time" | ||
|
@@ -34,6 +35,11 @@ import ( | |
const ( | ||
maxAddTaintRetries = 5 | ||
maxGetNodepoolRetries = 3 | ||
clusterId = "clusterId" | ||
compartmentId = "compartmentId" | ||
nodepoolTags = "nodepoolTags" | ||
min = "min" | ||
max = "max" | ||
) | ||
|
||
var ( | ||
|
@@ -75,10 +81,11 @@ type okeClient interface { | |
GetNodePool(context.Context, oke.GetNodePoolRequest) (oke.GetNodePoolResponse, error) | ||
UpdateNodePool(context.Context, oke.UpdateNodePoolRequest) (oke.UpdateNodePoolResponse, error) | ||
DeleteNode(context.Context, oke.DeleteNodeRequest) (oke.DeleteNodeResponse, error) | ||
ListNodePools(ctx context.Context, request oke.ListNodePoolsRequest) (oke.ListNodePoolsResponse, error) | ||
} | ||
|
||
// CreateNodePoolManager creates an NodePoolManager that can manage autoscaling node pools | ||
func CreateNodePoolManager(cloudConfigPath string, discoveryOpts cloudprovider.NodeGroupDiscoveryOptions, kubeClient kubernetes.Interface) (NodePoolManager, error) { | ||
func CreateNodePoolManager(cloudConfigPath string, nodeGroupAutoDiscoveryList []string, discoveryOpts cloudprovider.NodeGroupDiscoveryOptions, kubeClient kubernetes.Interface) (NodePoolManager, error) { | ||
|
||
var err error | ||
var configProvider common.ConfigurationProvider | ||
|
@@ -151,6 +158,20 @@ func CreateNodePoolManager(cloudConfigPath string, discoveryOpts cloudprovider.N | |
nodePoolCache: newNodePoolCache(&okeClient), | ||
} | ||
|
||
// auto discover nodepools from compartments with nodeGroupAutoDiscovery parameter | ||
klog.Infof("checking node groups for autodiscovery ... ") | ||
for _, arg := range nodeGroupAutoDiscoveryList { | ||
nodeGroup, err := nodeGroupFromArg(arg) | ||
if err != nil { | ||
return nil, fmt.Errorf("unable to construct node group auto discovery from argument: %v", err) | ||
} | ||
nodeGroup.manager = manager | ||
nodeGroup.kubeClient = kubeClient | ||
|
||
manager.nodeGroups = append(manager.nodeGroups, *nodeGroup) | ||
autoDiscoverNodeGroups(manager, manager.okeClient, *nodeGroup) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Given that the auto discovery happens in There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nevermind, below in the forceRefresh function, we also |
||
} | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It seems like node-pools that were explicitly configured via Do you agree? That also raises the question of the expected behavior of, say, the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
What I can think of as a solution,
Please let me know of your thoughts and I will proceed accordingly to make the changes. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The convention seems to be to warn against using it in the docs[1,2], and/or disallow [1] it in the code. I'm fine with either documenting it and/or errorring out. As you mentioned, currently the code quietly overrides any static node-pools while also logging messages as it processes each static-node pool, which could cause confusion. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I added extra checks to prevent using both parameters together, and also documented it in oci/README. |
||
// Contains all the specs from the args that give us the pools. | ||
for _, arg := range discoveryOpts.NodeGroupSpecs { | ||
np, err := nodePoolFromArg(arg) | ||
|
@@ -180,6 +201,48 @@ func CreateNodePoolManager(cloudConfigPath string, discoveryOpts cloudprovider.N | |
return manager, nil | ||
} | ||
|
||
func autoDiscoverNodeGroups(m *ociManagerImpl, okeClient okeClient, nodeGroup nodeGroupAutoDiscovery) (bool, error) { | ||
var resp, reqErr = okeClient.ListNodePools(context.Background(), oke.ListNodePoolsRequest{ | ||
ClusterId: common.String(nodeGroup.clusterId), | ||
CompartmentId: common.String(nodeGroup.compartmentId), | ||
}) | ||
if reqErr != nil { | ||
klog.Errorf("failed to fetch the nodepool list with clusterId: %s, compartmentId: %s. Error: %v", nodeGroup.clusterId, nodeGroup.compartmentId, reqErr) | ||
return false, reqErr | ||
} | ||
for _, nodePoolSummary := range resp.Items { | ||
if validateNodepoolTags(nodeGroup.tags, nodePoolSummary.FreeformTags, nodePoolSummary.DefinedTags) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There are a few types of tags including defined tags and free form tags. As I understand it, user defined tags on a Node Pool resource would appear in the form tags (i.e. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It is not only Freeform tags. Users can also create their own namespace and defined tags. We check both of them to make sure we don't miss a tag applied by the user. Defined tag holds a namespace but FreeForm tag does not.
When we query Nodepool through api, the response returns them in separate fields.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see. |
||
nodepool := &nodePool{} | ||
nodepool.id = *nodePoolSummary.Id | ||
nodepool.minSize = nodeGroup.minSize | ||
nodepool.maxSize = nodeGroup.maxSize | ||
|
||
nodepool.manager = nodeGroup.manager | ||
nodepool.kubeClient = nodeGroup.kubeClient | ||
|
||
m.staticNodePools[nodepool.id] = nodepool | ||
klog.V(5).Infof("auto discovered nodepool in compartment : %s , nodepoolid: %s", nodeGroup.compartmentId, nodepool.id) | ||
} else { | ||
klog.Warningf("nodepool ignored as the tags do not satisfy the requirement : %s , %v, %v", *nodePoolSummary.Id, nodePoolSummary.FreeformTags, nodePoolSummary.DefinedTags) | ||
} | ||
} | ||
return true, nil | ||
} | ||
|
||
func validateNodepoolTags(nodeGroupTags map[string]string, freeFormTags map[string]string, definedTags map[string]map[string]interface{}) bool { | ||
if nodeGroupTags != nil { | ||
for tagKey, tagValue := range nodeGroupTags { | ||
namespacedTagKey := strings.Split(tagKey, ".") | ||
if len(namespacedTagKey) == 2 && tagValue != definedTags[namespacedTagKey[0]][namespacedTagKey[1]] { | ||
return false | ||
} else if len(namespacedTagKey) != 2 && tagValue != freeFormTags[tagKey] { | ||
return false | ||
} | ||
} | ||
} | ||
return true | ||
} | ||
|
||
// nodePoolFromArg parses a node group spec represented in the form of `<minSize>:<maxSize>:<ocid>` and produces a node group spec object | ||
func nodePoolFromArg(value string) (*nodePool, error) { | ||
tokens := strings.SplitN(value, ":", 3) | ||
|
@@ -207,6 +270,78 @@ func nodePoolFromArg(value string) (*nodePool, error) { | |
return spec, nil | ||
} | ||
|
||
// nodeGroupFromArg parses a node group spec represented in the form of | ||
// `clusterId:<clusterId>,compartmentId:<compartmentId>,nodepoolTags:<tagKey1>=<tagValue1>&<tagKey2>=<tagValue2>,min:<min>,max:<max>` | ||
// and produces a node group auto discovery object, | ||
// nodepoolTags are optional and CA will capture all nodes if no tags are provided. | ||
func nodeGroupFromArg(value string) (*nodeGroupAutoDiscovery, error) { | ||
// this regex will find the key-value pairs in any given order if separated with a colon | ||
regexPattern := `(?:` + compartmentId + `:(?P<` + compartmentId + `>[^,]+)` | ||
regexPattern = regexPattern + `|` + nodepoolTags + `:(?P<` + nodepoolTags + `>[^,]+)` | ||
regexPattern = regexPattern + `|` + max + `:(?P<` + max + `>[^,]+)` | ||
regexPattern = regexPattern + `|` + min + `:(?P<` + min + `>[^,]+)` | ||
regexPattern = regexPattern + `|` + clusterId + `:(?P<` + clusterId + `>[^,]+)` | ||
regexPattern = regexPattern + `)(?:,|$)` | ||
|
||
re := regexp.MustCompile(regexPattern) | ||
|
||
parametersMap := make(map[string]string) | ||
|
||
// push key-value pairs into a map | ||
for _, match := range re.FindAllStringSubmatch(value, -1) { | ||
for i, name := range re.SubexpNames() { | ||
if i != 0 && match[i] != "" { | ||
parametersMap[name] = match[i] | ||
} | ||
} | ||
} | ||
|
||
spec := &nodeGroupAutoDiscovery{} | ||
|
||
if parametersMap[clusterId] != "" { | ||
spec.clusterId = parametersMap[clusterId] | ||
} else { | ||
return nil, fmt.Errorf("failed to set %s, it is missing in node-group-auto-discovery parameter", clusterId) | ||
} | ||
|
||
if parametersMap[compartmentId] != "" { | ||
spec.compartmentId = parametersMap[compartmentId] | ||
} else { | ||
return nil, fmt.Errorf("failed to set %s, it is missing in node-group-auto-discovery parameter", compartmentId) | ||
} | ||
|
||
if size, err := strconv.Atoi(parametersMap[min]); err == nil { | ||
spec.minSize = size | ||
} else { | ||
return nil, fmt.Errorf("failed to set %s size: %s, expected integer", min, parametersMap[min]) | ||
} | ||
|
||
if size, err := strconv.Atoi(parametersMap[max]); err == nil { | ||
spec.maxSize = size | ||
} else { | ||
return nil, fmt.Errorf("failed to set %s size: %s, expected integer", max, parametersMap[max]) | ||
} | ||
|
||
if parametersMap[nodepoolTags] != "" { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is it valid not to specify node pool tags? Or are they required? This will silently continue on if there are no nodePoolTags specified. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I have made nodepooltags optional at the beginning, but then I noticed we may also need to support instancepooltags after @jlamillan's feedbacks. |
||
nodepoolTags := parametersMap[nodepoolTags] | ||
|
||
spec.tags = make(map[string]string) | ||
|
||
pairs := strings.Split(nodepoolTags, "&") | ||
|
||
for _, pair := range pairs { | ||
parts := strings.Split(pair, "=") | ||
if len(parts) == 2 { | ||
spec.tags[parts[0]] = parts[1] | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should we be returning an error if the length is not 2? Or is that a valid use case? Right now we will just silently continue on. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yes, this would actually be a formatting error, I've added an else statement to fix it. |
||
} | ||
} | ||
|
||
klog.Infof("node group auto discovery spec constructed: %+v", spec) | ||
|
||
return spec, nil | ||
} | ||
|
||
type ociManagerImpl struct { | ||
cfg *ocicommon.CloudConfig | ||
okeClient okeClient | ||
|
@@ -215,6 +350,7 @@ type ociManagerImpl struct { | |
ociTagsGetter ocicommon.TagsGetter | ||
registeredTaintsGetter RegisteredTaintsGetter | ||
staticNodePools map[string]NodePool | ||
nodeGroups []nodeGroupAutoDiscovery | ||
|
||
lastRefresh time.Time | ||
|
||
|
@@ -253,6 +389,15 @@ func (m *ociManagerImpl) TaintToPreventFurtherSchedulingOnRestart(nodes []*apiv1 | |
} | ||
|
||
func (m *ociManagerImpl) forceRefresh() error { | ||
// auto discover node groups | ||
if m.nodeGroups != nil { | ||
// empty previous nodepool map to do an auto discovery | ||
m.staticNodePools = make(map[string]NodePool) | ||
for _, nodeGroup := range m.nodeGroups { | ||
autoDiscoverNodeGroups(m, m.okeClient, nodeGroup) | ||
} | ||
} | ||
// rebuild nodepool cache | ||
err := m.nodePoolCache.rebuild(m.staticNodePools, maxGetNodepoolRetries) | ||
if err != nil { | ||
return err | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -336,6 +336,10 @@ func (c mockOKEClient) DeleteNode(context.Context, oke.DeleteNodeRequest) (oke.D | |
}, nil | ||
} | ||
|
||
func (c mockOKEClient) ListNodePools(context.Context, oke.ListNodePoolsRequest) (oke.ListNodePoolsResponse, error) { | ||
return oke.ListNodePoolsResponse{}, nil | ||
} | ||
|
||
func TestRemoveInstance(t *testing.T) { | ||
instanceId1 := "instance1" | ||
instanceId2 := "instance2" | ||
|
@@ -384,3 +388,70 @@ func TestRemoveInstance(t *testing.T) { | |
} | ||
} | ||
} | ||
|
||
func TestNodeGroupFromArg(t *testing.T) { | ||
var nodeGroupArg = "clusterId:testClusterId,compartmentId:testCompartmentId,nodepoolTags:ca-managed=true&namespace.foo=bar,min:1,max:5" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Minor nit, but can we update the IDs to look like real ocids just in case there are weird parsing bugs. For example
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. done. |
||
nodeGroupAutoDiscovery, err := nodeGroupFromArg(nodeGroupArg) | ||
if err != nil { | ||
t.Errorf("Error: #{err}") | ||
} | ||
if nodeGroupAutoDiscovery.clusterId != "testClusterId" { | ||
t.Errorf("Error: clusterId should be testClusterId") | ||
} | ||
if nodeGroupAutoDiscovery.compartmentId != "testCompartmentId" { | ||
t.Errorf("Error: compartmentId should be testCompartmentId") | ||
} | ||
if nodeGroupAutoDiscovery.minSize != 1 { | ||
t.Errorf("Error: minSize should be 1") | ||
} | ||
if nodeGroupAutoDiscovery.maxSize != 5 { | ||
t.Errorf("Error: maxSize should be 5") | ||
} | ||
if nodeGroupAutoDiscovery.tags["ca-managed"] != "true" { | ||
t.Errorf("Error: ca-managed:true is missing in tags.") | ||
} | ||
if nodeGroupAutoDiscovery.tags["namespace.foo"] != "bar" { | ||
t.Errorf("Error: namespace.foo:bar is missing in tags.") | ||
} | ||
} | ||
|
||
func TestValidateNodePoolTags(t *testing.T) { | ||
|
||
var nodeGroupTags map[string]string = nil | ||
var nodePoolTags map[string]string = nil | ||
var definedTags map[string]map[string]interface{} = nil | ||
|
||
if validateNodepoolTags(nodeGroupTags, nodePoolTags, definedTags) == false { | ||
t.Errorf("validateNodepoolTags shouldn't return false for empty tags map") | ||
} | ||
|
||
nodeGroupTags = make(map[string]string) | ||
nodeGroupTags["test"] = "test" | ||
|
||
if validateNodepoolTags(nodeGroupTags, nodePoolTags, definedTags) == true { | ||
t.Errorf("validateNodepoolTags shouldn't return true for tags missing") | ||
} | ||
|
||
nodePoolTags = make(map[string]string) | ||
nodePoolTags["foo"] = "bar" | ||
|
||
if validateNodepoolTags(nodeGroupTags, nodePoolTags, definedTags) == true { | ||
t.Errorf("validateNodepoolTags shouldn't return true for not matching tags") | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Any reason not to do something like a traditional table driven test like this? https://go.dev/wiki/TableDrivenTests My main concern with the current test is that if someone adds a new test case in the middle of this, it will mess up every test that comes after it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I refactored this test to meet the table driven test requirements. |
||
|
||
nodePoolTags["test"] = "test" | ||
|
||
if validateNodepoolTags(nodeGroupTags, nodePoolTags, definedTags) == false { | ||
t.Errorf("validateNodepoolTags shouldn't return false for matching tags") | ||
} | ||
|
||
nodeGroupTags["ns.tag1"] = "tag2" | ||
definedTagsMap := make(map[string]interface{}) | ||
definedTagsMap["tag1"] = "tag2" | ||
definedTags = make(map[string]map[string]interface{}) | ||
definedTags["ns"] = definedTagsMap | ||
|
||
if validateNodepoolTags(nodeGroupTags, nodePoolTags, definedTags) == false { | ||
t.Errorf("validateNodepoolTags shouldn't return false for namespaced tags") | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are all of the fields in this required? Or are any optional? Can we specify in the comment string above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all of the fields are mandatory. I've added a statement to make it clear in readme.