Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dualstack CNS Changes #1773

Merged
merged 80 commits into from
Apr 4, 2023
Merged

Conversation

rjdenney
Copy link
Contributor

Reason for Change:

This is the CNS side of changes for dualstack in Linux in Windows. Azure-ipam changes will follow once this is merged

Issue Fixed:

Requirements:

Notes:

@rjdenney rjdenney added the cns Related to CNS. label Jan 25, 2023
@rjdenney rjdenney requested a review from a team as a code owner January 25, 2023 22:16
@rjdenney rjdenney requested review from rsagasthya and removed request for a team January 25, 2023 22:16
@rjdenney rjdenney force-pushed the dualstack-cilium-implementation-CNS branch 19 times, most recently from 69e8dcc to 86c55e7 Compare January 31, 2023 22:27
@rjdenney rjdenney mentioned this pull request Feb 1, 2023
3 tasks
@rjdenney rjdenney requested review from rbtr and thatmattlong February 3, 2023 00:03
} else {
logger.Errorf("[AssignDesiredIPConfigs] Desired IP is already assigned %+v, requested for pod %+v", ipConfig, podInfo)
//nolint:goerr113 // return error
return podIPInfo, fmt.Errorf("IP already assigned to another pod")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you return here, what happens to the other desired IPs you may have assigned already in this loop?
is the caller function releasing those back?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All assigning is done after this loop now

return podIPInfo, fmt.Errorf("IP already assigned to another pod")
}
case types.Available, types.PendingProgramming:
// This race can happen during restart, where CNS state is lost and thus we have lost the NC programmed version
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the original function is calling service.assignIPConfig. Why is that not getting called anymore?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this may indicate that there is no test catching this bug.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is getting called outside of this for loop in the next for loop. To avoid assigning IPs and then needing to release them again I just add them to a map here (now changed to a slice) to be looped through and added

default:
logger.Errorf("[AssignDesiredIPConfigs] Desired IP is not available %+v", ipConfig)
//nolint:goerr113 // return error
return podIPInfo, fmt.Errorf("IO not available")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as above -
if you return here, what happens to the other desired IPs you may have assigned already in this loop?
is the caller function releasing those back?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing will be assigned yet.

if err := service.populateIPConfigInfoUntransacted(ipState, &podIPInfo); err != nil {
return cns.PodIpInfo{}, err
}
podIPctr := 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: for clarity please use vars such as numIpConfigsToAssign and numIpConfigsAssigned

}

// If IPConfig is already assigned to pod, it returns that else it returns one of the available ipconfigs.
func requestIPConfigHelper(service *HTTPRestService, req cns.IPConfigRequest) (cns.PodIpInfo, error) {
func requestIPConfigHelper(service *HTTPRestService, req cns.IPConfigsRequest) ([]cns.PodIpInfo, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should the func name be updated to reflect it's handling multiple IP configs

t.Fatal("Expected available ips to be zero since we expect the IP to still be assigned")
}
}
// func TestIPAMFailToReleaseOneIPWhenExpectedToHaveTwo(t *testing.T) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function will need to be updated which is why it is commented out right now


_, err = requestIPAddressAndGetState(t, req)
_, err := requestIPAddressAndGetState(t, req)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should add a test where you specify 2 desired IPs - first one is valid while second one is invald.
In addition to validating that we get non-nil error, we should also confirm that we are not partially assigning IPs. This will add coverage to the code where in case of partial success, we release the assigned IP back to the pool

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add this

ncIDs := []string{testNCID, testNCIDv6}
IPs := []string{testIP1, testIP1v6}
prefixes := []uint8{IPPrefixBitsv4, IPPrefixBitsv6}
IPAMFailToGetDesiredIPConfigWithAlreadyAssignedSpecfiedIP(t, ncIDs, IPs, prefixes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should validate that we don't allow partial success

}
err := service.populateIPConfigInfoUntransacted(ipConfig, &podIpInfo)
return podIpInfo, err
case types.Available, types.PendingProgramming:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add a todo / question on why we would attempt to assign ip which is pendingProgramming?
If the IP state is pendingProgramming for the desired IP, we should fail the request as it may cause the data path issue if we assign it. @rbtr

Copy link
Contributor Author

@rjdenney rjdenney Mar 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add something. I thought this was odd myself when I first saw it in the code. I believe the comment below is supposed to explain why and it sounds like we need this for when CNS restarts and marking everything back as assigned that has already been assigned. I would like to know more about this though.

// This race can happen during restart, where CNS state is lost and thus we have lost the NC programmed version
// As part of reconcile, we mark IPs as Assigned which are already assigned to Pods (listed from APIServer)

ashvindeodhar
ashvindeodhar previously approved these changes Mar 31, 2023
rbtr
rbtr previously approved these changes Apr 3, 2023
Copy link
Contributor

@rbtr rbtr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"approved with suggestions"
mostly to remove my block

@ashvindeodhar
Copy link
Member

Thanks @rbtr for the review. @rjdenney can you pls rebase the PR and merge? Pls address the comments from Evan in a fast follow PR

@rjdenney rjdenney dismissed stale reviews from rbtr and ashvindeodhar via 30a15c2 April 3, 2023 23:15
@rjdenney rjdenney force-pushed the dualstack-cilium-implementation-CNS branch from 30a15c2 to a33291a Compare April 3, 2023 23:18
@ashvindeodhar ashvindeodhar merged commit 3e06a07 into master Apr 4, 2023
@ashvindeodhar ashvindeodhar deleted the dualstack-cilium-implementation-CNS branch April 4, 2023 16:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cns Related to CNS.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants