[umbrella] Add Implementation for Service discovery with native Kubernetes naming and resolution #4292

Open
14 of 25 tasks
jwcesign opened this issue Nov 21, 2023 · 6 comments
Assignees
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/feature Categorizes issue or PR as related to a new feature.

Comments

@jwcesign
Member

jwcesign commented Nov 21, 2023

What would you like to be added:
In #4287, we propose a way to implement service discovery with native Kubernetes naming and resolution.

The specific work items are as follows:

release-1.8 (deadline: 1130)

release-1.9 iteration-1 (deadline: 1207)

iteration-2 (deadline: 1223)

Legacy Issues:
Basic:

  • Currently, MCS, Service, the provider EndpointSlice Work, and the consumer EndpointSlice Work exist independently of each other. In the future, we are considering coordinating them through the MCS status to improve process control.
  • Sort out abnormal scenarios and provide comprehensive events for traffic anomalies.
  • Organize performance points, provide comprehensive metrics, and facilitate performance observation.
  • Construct a comprehensive mcs e2e test.
  • Construct a comprehensive mcs unit test.

Conflict:

  • After an EndpointSlice is dispatched to a consumption cluster, the label resourcetemplate.karmada.io/uid is empty. We should check whether this is acceptable.
  • Conflict with MCI/LoadBalancer MCS: both MCI and LoadBalancer MCS synchronize EndpointSlices to the management side as Work. This creates a Work management conflict with CrossCluster MCS that needs to be resolved.
  • Conflict with PP/CPP: both MCS and PP/CPP control the same RB.

Performance:

  • Improving MCS performance: with 20 clusters, where ProviderCluster and ConsumerCluster are both left empty, a single Service generates 20*19=380 Works. In actual tests, Work synchronization is relatively slow (on the order of minutes).
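
To illustrate how the Work count grows, here is a minimal sketch. It assumes that leaving ProviderCluster and ConsumerCluster empty means every cluster acts as both provider and consumer, and that each provider contributes exactly one EndpointSlice that is dispatched to every other cluster. The function name `worksPerService` is hypothetical and not part of Karmada.

```go
package main

import "fmt"

// worksPerService estimates how many Works one Service produces when all
// clusters are both providers and consumers (assumption): each of the n
// providers has one EndpointSlice dispatched to the other n-1 clusters.
func worksPerService(clusters int) int {
	if clusters < 2 {
		return 0 // with fewer than two clusters there is nothing to dispatch
	}
	return clusters * (clusters - 1)
}

func main() {
	for _, n := range []int{2, 5, 20, 50} {
		fmt.Printf("%d clusters -> %d Works per Service\n", n, worksPerService(n))
	}
}
```

Under these assumptions the count grows quadratically with the number of clusters, which is why restricting the provider or consumer list should reduce the synchronization load.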

Reliability:

  • Reliability issues: consider how MCS avoids service access anomalies in the following abnormal scenarios.
    - The Karmada control plane fails, and the EndpointSlices in member clusters become invalid.
    - The east-west network between clusters fails, causing service access anomalies.
    - The intra-cluster network fails but the cluster health check still passes, causing cross-cluster access anomalies.
    - The cluster apiserver fails while the network is normal, causing anomalies in EndpointSlice synchronization and deletion.

Lessons Learned:

  • Based on the MCS experience, distill implementation standards for components, covering anomalies, events, performance, and monitoring metrics. At the same time, define how to resolve conflicts with existing resources through scenario-based Policy.

Why is this needed:

Ref proposal #4287

Help

Anyone who wants to work on one of the tasks is welcome to claim it. Let me know in the comments and I will mark your name next to the corresponding task item. Thanks!

/help

@jwcesign jwcesign added the kind/feature Categorizes issue or PR as related to a new feature. label Nov 21, 2023
@karmada-bot
Collaborator

@jwcesign:
This request has been marked as needing help from a contributor.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

What would you like to be added:
In #4287, we propose a way to implement service discovery with native Kubernetes naming and resolution.

The specific work items are as follows:

release-1.8(1130)

release-1.9(1215)

  • For mcs-controller, list and watch cluster creation/deletion, and reconcile the Work in the corresponding cluster execution namespace.
  • For mcs-eps-controller, list and watch cluster creation/deletion, and reconcile the EndpointSlice's Work in the corresponding cluster execution namespace.
  • If a cluster becomes unhealthy, mcs-eps-controller should delete the EndpointSlice from all cluster execution namespaces.
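
The reconcile step in the items above can be sketched as a set difference between the namespaces that should hold Work and the ones that currently do. This is only an illustration: the helper `diffExecutionNamespaces` is hypothetical, the inputs are plain string slices rather than informer caches, and it assumes execution namespaces are named `karmada-es-<cluster>`.

```go
package main

import (
	"fmt"
	"sort"
)

// Assumed naming convention for a cluster's execution namespace.
const executionSpacePrefix = "karmada-es-"

// diffExecutionNamespaces (hypothetical helper) compares the execution
// namespaces of currently healthy clusters with the namespaces that already
// hold Work, and returns which namespaces need Work created or deleted.
func diffExecutionNamespaces(healthyClusters, existing []string) (toCreate, toDelete []string) {
	desired := make(map[string]bool, len(healthyClusters))
	for _, c := range healthyClusters {
		desired[executionSpacePrefix+c] = true
	}
	have := make(map[string]bool, len(existing))
	for _, ns := range existing {
		have[ns] = true
		if !desired[ns] {
			toDelete = append(toDelete, ns) // cluster removed or unhealthy
		}
	}
	for ns := range desired {
		if !have[ns] {
			toCreate = append(toCreate, ns) // newly joined cluster
		}
	}
	sort.Strings(toCreate)
	sort.Strings(toDelete)
	return toCreate, toDelete
}

func main() {
	create, remove := diffExecutionNamespaces(
		[]string{"member1", "member3"},
		[]string{"karmada-es-member1", "karmada-es-member2"},
	)
	fmt.Println("create:", create)
	fmt.Println("delete:", remove)
}
```

A real controller would react to cluster events from an informer and operate on Work objects; the sketch only shows the decision of where Work should exist after a cluster is added, removed, or marked unhealthy.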

Why is this needed:

Ref proposal #4287

Help

Anyone who wants to work on one of the tasks is welcome to claim it. Let me know in the comments and I will mark your name next to the corresponding task item. Thanks!

/help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@karmada-bot karmada-bot added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Nov 21, 2023
@RainbowMango
Member

/assign @jwcesign @Rains6

@karmada-bot
Collaborator

@RainbowMango: GitHub didn't allow me to assign the following users: Rains6.

Note that only karmada-io members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to this:

/assign @jwcesign @Rains6


@RainbowMango RainbowMango moved this to In Progress in Karmada Release 1.8 Nov 24, 2023
@jwcesign
Member Author

jwcesign commented Nov 25, 2023

Now, the EPS synchronization logic is as follows:

1. List and watch MCS, and build an informer to list and watch the EPS in ServiceProvisionClusters.
2. List and watch the EPS in ServiceProvisionClusters.
3. Create the corresponding Work.
4. Sync the EndpointSlice's Work from step 3 to ServiceConsumptionClusters. The EndpointSlice name is changed to {source-cluster-name}-{endpointslice-name}.
5. The Work triggers creation of the EndpointSlice in ServiceConsumptionClusters.
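
The renaming and fan-out in steps 4 and 5 can be sketched roughly as follows. The type `dispatch` and the function `planDispatch` are hypothetical names for illustration only, and skipping the source cluster when it is also a consumer is an assumption rather than confirmed behavior.

```go
package main

import (
	"fmt"
	"sort"
)

// dispatch describes one EndpointSlice to be created in a consumption cluster.
type dispatch struct {
	SourceCluster string
	TargetCluster string
	Name          string // {source-cluster-name}-{endpointslice-name}
}

// planDispatch (hypothetical) expands the EndpointSlices collected from each
// provision cluster into the slices to create in each consumption cluster.
func planDispatch(slicesByProvider map[string][]string, consumers []string) []dispatch {
	providers := make([]string, 0, len(slicesByProvider))
	for p := range slicesByProvider {
		providers = append(providers, p)
	}
	sort.Strings(providers) // deterministic order for the example

	var plan []dispatch
	for _, p := range providers {
		for _, slice := range slicesByProvider[p] {
			for _, c := range consumers {
				if c == p {
					continue // assumption: a cluster does not import its own slices
				}
				plan = append(plan, dispatch{
					SourceCluster: p,
					TargetCluster: c,
					Name:          p + "-" + slice,
				})
			}
		}
	}
	return plan
}

func main() {
	plan := planDispatch(
		map[string][]string{"member1": {"nginx-abc12"}},
		[]string{"member1", "member2"},
	)
	for _, d := range plan {
		fmt.Printf("%s -> %s: %s\n", d.SourceCluster, d.TargetCluster, d.Name)
	}
}
```

Prefixing the name with the source cluster keeps slices from different provision clusters from colliding when they land in the same consumption cluster.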

@RainbowMango
Member

We are going to introduce three controllers for this feature. The proposed controller names are:

  • multiclusterservice: focuses on dispatching services to member clusters
  • endpointsliceCollect: focuses on collecting endpointslices from member clusters
  • endpointsliceDispatch: focuses on dispatching endpointslices to member clusters

@Affan-7
Member

Affan-7 commented Jan 13, 2024

Hi @jwcesign, can you please assign me something to work on?

Projects
Status: In Progress
Development

No branches or pull requests

4 participants