Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validate policy names for duplicates in multi zone mode #3837

Closed
sentinelleader opened this issue Feb 15, 2022 · 11 comments
Closed

Validate policy names for duplicates in multi zone mode #3837

sentinelleader opened this issue Feb 15, 2022 · 11 comments
Assignees
Labels
kind/bug A bug kind/cleanup Cleanup/refactor an existing component/code triage/accepted The issue was reviewed and is complete enough to start working on it

Comments

@sentinelleader
Copy link
Contributor

sentinelleader commented Feb 15, 2022

What happened?

On a multizone kuma deployment model, we have kuma global CP running in Universal mode with postgres as backend and zone cp's running in kubernetes mode. It seems while creating policies via global CP API, we dont validate the existence of any existing policy of same name for a specific policy type. As a result, once the zone cp sync's and starts to create policies as CRD objects, new policies will fail as there already exists a CRD object for that policy type with the same name.

Steps to Repro:

  1. Provision a kuma global cp in universal mode backed by postgres as the backend datastore
  2. Provision a kuma zone cp in a k8s cluster with kubernetes as the backend store
  3. Create two meshes say foo and bar
  4. Create any policy type for both the meshes with same name. For example create a traffic trace policy for both the mesh with same name trace-all-traffic via kumactl. Both apply request will succeed and we can validate via kumactl get traffic-traces -m <mesh-name>
$ kumactl get traffic-traces -m foo
MESH                    NAME                          AGE
foo                     trace-all-traffic             5d

$ kumactl get traffic-traces -m bar
MESH                    NAME                          AGE
bar                     trace-all-traffic             5d
  1. Check the logs on the zone cp side and the logs will be flooded with errors failing to create the CRD object because a duplicate entry already exists.
# errors in zone cp logs
2022-02-15T16:57:25.637Z        INFO    kds.reconcile   detected changes in the resources. Sending changes to the client.       {"resourceType": "DataplaneInsight", "client": "global"}
2022-02-15T16:57:25.718Z        INFO    kds-zone        creating a new resource from upstream   {"type": "TrafficTrace", "name": "trace-all-traffic", "mesh": "bar"}
2022-02-15T16:57:25.765Z        INFO    kds-zone        error during callback received, sending NACK    {"peer-id": "global", "err": "Resource already exists: type=\"TrafficTrace\" name=\"trace-all-traffic\" mesh=\"bar\""}

Solution

Add a validation step in the API to check for the existence of a duplicate policy with same name for a specific policy type and fail the request may be with a 409 status and a clear error message indicating the existence of a duplicate entry?

@sentinelleader sentinelleader added kind/bug A bug triage/pending This issue will be looked at on the next triage meeting labels Feb 15, 2022
@lahabana
Copy link
Contributor

Triage: it would be nice to allow policies to have the same name across meshes.

2 ways to go about it:

  1. Disallow same name across mesh on universal (Breaking change)
  2. Find a way to map same name universal to k8s resource... (envoy _1_ style...)

Other better suggestions welcomed :)

@lahabana lahabana added kind/cleanup Cleanup/refactor an existing component/code triage/accepted The issue was reviewed and is complete enough to start working on it and removed triage/pending This issue will be looked at on the next triage meeting labels Feb 21, 2022
@github-actions github-actions bot added the triage/pending This issue will be looked at on the next triage meeting label Feb 21, 2022
@lahabana
Copy link
Contributor

Maybe 1 would work if it's enforced only on NEW resources until v2 where it's just disallowed

@lahabana lahabana removed the triage/pending This issue will be looked at on the next triage meeting label Feb 21, 2022
@github-actions github-actions bot added the triage/stale Inactive for some time. It will be triaged again label Mar 24, 2022
@github-actions
Copy link
Contributor

This issue was inactive for 30 days it will be reviewed in the next triage meeting and might be closed.
If you think this issue is still relevant please comment on it promptly or attend the next triage meeting.

@lahabana lahabana removed the triage/stale Inactive for some time. It will be triaged again label May 2, 2022
@github-actions
Copy link
Contributor

github-actions bot commented Jun 2, 2022

This issue was inactive for 30 days it will be reviewed in the next triage meeting and might be closed.
If you think this issue is still relevant please comment on it promptly or attend the next triage meeting.

@github-actions github-actions bot added the triage/stale Inactive for some time. It will be triaged again label Jun 2, 2022
@lahabana lahabana removed the triage/stale Inactive for some time. It will be triaged again label Jun 2, 2022
@github-actions
Copy link
Contributor

github-actions bot commented Jul 3, 2022

This issue was inactive for 30 days it will be reviewed in the next triage meeting and might be closed.
If you think this issue is still relevant please comment on it promptly or attend the next triage meeting.

@github-actions github-actions bot added the triage/stale Inactive for some time. It will be triaged again label Jul 3, 2022
@lahabana lahabana removed the triage/stale Inactive for some time. It will be triaged again label Jul 4, 2022
@github-actions
Copy link
Contributor

github-actions bot commented Oct 3, 2022

This issue was inactive for 90 days. It will be reviewed in the next triage meeting and might be closed.
If you think this issue is still relevant, please comment on it or attend the next triage meeting.

@github-actions github-actions bot added the triage/stale Inactive for some time. It will be triaged again label Oct 3, 2022
@lahabana lahabana removed the triage/stale Inactive for some time. It will be triaged again label Oct 3, 2022
@github-actions github-actions bot added the triage/stale Inactive for some time. It will be triaged again label Jan 2, 2023
@github-actions
Copy link
Contributor

github-actions bot commented Jan 2, 2023

This issue was inactive for 90 days. It will be reviewed in the next triage meeting and might be closed.
If you think this issue is still relevant, please comment on it or attend the next triage meeting.

@lahabana lahabana removed the triage/stale Inactive for some time. It will be triaged again label Jan 3, 2023
@github-actions github-actions bot added the triage/stale Inactive for some time. It will be triaged again label Apr 4, 2023
@github-actions
Copy link
Contributor

github-actions bot commented Apr 4, 2023

This issue was inactive for 90 days. It will be reviewed in the next triage meeting and might be closed.
If you think this issue is still relevant, please comment on it or attend the next triage meeting.

@lahabana lahabana removed the triage/stale Inactive for some time. It will be triaged again label Apr 4, 2023
@lahabana
Copy link
Contributor

@lukidzi do you think there's something we could do when syncing from KDS?

@bartsmykla
Copy link
Contributor

xref: #7519

@jakubdyszkiewicz
Copy link
Contributor

Done by ^

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug A bug kind/cleanup Cleanup/refactor an existing component/code triage/accepted The issue was reviewed and is complete enough to start working on it
Projects
None yet
Development

No branches or pull requests

5 participants