Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MeshTimeout inconsistent default values for spec.to depending on the presence of the inbound policy #12649

Open
lobkovilya opened this issue Jan 23, 2025 · 0 comments
Labels
kind/bug A bug triage/accepted The issue was reviewed and is complete enough to start working on it
Milestone

Comments

@lobkovilya
Copy link
Contributor

What happened?

Problem

When there are no MeshTimeout then plugin is not executed

policies, ok := proxy.Policies.Dynamic[api.MeshTimeoutType]
if !ok {
return nil
}
if len(policies.ToRules.Rules) == 0 && len(policies.FromRules.Rules) == 0 && len(policies.GatewayRules.ToRules.ByListener) == 0 && len(policies.ToRules.ResourceRules) == 0 {
return nil
}

But when there are MeshTimeouts that configure spec.from then spec.to is configured with default values.

Steps to reproduce

  1. Apply shadow MeshTimeout that configures spec.from
type: MeshTimeout
name: mt-1
mesh: meshtimeout-envoyconfig
labels:
  kuma.io/effect: shadow
spec:
  from:
    - targetRef:
        kind: Mesh
      default:
        idleTimeout: 50s
        connectionTimeout: 51s
        http:
          requestTimeout: 52s
          streamIdleTimeout: 53s
          maxStreamDuration: 54s
          maxConnectionDuration: 55s
  1. Run
kumactl inspect dataplane $name --type=config --shadow --include=diff | jq '.diff'
[
  {
    "op": "test",
    "path": "/type.googleapis.com~1envoy.config.cluster.v3.Cluster/localhost:3000/connectTimeout",
    "value": "10s"
  },
  {
    "op": "remove",
    "path": "/type.googleapis.com~1envoy.config.cluster.v3.Cluster/localhost:3000/connectTimeout",
    "value": "10s"
  },
  {
    "op": "add",
    "path": "/type.googleapis.com~1envoy.config.cluster.v3.Cluster/localhost:3000/connectTimeout",
    "value": "51s"
  },
  {
    "op": "add",
    "path": "/type.googleapis.com~1envoy.config.cluster.v3.Cluster/meshtimeout-envoyconfig_demo-client__kuma-3_msvc_3000/connectTimeout",
    "value": "5s"
  },
  {
    "op": "add",
    "path": "/type.googleapis.com~1envoy.config.cluster.v3.Cluster/meshtimeout-envoyconfig_test-server__kuma-3_msvc_80/typedExtensionProtocolOptions/envoy.extensions.upstreams.http.v3.HttpProtocolOptions/commonHttpProtocolOptions",
    "value": {
      "idleTimeout": "3600s",
      "maxConnectionDuration": "5s",
      "maxStreamDuration": "0s"
    }
  },
  {
    "op": "add",
    "path": "/type.googleapis.com~1envoy.config.cluster.v3.Cluster/meshtimeout-envoyconfig_test-server__kuma-3_msvc_80/connectTimeout",
    "value": "5s"
  },
  {
    "op": "test",
    "path": "/type.googleapis.com~1envoy.config.listener.v3.Listener/inbound:IP_REDACTED:3000/filterChains/0/filters/0/typedConfig/idleTimeout",
    "value": "7200s"
  },
  {
    "op": "remove",
    "path": "/type.googleapis.com~1envoy.config.listener.v3.Listener/inbound:IP_REDACTED:3000/filterChains/0/filters/0/typedConfig/idleTimeout",
    "value": "7200s"
  },
  {
    "op": "add",
    "path": "/type.googleapis.com~1envoy.config.listener.v3.Listener/inbound:IP_REDACTED:3000/filterChains/0/filters/0/typedConfig/idleTimeout",
    "value": "50s"
  },
  {
    "op": "add",
    "path": "/type.googleapis.com~1envoy.config.listener.v3.Listener/outbound:IP_REDACTED:3000/filterChains/0/filters/0/typedConfig/idleTimeout",
    "value": "3600s"
  },
  {
    "op": "test",
    "path": "/type.googleapis.com~1envoy.config.listener.v3.Listener/outbound:IP_REDACTED:80/filterChains/0/filters/0/typedConfig/routeConfig/virtualHosts/0/routes/0/route/timeout",
    "value": "0s"
  },
  {
    "op": "remove",
    "path": "/type.googleapis.com~1envoy.config.listener.v3.Listener/outbound:IP_REDACTED:80/filterChains/0/filters/0/typedConfig/routeConfig/virtualHosts/0/routes/0/route/timeout",
    "value": "0s"
  },
  {
    "op": "add",
    "path": "/type.googleapis.com~1envoy.config.listener.v3.Listener/outbound:IP_REDACTED:80/filterChains/0/filters/0/typedConfig/routeConfig/virtualHosts/0/routes/0/route/timeout",
    "value": "15s"
  },
  {
    "op": "add",
    "path": "/type.googleapis.com~1envoy.config.listener.v3.Listener/outbound:IP_REDACTED:80/filterChains/0/filters/0/typedConfig/routeConfig/virtualHosts/0/routes/0/route/idleTimeout",
    "value": "5s"
  },
  {
    "op": "add",
    "path": "/type.googleapis.com~1envoy.config.listener.v3.Listener/outbound:IP_REDACTED:80/filterChains/0/filters/0/typedConfig/commonHttpProtocolOptions",
    "value": {
      "idleTimeout": "0s"
    }
  },
  {
    "op": "add",
    "path": "/type.googleapis.com~1envoy.config.listener.v3.Listener/outbound:IP_REDACTED:80/filterChains/0/filters/0/typedConfig/requestHeadersTimeout",
    "value": "0s"
  }
]

Expected behaviour: no outbound listeners and clusters are affected
Actual behaviour: outbound listeners and clusters have different timeouts

Effect

Some of these values are just explicitly set to defaults that Envoy has, it's harmless:

{
 "op": "add",
 "path": "/type.googleapis.com~1envoy.config.listener.v3.Listener/outbound:IP_REDACTED:3000/filterChains/0/filters/0/typedConfig/idleTimeout",
 "value": "3600s"
}

but changes like

[
  {
   "op": "remove",
   "path": "/type.googleapis.com~1envoy.config.listener.v3.Listener/outbound:IP_REDACTED:80/filterChains/0/filters/0/typedConfig/routeConfig/virtualHosts/0/routes/0/route/timeout",
   "value": "0s"
  },
  {
   "op": "add",
   "path": "/type.googleapis.com~1envoy.config.listener.v3.Listener/outbound:IP_REDACTED:80/filterChains/0/filters/0/typedConfig/routeConfig/virtualHosts/0/routes/0/route/timeout",
   "value": "15s"
  }
]

are very dangerous and enabling 15s timeout without the user expecting it.

@lobkovilya lobkovilya added kind/bug A bug triage/pending This issue will be looked at on the next triage meeting labels Jan 23, 2025
lobkovilya added a commit that referenced this issue Jan 27, 2025
## Motivation

When we test `BuildRules` we pass MeshTimeout policies as input and
check `outbound.ResourceRules` Go struct.

When we test MeshTimeout plugin we pass `outbound.ResourceRules` Go
struct as input and check envoy configs.

We need an easy way to pass MeshTimeout policies and check envoy
configs. It's sort of an integration test that involves 2 packages. The
easiest way to do that is to add another universal e2e test with golden
files. This way we have running Kuma CP and we basically test it as a
black box.

I managed to find 2 bugs using these tests:
* #12648
* #12649

## Implementation information

We check DPP `config` (marshalled Snapshot in Kuma CP) instead of the
actual Envoy config dump for a few reasons:
* no need to add logic that waits while config is propagated on DPP
* less setup-specific values we have to reduct to have stable golden
file (only IP addresses)
* we can use `kuma.io/effect: shadow` and quickly review the `diff`
policies introduces

---------

Signed-off-by: Ilya Lobkov <[email protected]>
@lukidzi lukidzi added triage/accepted The issue was reviewed and is complete enough to start working on it and removed triage/pending This issue will be looked at on the next triage meeting labels Jan 27, 2025
@lukidzi lukidzi added this to the 2.10.x milestone Jan 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug A bug triage/accepted The issue was reviewed and is complete enough to start working on it
Projects
None yet
Development

No branches or pull requests

2 participants