HTTPRoute: allow more match clauses #3205

howardjohn · 2024-07-19T14:08:46Z

Today, an HTTPRoute is limited to 16 rules with 8 matches each. This is problematic in real-world environments.

While it's easy to split a route up (you can easily get the same behavior, and similar cognitive complexity, by having 1 rule per HTTPRoute), it's quite hard to split up matches. For instance, https://github.com/istio/istio/blob/d7a9700d5eaa4e9728274b408623670f48deadb5/samples/bookinfo/gateway-api/bookinfo-gateway.yaml#L23-L38 is an example of a route matching the exposed APIs for a frontend. This is only a small, trivial example, and it already uses 5. In our user base, we often see far beyond 8.

Splitting these up is complex for the user and may actually lead to different behaviors if implementations treat route groups differently (for instance, if we add a retry budget -- that could now be split; other core or implementation-specific policies may behave similarly, this is just an example).

The limit of 8 here seems quite small given the context of real usage, the other limits in the API (16 rules, 64 listeners), and the cost of manually working around it. Based on this, I propose we raise the limit to match Gateway listeners.

What type of PR is this?

/kind feature

Does this PR introduce a user-facing change?:

Increased the limit on HTTPRoute matches from 8 to 64.

howardjohn · 2024-07-19T16:21:19Z

/retest

robscott · 2024-07-19T17:29:23Z

@howardjohn Historically we've been against this because of the worst case scenarios this could lead to if you max out the length of every list within a list (16 rules * 8 matchers * 16 filters == 2048 possible filters in a Route). Increasing the matchers from 8 to 64 would turn that into 16384, which would obviously be problematic.

With that said, we've got CEL now, so we could likely make a case for the following:

* Allow up to 64 matchers in a Rule (that seems rather extreme, but something >8 does seem sensible)
* Write CEL validation that ensures that the maximum number of matchers allowed in a Route does not go up from its current theoretical max (128)

That would mean we're allowing significantly more flexibility here without actually increasing the max size of an individual route. If this ends up working reasonably well, we could apply the same principle elsewhere in the API.
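
To make the proposal above concrete, here is a minimal Go sketch of the two checks it combines: a per-rule cap of 64 matches and an aggregate cap of 128 (16 rules * 8 matches, the previous theoretical max). The `rule` type, the constant names, and `validateMatches` are hypothetical and exist only for illustration; the PR itself expresses the aggregate check as a CEL rule on the CRD, not as Go code.

```go
package main

import (
	"errors"
	"fmt"
)

// Limits discussed in the thread. The names are hypothetical and are
// not taken from the Gateway API source.
const (
	maxRules          = 16
	maxMatchesPerRule = 64
	maxTotalMatches   = 128 // 16 rules * 8 matches, the previous theoretical max
)

// rule is a stand-in for an HTTPRoute rule; only the match count matters here.
type rule struct {
	matches int
}

// validateMatches enforces the per-rule limit and the aggregate limit.
func validateMatches(rules []rule) error {
	if len(rules) > maxRules {
		return errors.New("too many rules")
	}
	total := 0
	for _, r := range rules {
		if r.matches > maxMatchesPerRule {
			return errors.New("too many matches in one rule")
		}
		total += r.matches
	}
	if total > maxTotalMatches {
		return errors.New("too many matches across all rules")
	}
	return nil
}

func main() {
	// Two rules of 64 matches each: exactly at the aggregate limit.
	fmt.Println(validateMatches([]rule{{matches: 64}, {matches: 64}}))
	// One more match anywhere pushes the route over the aggregate limit.
	fmt.Println(validateMatches([]rule{{matches: 64}, {matches: 64}, {matches: 1}}))
}
```

Under this scheme a route can concentrate its matches in a few large rules, but the total number of matches per route cannot exceed what the old 16 * 8 limits already allowed.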

howardjohn · 2024-07-19T17:40:51Z

For my understanding, what specifically are we trying to constrain with the overall size limit? Is it...

* Not requiring implementations to support large routes
* Staying below CEL limits
* Making sure we fail hard before etcd size limits
* Something else?

If it's the CEL limits, is the concern that it's within the complexity limits now (which already take into account the 16*64), or that it could in the future if matches get more complex?

I ask since it may impact the solution here.

robscott · 2024-07-19T18:12:52Z

> For my understanding, what specifically are we trying to constrain with the overall size limit? Is it...
>
> * Not requiring implementations to support large routes
> * Staying below CEL limits
> * Making sure we fail hard before etcd size limits

All of the above. I'd also mention that it's possible that we'll add more types of matchers inside this list, which will only increase the overall size and complexity; I don't want to get so close to a max size that we end up having no additional room to grow. I'm also strongly biased towards more, smaller routes rather than a few very large routes. In general though, when we're talking about API compatibility, this is already a GA API, and significantly loosening the validation would be problematic. I think we can make a reasonably strong case in favor of a change that allows for more flexibility but still fits within the original constraints though.

arkodg · 2024-07-19T18:14:05Z

+1 for this change. It will reduce the friction users face when migrating from existing APIs to Gateway API: they are used to authoring rules a specific way, and this is currently slowing down Gateway API adoption.

@robscott's suggestion of adding a CEL validation to measure and limit total filters for a Route is a great idea to account for the concerns about the size of the final object persisted in the API server.

cross linking previous issues raised by users in the past

howardjohn · 2024-07-19T18:40:38Z

I've pushed up a change to limit the aggregate size. LMK what you think

robscott

Thanks @howardjohn!

apis/v1/httproute_types.go

robscott · 2024-07-19T21:22:10Z

Sadly looks like we're running into kubernetes/kubernetes#120973 - CRD is invalid because rule is too complex

howardjohn · 2024-07-19T21:39:37Z

That's only from the extra restriction fwiw. It works fine when we just allow it to be unbounded, even with the theoretical 16k entries.

howardjohn · 2024-07-19T21:41:18Z

Honestly don't get how that rule can possibly be 100x over the limit..

robscott

Thanks @howardjohn! Will defer to someone else for LGTM.

/approve

robscott · 2024-07-22T18:21:52Z

apis/v1/httproute_types.go

@@ -119,6 +119,7 @@ type HTTPRouteSpec struct {
 	// +optional
 	// +kubebuilder:validation:MaxItems=16
 	// +kubebuilder:default={{matches: {{path: {type: "PathPrefix", value: "/"}}}}}
+	// +kubebuilder:validation:XValidation:message="While 16 rules and 64 matches are allowed, the total matches must be less than 128",rule="(self.size() > 0 ? self[0].matches.size() : 0) + (self.size() > 1 ? self[1].matches.size() : 0) + (self.size() > 2 ? self[2].matches.size() : 0) + (self.size() > 3 ? self[3].matches.size() : 0) + (self.size() > 4 ? self[4].matches.size() : 0) + (self.size() > 5 ? self[5].matches.size() : 0) + (self.size() > 6 ? self[6].matches.size() : 0) + (self.size() > 7 ? self[7].matches.size() : 0) + (self.size() > 8 ? self[8].matches.size() : 0) + (self.size() > 9 ? self[9].matches.size() : 0) + (self.size() > 10 ? self[10].matches.size() : 0) + (self.size() > 11 ? self[11].matches.size() : 0) + (self.size() > 12 ? self[12].matches.size() : 0) + (self.size() > 13 ? self[13].matches.size() : 0) + (self.size() > 14 ? self[14].matches.size() : 0) + (self.size() > 15 ? self[15].matches.size() : 0) <= 128"


This is definitely gross, but the least bad we can offer until CEL cost estimation allows us to use map here (Kubernetes 1.30+). Here's the CEL playground link for future readers.
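
For readers puzzling over the shape of that rule: it is a sum over the (at most 16) rules, written out term by term so that no CEL macro such as `map` is needed. A small Go sketch can produce the same expression. `unrolledRule` is a hypothetical helper written for this illustration and is not part of the PR as far as this thread shows.

```go
package main

import (
	"fmt"
	"strings"
)

// unrolledRule builds a CEL expression that sums matches.size() over the
// first n entries of self, one guarded term per index, and compares the
// sum against limit. Each term falls back to 0 when the index is out of
// range, so the expression is valid for lists shorter than n.
func unrolledRule(n, limit int) string {
	terms := make([]string, 0, n)
	for i := 0; i < n; i++ {
		terms = append(terms, fmt.Sprintf("(self.size() > %d ? self[%d].matches.size() : 0)", i, i))
	}
	return strings.Join(terms, " + ") + fmt.Sprintf(" <= %d", limit)
}

func main() {
	// 16 rules, aggregate limit of 128, as in the diff above.
	fmt.Println(unrolledRule(16, 128))
}
```

The unrolling only works because `MaxItems=16` bounds the list. If that bound ever changed, the number of terms would have to change with it.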

robscott · 2024-07-22T18:23:40Z

Actually, would really like @youngnick specifically to sign off on this, will add a hold until he's able to take a look.

/hold

gauravkghildiyal · 2024-07-22T19:20:07Z

LGTM

Deferring review to Nick as per #3205 (comment)

youngnick · 2024-07-23T06:18:19Z

As long as the overall complexity isn't increased, this makes sense to me. Nice use of CEL to make sure that's the case.

/lgtm

apis/v1/httproute_types.go

sunjayBhatia

should we do the same for GRPCRoute as well while we're making this change?

robscott · 2024-07-23T20:45:37Z

> should we do the same for GRPCRoute as well while we're making this change?

I think updating GRPCRoute to match would be great. Would approve in this PR or a follow up.

howardjohn · 2024-07-29T20:48:37Z

Added gRPC route as well

robscott · 2024-07-29T20:54:40Z

Thanks @howardjohn!

/lgtm
/approve

robscott · 2024-07-29T21:00:53Z

I think we've got enough other LGTMs on this one to remove the hold.

/hold cancel

sunjayBhatia

/lgtm

k8s-ci-robot · 2024-07-29T23:30:55Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: howardjohn, robscott, sunjayBhatia

Needs approval from an approver in each of these files:

~~OWNERS~~ [robscott]

* HTTPRoute: allow more match clauses
* Limit aggregate size
* Drop to 128 and add tests
* hacky
* Unroll the loop
* gRPCRoute and update comment
* Fix grpcroute

howardjohn · 2024-12-18T03:14:32Z

The grpc implementation is incorrect here - the limit was never raised.

k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. release-note Denotes a PR that will be considered when it comes time to generate release notes. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jul 19, 2024

k8s-ci-robot requested review from robscott and thockin July 19, 2024 14:08

k8s-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Jul 19, 2024

Limit aggregate size

fb2441b

robscott reviewed Jul 19, 2024

Drop to 128 and add tests

3174940

k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jul 19, 2024

This was referenced Jul 19, 2024

CEL Cost Calculation Overestimates Cost of Sum in CRD Validation kubernetes/kubernetes#126239

Closed

CEL: Need Test Coverage against Multiple Kubernetes Versions #3206

Closed

howardjohn added 2 commits July 22, 2024 10:52

hacky

b878ddb

Unroll the loop

0c506ff

robscott reviewed Jul 22, 2024

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 22, 2024

k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 22, 2024

k8s-ci-robot assigned youngnick Jul 23, 2024

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 23, 2024

sunjayBhatia reviewed Jul 23, 2024

sunjayBhatia reviewed Jul 23, 2024

gRPCRoute and update comment

3d68856

k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 29, 2024

k8s-ci-robot assigned robscott Jul 29, 2024

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 29, 2024

k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 29, 2024

Fix grpcroute

9c1dce5

k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 29, 2024

sunjayBhatia approved these changes Jul 29, 2024

k8s-ci-robot assigned sunjayBhatia Jul 29, 2024

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 29, 2024

k8s-ci-robot merged commit 397f8c9 into kubernetes-sigs:main Jul 29, 2024
8 checks passed

arkodg mentioned this pull request Sep 19, 2024

HTTPRouteConfig file has a limitation for number of items in the Match object envoyproxy/gateway#396

Closed

howardjohn mentioned this pull request Dec 18, 2024

The grpc implementation is incorrect here - the limit was never raised. #3511

Open
