Filter endpoint in LB based on metadata #1279
cc @zuercher too
Sure. It's going to be a few weeks until we've fully plumbed protos into the CDS/EDS/RDS aspects of Envoy; I was just trying to get a provisional notion of who might own this one. Will assign to @zuercher for now.
Our general focus will be on implementing the xDS APIs with our stuff, but this is definitely something we'll need, so we can sign up for it if nobody else needs it sooner.
It would also be nice to have fallback support, i.e. if no endpoints with the given labels are found, load balance among all endpoints. The ability to derive the metadata from request headers would also be useful.
I'm starting to look at this. I think a reasonable first step is to plumb the LbEndpoint Metadata from EDS into Upstream::HostDescription/Impl. Then it should be possible to use the data from inside the load balancers.
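For what that first step could look like, here is a minimal, self-contained sketch; it is not the actual Envoy interface, and the Metadata stand-in type, class names, and accessor are assumptions made for illustration:

```cpp
// Sketch only: expose the EDS LbEndpoint metadata on a host description so
// load balancers can read it during host selection.
#include <map>
#include <string>
#include <utility>

namespace Upstream {

// Stand-in for the data-plane-api Metadata proto: filter namespace -> key/value fields.
using Metadata = std::map<std::string, std::map<std::string, std::string>>;

class HostDescription {
public:
  virtual ~HostDescription() = default;

  // Metadata copied from the EDS LbEndpoint when the host was created.
  virtual const Metadata& metadata() const = 0;
};

class HostDescriptionImpl : public HostDescription {
public:
  explicit HostDescriptionImpl(Metadata metadata) : metadata_(std::move(metadata)) {}

  const Metadata& metadata() const override { return metadata_; }

private:
  const Metadata metadata_;
};

} // namespace Upstream
```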
Ready to take the next steps. This is a proposal for some further modifications (including a change to data-plane-api). It seems reasonable to me not to have to maintain alternate load balancer implementations (there are currently 5) that differ only by being configurable with subsets; a rough sketch of the wrapping approach follows the proto below. So I propose to add subset configuration to the Cluster message in cds.proto:

```proto
// Optionally divide the endpoints in this cluster into subsets defined by endpoint
// metadata.
message LbSubsetConfig {
  // The behavior used when no endpoint subset matches the LoadBalancerContext.
  // The options are no_fallback, any_endpoint, or default_subset. The value defaults
  // to no_fallback. If no_fallback is selected, a result equivalent to no healthy
  // hosts is reported. If any_endpoint is selected, any cluster endpoint may be
  // returned (subject to policy, health checks, etc). If default_subset is selected,
  // load balancing is performed over the hosts matching the values in default_subset.
  enum LbSubsetFallbackPolicy {
    NO_FALLBACK = 0;
    ANY_ENDPOINT = 1;
    DEFAULT_SUBSET = 2;
  }
  LbSubsetFallbackPolicy fallback_policy = 1;

  // Specifies the default subset used during fallback if fallback_policy is
  // default_subset. Each field in default_subset is compared to the matching
  // LbEndpoint.Metadata under the "envoy.lb" namespace. It is valid for no hosts to
  // match, in which case the behavior is the same as a fallback_policy of
  // no_fallback.
  google.protobuf.Struct default_subset = 2;

  // Specifications for subsets. For each entry, LbEndpoint.Metadata's "envoy.lb"
  // namespace is traversed and a subset is created for each unique combination of
  // key and value. For example:
  //   { "subset_keys": [
  //     { "keys": [ "version" ] },
  //     { "keys": [ "stage", "hardware_type" ] }
  //   ]}
  // Subsets may overlap. In the case of overlapping subsets, the first matching
  // subset is selected.
  message LbSubsetKeys {
    repeated string keys = 1;
  }
  repeated LbSubsetKeys subset_keys = 3;
}

LbSubsetConfig lb_subset_config = 22;
```
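To make the "wrap rather than fork the existing load balancers" idea concrete, here is a purely illustrative, self-contained sketch; the type names (LoadBalancer, Host, MatchCriteria) and the matching/fallback helpers are stand-ins, not the proposed Envoy code:

```cpp
// Sketch: a subset-aware LB that partitions hosts by "envoy.lb" metadata and
// delegates host selection to an ordinary (wrapped) load balancer per subset,
// so the five existing LB implementations need no subset-specific variants.
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

using Host = std::string;                                  // stand-in for a host handle
using MatchCriteria = std::map<std::string, std::string>;  // flattened route metadata

class LoadBalancer {
public:
  virtual ~LoadBalancer() = default;
  virtual Host* chooseHost(const MatchCriteria& criteria) = 0;
};

enum class FallbackPolicy { NoFallback, AnyEndpoint, DefaultSubset };

class SubsetLoadBalancer : public LoadBalancer {
public:
  Host* chooseHost(const MatchCriteria& criteria) override {
    // Find the first subset whose key/value pairs are all present in the route's criteria.
    for (auto& [subset_values, lb] : subsets_) {
      if (subsetMatches(subset_values, criteria)) {
        return lb->chooseHost(criteria);  // delegate to the wrapped LB for this subset
      }
    }
    // No subset matched: apply the configured fallback policy.
    switch (fallback_policy_) {
    case FallbackPolicy::AnyEndpoint:
      return any_endpoint_lb_->chooseHost(criteria);
    case FallbackPolicy::DefaultSubset:
      return default_subset_lb_ ? default_subset_lb_->chooseHost(criteria) : nullptr;
    case FallbackPolicy::NoFallback:
    default:
      return nullptr;  // equivalent to "no healthy hosts"
    }
  }

private:
  // A subset matches when every one of its key/value pairs appears in the criteria,
  // so the route may carry more fields than the subset defines.
  static bool subsetMatches(const MatchCriteria& subset, const MatchCriteria& criteria) {
    for (const auto& [key, value] : subset) {
      auto it = criteria.find(key);
      if (it == criteria.end() || it->second != value) {
        return false;
      }
    }
    return true;
  }

  FallbackPolicy fallback_policy_{FallbackPolicy::NoFallback};
  std::vector<std::pair<MatchCriteria, std::unique_ptr<LoadBalancer>>> subsets_;
  std::unique_ptr<LoadBalancer> any_endpoint_lb_;    // wraps all cluster hosts
  std::unique_ptr<LoadBalancer> default_subset_lb_;  // wraps hosts matching default_subset
};
```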
We've discussed making LBs pluggable like filters before, @rshriram. We could add an opaque config for these so we don't need to keep cramming stuff into CDS.
I'm not entirely sure I follow. Is this just adding
to the Cluster message and placing the subset config stuff in a separate file?
@htuch I agree we should allow for pluggable LBs like we are now allowing for filters, stats, etc., but in this case I see no reason not to add the subsetting support directly, mainly because, as @zuercher points out, it can be done in a way that does not change existing behavior. @zuercher ^ sounds reasonable/useful to me. My only concern is actually perf. It's not great if we have to compare the struct from the route to the struct from the subset via reflection during LB. I think we are going to have to pre-parse somehow, which we should be able to do when the route loads as well as when the LB refreshes. I'm not exactly sure how this would work though. @htuch any thoughts on how to best do this?
I expect that the structs will usually be flat (e.g., the field values will be string/bool/number and not nested structs or lists). I'm fairly new to protobufs, but it looks like it's possible to treat the struct as a map of field names to values, where each value has a kind that determines how to extract the typed value. Nested lists and structs would be more expensive since you have to traverse the hierarchy. That said, I think it makes sense to compute hashes to avoid doing many comparisons. And just to clarify, I had been thinking that there might be more fields in the structs from the route than in the subset struct in the LB. For example, the route might specify {"version":"123", "stage":"prod"} where the subset struct only includes {"version":"123"}. In this case I'd expect the subset to be used. So the hashing would be one level down, and there'd likely be a new type used to represent the route and subset structs to handle this.
Yeah, I think we're going to have to invent our own thing here; protobuf's internal representation of Struct is verbose. I agree that flat hashing is what we want for the common case. One idea is that we could build an index hashtable for each LB subset query we encounter as well, so we only pay this penalty once.
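A rough sketch of the pre-parse-and-hash idea under discussion, assuming only flat, top-level string fields are considered; the FlatMetadata type, hash combiner, and SubsetIndex alias are illustrative, not Envoy's implementation:

```cpp
// Sketch: flatten the "envoy.lb" struct into key/value pairs once, when the route
// or cluster update is processed, and precompute a combined hash so per-request
// subset lookups are a hashtable probe instead of repeated struct comparisons.
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <unordered_map>

// Flattened metadata: only top-level string fields in this sketch; nested
// structs/lists would need recursive traversal and are assumed to be rare.
using FlatMetadata = std::map<std::string, std::string>;  // ordered for stable hashing

// Combine the hashes of each key/value pair; the ordered map gives a deterministic order.
std::size_t hashMetadata(const FlatMetadata& metadata) {
  std::size_t seed = 0;
  std::hash<std::string> hasher;
  for (const auto& [key, value] : metadata) {
    seed ^= hasher(key) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
    seed ^= hasher(value) + 0x9e3779b9 + (seed << 6) + (seed >> 2);
  }
  return seed;
}

struct FlatMetadataHash {
  std::size_t operator()(const FlatMetadata& m) const { return hashMetadata(m); }
};

// Precomputed index from match criteria to the subset (e.g., its wrapped LB).
// The expensive flattening/hashing happens when routes load and when the LB
// refreshes, not on every host selection.
template <typename SubsetT>
using SubsetIndex = std::unordered_map<FlatMetadata, SubsetT, FlatMetadataHash>;
```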
Ok. I'll put up a review for the data-plane-api change and then get started on the actual implementation.
Provides a configuration point for a load balancer that can divide endpoints into subsets addressable by route metadata. See discussion in envoyproxy/envoy#1279. Signed-off-by: Stephan Zuercher <[email protected]>
Fixes #1279, and is the final part of splitting #1735. Provides the actual load balancer for subsets with basic stats for the number of subsets active, created, removed, selected (used for host selection), and fallbacks (the fallback path was used). Signed-off-by: Stephan Zuercher <[email protected]>
As decided in https://github.com/lyft/envoy-api/issues/81, we added metadata support to EDS and matching in routes to the API in envoyproxy/data-plane-api@abcd5ba. This issue will track the implementation work around endpoint subsetting in the LB stage.
@mccv @9len @rshriram