[exporter/loadbalancing] feat(lb): Introduce the ability to load balance on composite keys in lb #36567

Open
wants to merge 3 commits into
base: main
Changes from 2 commits
@@ -0,0 +1,4 @@
change_type: enhancement
component: exporter/loadbalancing
note: Add support for routing with composite keys
issues: [35320]
4 changes: 3 additions & 1 deletion exporter/loadbalancingexporter/README.md
@@ -111,11 +111,13 @@ Refer to [config.yaml](./testdata/config.yaml) for detailed examples on using th
* This resolver currently returns a maximum of 100 hosts.
* `TODO`: Feature request [29771](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/29771) aims to cover the pagination for this scenario
* The `routing_key` property is used to specify how to route values (spans or metrics) to exporters based on different parameters. This functionality is currently enabled only for `trace` and `metric` pipeline types. It supports one of the following values:
* `service`: Routes values based on their service name. This is useful when using processors like the span metrics, so all spans for each service are sent to consistent collector instances for metric collection. Otherwise, metrics for the same services are sent to different collectors, making aggregations inaccurate.
* `service`: Routes values based on their service name. This is useful when using processors like the span metrics, so all spans for each service are sent to consistent collector instances for metric collection. Otherwise, metrics for the same services are sent to different collectors, making aggregations inaccurate. In addition to resource / span attributes, `span.kind`, `span.name` (the top level properties of a span) are also supported.
* `attributes`: Routes based on values in the attributes of the traces. This is similar to service, but useful for situations in which a single service overwhelms any given instance of the collector, and should be split over multiple collectors.
* `traceID`: Routes spans based on their `traceID`. Invalid for metrics.
* `metric`: Routes metrics based on their metric name. Invalid for spans.
* `streamID`: Routes metrics based on their datapoint streamID, which is the unique hash of all its attributes, plus the attributes and identifying information of its resource, scope, and metric data.
* The loadbalancing exporter supports a set of standard [queuing, retry and timeout settings](https://github.com/open-telemetry/opentelemetry-collector/blob/main/exporter/exporterhelper/README.md), but they are disabled by default to maintain compatibility.
* The `routing_attributes` property is used to list the attributes that should be used if the `routing_key` is `attributes`.

Simple example

16 changes: 13 additions & 3 deletions exporter/loadbalancingexporter/config.go
@@ -20,6 +20,7 @@ const (
metricNameRouting
resourceRouting
streamIDRouting
attrRouting
)

const (
@@ -28,6 +29,7 @@ const (
metricNameRoutingStr = "metric"
resourceRoutingStr = "resource"
streamIDRoutingStr = "streamID"
attrRoutingStr = "attributes"
)

// Config defines configuration for the exporter.
@@ -36,9 +38,17 @@ type Config struct {
configretry.BackOffConfig `mapstructure:"retry_on_failure"`
QueueSettings exporterhelper.QueueConfig `mapstructure:"sending_queue"`

Protocol Protocol `mapstructure:"protocol"`
Resolver ResolverSettings `mapstructure:"resolver"`
RoutingKey string `mapstructure:"routing_key"`
Protocol Protocol `mapstructure:"protocol"`
Resolver ResolverSettings `mapstructure:"resolver"`

// RoutingKey is a single routing key value
RoutingKey string `mapstructure:"routing_key"`

// RoutingAttributes lists the attributes that are combined into a composite routing key. It is only used when
// RoutingKey is "attributes".
//
// Supports all attributes available (resource, scope and span), as well as the pseudo attributes "span.kind" and
// "span.name".
RoutingAttributes []string `mapstructure:"routing_attributes"`
}
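
Review note: as the diff stands, nothing appears to reject `routing_key: attributes` with an empty `routing_attributes` list. In that case every batch would get the key `""` and hash to the same position. A minimal sketch of a guard follows; `routingConfig`, `validateRouting` and the error text are hypothetical names for illustration, not part of this change or of the collector API.

```go
package main

import (
	"errors"
	"fmt"
)

// routingConfig is a hypothetical stand-in for the two routing fields of Config.
type routingConfig struct {
	RoutingKey        string
	RoutingAttributes []string
}

// errNoRoutingAttributes is an illustrative error, not one defined by the exporter.
var errNoRoutingAttributes = errors.New(`routing_attributes must be set when routing_key is "attributes"`)

// validateRouting (hypothetical) rejects attribute routing without any attributes,
// because an empty list would produce the same empty key for every batch.
func validateRouting(c routingConfig) error {
	if c.RoutingKey == "attributes" && len(c.RoutingAttributes) == 0 {
		return errNoRoutingAttributes
	}
	return nil
}

func main() {
	bad := routingConfig{RoutingKey: "attributes"}
	good := routingConfig{
		RoutingKey:        "attributes",
		RoutingAttributes: []string{"service.name", "span.kind"},
	}
	fmt.Println(validateRouting(bad) != nil)  // true: rejected
	fmt.Println(validateRouting(good) == nil) // true: accepted
}
```

Whether such a check belongs in config validation or in `newTracesExporter` is a design choice left to the author.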

// Protocol holds the individual protocol-specific settings. Only OTLP is supported at the moment.
94 changes: 83 additions & 11 deletions exporter/loadbalancingexporter/trace_exporter.go
@@ -7,6 +7,7 @@ import (
"context"
"errors"
"fmt"
"strings"
"sync"
"time"

@@ -22,13 +23,19 @@ import (
"github.com/open-telemetry/opentelemetry-collector-contrib/pkg/batchpersignal"
)

const (
pseudoAttrSpanName = "span.name"
pseudoAttrSpanKind = "span.kind"
)

var _ exporter.Traces = (*traceExporterImp)(nil)

type exporterTraces map[*wrappedExporter]ptrace.Traces

type traceExporterImp struct {
loadBalancer *loadBalancer
routingKey routingKey
routingAttrs []string

stopped bool
shutdownWg sync.WaitGroup
@@ -64,6 +71,9 @@ func newTracesExporter(params exporter.Settings, cfg component.Config) (*traceEx
switch cfg.(*Config).RoutingKey {
case svcRoutingStr:
traceExporter.routingKey = svcRouting
case attrRoutingStr:
traceExporter.routingKey = attrRouting
traceExporter.routingAttrs = cfg.(*Config).RoutingAttributes
case traceIDRoutingStr, "":
default:
return nil, fmt.Errorf("unsupported routing_key: %s", cfg.(*Config).RoutingKey)
@@ -92,7 +102,7 @@ func (e *traceExporterImp) ConsumeTraces(ctx context.Context, td ptrace.Traces)
exporterSegregatedTraces := make(exporterTraces)
endpoints := make(map[*wrappedExporter]string)
for _, batch := range batches {
routingID, err := routingIdentifiersFromTraces(batch, e.routingKey)
routingID, err := routingIdentifiersFromTraces(batch, e.routingKey, e.routingAttrs)
if err != nil {
return err
}
@@ -133,7 +143,15 @@ func (e *traceExporterImp) ConsumeTraces(ctx context.Context, td ptrace.Traces)
return errs
}

func routingIdentifiersFromTraces(td ptrace.Traces, key routingKey) (map[string]bool, error) {
// routingIdentifiersFromTraces reads the traces and derives the identifiers used to pick a position on the
// hash ring. rType selects the routing strategy; attrs lists the attributes to combine and is only used when
// rType is attrRouting.
//
// traceIDRouting, svcRouting and attrRouting are supported. For attrRouting, any attribute, as well as the "pseudo"
// attributes span.name and span.kind, can be used.
//
// In practice, this assumes that ptrace.Traces includes only one trace (one "trace tree"): only the first scope and
// the first span are inspected.
func routingIdentifiersFromTraces(td ptrace.Traces, rType routingKey, attrs []string) (map[string]bool, error) {
ids := make(map[string]bool)
rs := td.ResourceSpans()
if rs.Len() == 0 {
@@ -149,18 +167,72 @@ func routingIdentifiersFromTraces(td ptrace.Traces, key routingKey) (map[string]
if spans.Len() == 0 {
return nil, errors.New("empty spans")
}
// Determine how the key should be populated.
switch rType {
case traceIDRouting:
// The simple case is the TraceID routing. In this case, we just use the raw bytes of the Trace ID as the key.
tid := spans.At(0).TraceID()
ids[string(tid[:])] = true

if key == svcRouting {
for i := 0; i < rs.Len(); i++ {
svc, ok := rs.At(i).Resource().Attributes().Get("service.name")
if !ok {
return nil, errors.New("unable to get service name")
return ids, nil
case svcRouting:
// Service Name is still handled as an "attribute router", it's just expressed as a "pseudo attribute"
attrs = []string{"service.name"}
case attrRouting:
// Use the attributes provided in the configuration as-is.
default:
return nil, fmt.Errorf("unsupported routing_key: %d", rType)
}

// Composite the attributes together as a key.
for i := 0; i < rs.Len(); i++ {
// Writes to rKey never return an error. See
// 1. https://pkg.go.dev/strings#Builder.WriteString
// 2. https://stackoverflow.com/a/70388629
var rKey strings.Builder

for _, a := range attrs {
// resource attributes
rAttr, ok := rs.At(i).Resource().Attributes().Get(a)
if ok {
rKey.WriteString(rAttr.Str())
continue
}

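// scope attributes (ils is the legacy name, "instrumentation library spans");
// use the scopes of the current resource, not always those of the first one
ils := rs.At(i).ScopeSpans()
iAttr, ok := ils.At(0).Scope().Attributes().Get(a)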
if ok {
rKey.WriteString(iAttr.Str())
continue
}

// the lowest level span (or what engineers regularly interact with)
spans := ils.At(0).Spans()

if a == pseudoAttrSpanKind {
rKey.WriteString(spans.At(0).Kind().String())

continue
}

if a == pseudoAttrSpanName {
rKey.WriteString(spans.At(0).Name())

continue
}

sAttr, ok := spans.At(0).Attributes().Get(a)
if ok {
rKey.WriteString(sAttr.Str())
continue
}
ids[svc.Str()] = true
}
return ids, nil

// No matter what, there will be a key here (even if that key is "").
ids[rKey.String()] = true
}
tid := spans.At(0).TraceID()
ids[string(tid[:])] = true

return ids, nil
}
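
Review note: the lookup order in the new loop (resource attributes, then scope attributes, then the pseudo attributes, then span attributes) can be illustrated without the pdata types. The sketch below is a simplified model under stated assumptions: `span` and `compositeKey` are hypothetical names, attribute values are plain strings, and only one span is considered.

```go
package main

import (
	"fmt"
	"strings"
)

// span is a simplified, hypothetical stand-in for the pdata types:
// three attribute maps plus the span's name and kind.
type span struct {
	Resource map[string]string
	Scope    map[string]string
	Attrs    map[string]string
	Name     string
	Kind     string
}

// compositeKey concatenates the value of each requested attribute, in order.
// Lookup order per attribute: resource, scope, pseudo attributes, span attributes.
// Attributes that are not found contribute nothing to the key.
func compositeKey(s span, attrs []string) string {
	var b strings.Builder
	for _, a := range attrs {
		if v, ok := s.Resource[a]; ok {
			b.WriteString(v)
			continue
		}
		if v, ok := s.Scope[a]; ok {
			b.WriteString(v)
			continue
		}
		switch a {
		case "span.kind":
			b.WriteString(s.Kind)
			continue
		case "span.name":
			b.WriteString(s.Name)
			continue
		}
		if v, ok := s.Attrs[a]; ok {
			b.WriteString(v)
		}
	}
	return b.String()
}

func main() {
	s := span{
		Resource: map[string]string{"service.name": "checkout"},
		Attrs:    map[string]string{"http.route": "/pay"},
		Name:     "POST /pay",
		Kind:     "Server",
	}
	fmt.Println(compositeKey(s, []string{"service.name", "span.kind", "http.route"})) // checkoutServer/pay
	fmt.Println(compositeKey(s, []string{"service.name", "missing"}))                 // checkout
}
```

Because the values are concatenated without a separator, the pairs (`"ab"`, `"c"`) and (`"a"`, `"bc"`) produce the same key. For load balancing this only means the two combinations share a backend, but a separator could be added if distinct keys matter.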