newSizedChannel does not properly close on exporter shutdown #11401

Closed
Tarmander opened this issue Oct 8, 2024 · 2 comments · Fixed by #11745
Labels
bug Something isn't working

Comments

@Tarmander

Describe the bug
This is related to an issue with the exporter/loadbalancingexporter. The k8s resolver would repeatedly Shutdown() and create two new boundedMemoryQueues every time the endpoints were "updated" (roughly every 3 minutes). This behavior went unnoticed until the memory_limiter processor started to drop spans.

After investigating with the pprof extension, we realized we had an unbounded memory leak: each time an exporter and its queue were shut down, the underlying channel was not GC'd. We kept allocating a new channel on every update until the pod ran OOM.

[pprof screenshot]
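
A minimal sketch of the failure mode (illustrative only; the names below are made up and this is not collector code): Go only frees a channel's buffer once nothing references the queue that owns it, so any long-lived closure created at startup that still captures the queue keeps the buffer alive even after Shutdown() returns.

package leaksketch

// sizedQueue is a made-up stand-in for the collector's bounded memory queue.
type sizedQueue struct {
	items chan int
}

// registry stands in for a long-lived list of callbacks that outlives
// individual exporters.
var registry []func() int

func newSizedQueue(capacity int) *sizedQueue {
	q := &sizedQueue{items: make(chan int, capacity)}
	// This closure captures q, so q.items stays reachable (and its buffer
	// stays allocated) until the callback is removed from the registry.
	registry = append(registry, func() int { return len(q.items) })
	return q
}

func (q *sizedQueue) Shutdown() {
	// Closing the channel stops producers and consumers, but it does not
	// free the buffer while q is still referenced from the registry.
	close(q.items)
}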

Steps to reproduce
Configure the k8s resolver to point to a service with many endpoints (the more endpoints, the faster memory grows). Run with the pprof extension to watch the memory held by newSizedChannel increase over time.
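
For example, with the pprof extension listening on localhost:1777 as in the config below, a heap profile can be pulled and summarized with the standard Go tooling and scanned for newSizedChannel allocations:

go tool pprof -top http://localhost:1777/debug/pprof/heap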

What did you expect to see?
All exporters, queues, and channels to be properly shut down and GC'd.

What did you see instead?
Channels in existing exporter queues were never released, and they eventually exhausted the pod's memory.

What version did you use?
v0.105.0

What config did you use?

receivers:
  otlp:
    protocols:
      grpc: { }
      http: { }
processors:
  batch:
    timeout: 1s
  memory_limiter:
    check_interval: 5s
    limit_percentage: 80
    spike_limit_percentage: 20
exporters:
  loadbalancing:
    protocol:
      otlp:
        tls:
          insecure: true
        sending_queue:
          queue_size: 100000
          num_consumers: 25
    resolver:
      k8s:
        service: opentelemetry-global-gateway-collector-headless.opentelemetry-global-collector
extensions:
  health_check:
    endpoint: 0.0.0.0:13133
  zpages:
    endpoint: 0.0.0.0:55679
  pprof:
    endpoint: localhost:1777
service:
  extensions: [ health_check, zpages, pprof ]
  telemetry:
    logs:
      level: info
      encoding: json
    metrics:
      address: 0.0.0.0:8888
  pipelines:
    traces:
      receivers: [ otlp ]
      processors: [ memory_limiter, batch ]
      exporters: [ loadbalancing ]

Environment
OS: Ubuntu 22.04
Compiler: go1.22.6
Kubernetes version: apiVersion: opentelemetry.io/v1beta1

@Tarmander added the bug label Oct 8, 2024
@madaraszg-tulip
Contributor

I have noticed a very similar issue using the loadbalancing exporter in Grafana Alloy, which uses this component. Here are some Pyroscope screenshots:

[Pyroscope screenshots]

@madaraszg-tulip
Contributor

madaraszg-tulip commented Nov 25, 2024

https://github.com/open-telemetry/opentelemetry-collector/blob/v0.114.0/exporter/exporterhelper/internal/metadata/generated_telemetry.go#L58-L92

	_, err = builder.meter.RegisterCallback(func(_ context.Context, o metric.Observer) error {
		o.ObserveInt64(builder.ExporterQueueCapacity, cb(), opts...)
		return nil
	}, builder.ExporterQueueCapacity)

The return values of the RegisterCallback calls are ignored. They hold the registrations needed to unregister these callbacks when the exporter is shut down. I assume they should be returned to the caller, exporter/exporterhelper/internal/queue_sender's Start(), to be used later in Shutdown().
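
A minimal sketch of that pattern (assumed usage of the go.opentelemetry.io/otel/metric API, not the actual collector code; the type and metric names are placeholders): keep the metric.Registration returned by RegisterCallback and call Unregister() in Shutdown(), so the callback no longer pins the queue and its sized channel.

package fixsketch

import (
	"context"

	"go.opentelemetry.io/otel/metric"
)

// sizedQueue is a placeholder for the exporterhelper queue; only Size() is
// needed for this example.
type sizedQueue interface {
	Size() int64
}

// queueSender is a hypothetical stand-in for exporterhelper's queue sender.
type queueSender struct {
	queue sizedQueue
	reg   metric.Registration
}

func (qs *queueSender) Start(_ context.Context, meter metric.Meter) error {
	sizeGauge, err := meter.Int64ObservableGauge("exporter_queue_size")
	if err != nil {
		return err
	}
	// Keep the Registration instead of discarding it.
	qs.reg, err = meter.RegisterCallback(func(_ context.Context, o metric.Observer) error {
		o.ObserveInt64(sizeGauge, qs.queue.Size())
		return nil
	}, sizeGauge)
	return err
}

func (qs *queueSender) Shutdown(_ context.Context) error {
	// Unregistering drops the callback's reference to the queue, so the
	// queue's channel becomes unreachable and can be garbage collected.
	if qs.reg != nil {
		return qs.reg.Unregister()
	}
	return nil
}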

github-merge-queue bot pushed a commit that referenced this issue Dec 6, 2024
#### Description

Fix memory leak at exporter shutdown. At startup the exporter creates metric callbacks that hold references to the exporter queue; these need to be unregistered at shutdown.

#### Link to tracking issue
Fixes #11401

---------

Co-authored-by: Alex Boten <[email protected]>
HongChenTW pushed a commit to HongChenTW/opentelemetry-collector that referenced this issue Dec 19, 2024