Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[telemetrySettting] Create sampled Logger #8134

Merged
merged 35 commits into from
Sep 29, 2023

Conversation

antonjim-te
Copy link
Contributor

@antonjim-te antonjim-te commented Jul 26, 2023

Issue:

In one of the use cases which we are working on, we have identified that sampledLogger mechanism used in the Opentelemetry exporters (otlpexporter and otlphttpexporter), could become a memory bottleneck, when the number of exports in the same pipeline is high (we are talking about 100s of instances of exporters running on the same machine).

The sampledLogger is created inside the exporterhelper module and the sampling is always activated unless the collector is in debug mode:

func newQueuedRetrySender(id component.ID, signal component.DataType, qCfg QueueSettings, rCfg RetrySettings, reqUnmarshaler internal.RequestUnmarshaler, nextSender requestSender, logger *zap.Logger) *queuedRetrySender {
	retryStopCh := make(chan struct{})
	sampledLogger := createSampledLogger(logger)
func createSampledLogger(logger *zap.Logger) *zap.Logger {
	if logger.Core().Enabled(zapcore.DebugLevel) {
		// Debugging is enabled. Don't do any sampling.
		return logger
	}

	// Create a logger that samples all messages to 1 per 10 seconds initially,
	// and 1/100 of messages after that.
	opts := zap.WrapCore(func(core zapcore.Core) zapcore.Core {
		return zapcore.NewSamplerWithOptions(
			core,
			10*time.Second,
			1,
			100,
		)
	})
	return logger.WithOptions(opts)
}

Investigation:

The high memory usage has been discovered using https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/extension/pprofextension

  • In the first scenario, default configuration with the sampledLogger active, based on the profiled data we could observe that sampling mechanism was responsible for allocating aprox. 75% of the memory of the running collector:
    image

  • In the second scenario, with the same number of running exporters, but with the sampledLogger mechanism disabled (using debug level configuration check configuration below), we were able to observe a much smaller memory footprint:

    service:
      telemetry:
        logs:
          level: debug

image

Given that the logging sampling mechanism could become a bottleneck in certain scenarios (our case), we would like to propose a contribution, which potentially can improve the memory footprint of the exporters in certain use cases (potentially making it more scalable)

Proposal:

Create an extra sampled logger in the telemetry configuration used to avoid flooding the logs with messages that are repeated frequently

The sampled logger is configured by LogsSamplingConfig

service:
    telemetry:
        logs:
            sampling:

Default configuration for the sampled logger

service:
    telemetry:
        logs:
            sampling:
	        initial: 1
	        thereafter: 100
	        tick: 10s

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Jul 26, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

exporter/exporterhelper/README.md Outdated Show resolved Hide resolved
exporter/exporterhelper/common.go Outdated Show resolved Hide resolved
exporter/exporterhelper/common.go Outdated Show resolved Hide resolved
exporter/exporterhelper/queued_retry.go Outdated Show resolved Hide resolved
exporter/exporterhelper/queued_retry_test.go Outdated Show resolved Hide resolved
exporter/otlphttpexporter/README.md Outdated Show resolved Hide resolved
@antonjim-te antonjim-te requested a review from xvilaca-te July 26, 2023 13:55
@antonjim-te
Copy link
Contributor Author

@antonjim-te antonjim-te marked this pull request as ready for review July 27, 2023 15:05
@antonjim-te antonjim-te requested review from a team, bogdandrutu and xvilaca-te and removed request for xvilaca-te July 27, 2023 15:05
exporter/exporterhelper/common.go Outdated Show resolved Hide resolved
@antonjim-te
Copy link
Contributor Author

Hi @bogdandrutu, could you take a look at the PR please?
Thank you.

@github-actions
Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

Copy link
Member

@dmitryax dmitryax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, @antonjim-te. It looks good to me. Just minor comments

.chloggen/SampledLoggerTelemetry.yaml Outdated Show resolved Hide resolved
otelcol/unmarshaler.go Outdated Show resolved Hide resolved
service/telemetry/telemetry_test.go Show resolved Hide resolved
service/telemetry/telemetry.go Outdated Show resolved Hide resolved
service/telemetry/config.go Outdated Show resolved Hide resolved
service/telemetry/config.go Outdated Show resolved Hide resolved
.chloggen/SampledLoggerTelemetry.yaml Outdated Show resolved Hide resolved
@dmitryax
Copy link
Member

I'd like to have another look from any of @open-telemetry/collector-approvers before merging

@dmitryax dmitryax merged commit 7c5ecef into open-telemetry:main Sep 29, 2023
@github-actions github-actions bot added this to the next release milestone Sep 29, 2023
dmitryax added a commit to open-telemetry/opentelemetry-collector-contrib that referenced this pull request Oct 5, 2023
dmitryax added a commit to open-telemetry/opentelemetry-collector-contrib that referenced this pull request Oct 5, 2023
jmsnll pushed a commit to jmsnll/opentelemetry-collector-contrib that referenced this pull request Nov 12, 2023
jmsnll pushed a commit to jmsnll/opentelemetry-collector-contrib that referenced this pull request Nov 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants