-
Notifications
You must be signed in to change notification settings - Fork 9.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
server: Add sampling rate to distributed tracing #13248
Conversation
server/embed/config_tracing.go
Outdated
|
||
const maxSamplingRatePerMillion = 1000000 | ||
|
||
func (e *Etcd) setupTracingExporter(ctx context.Context) (exporter tracesdk.SpanExporter, options []otelgrpc.Option, err error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can change this to method that takes ServerConfig as argument. This function doesn't need anything from embed.Etcd than e.cfg. As so we should try to keep is separate from embed.Etcd Law of Demeter and make them separate.
Best even move it to separate package server/embed/tracing
this way we can avoid this function reading embed.Etcd private fields and functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I followed the pattern we have with logging here, where we have config_logging.go. But will change that takes config instead, good suggestion!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, good point. I also prefer consistent quality, instead of having each module written differently.
But this is a good candiate for a follow up PR. I would be happy to help with that.
server/embed/config_tracing.go
Outdated
if samplingRate == 0 { | ||
return sampler, nil | ||
} | ||
if samplingRate < 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like flag value validation, please move it to embed.Config.Validate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion, for both this and the below check right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so. Best would be to have a separate configuration struct for tracing that is included in embed.Config. This way we can have a validate function in tracing package that is invoked by embed.Config.Validate (common pattern in K8s code).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah agreed its bit different per configuration here. The cfg.setupLogging seems to be called in validate as well. But agreed we can do it down the road in similar way, especially since config is really growing and lacks testing IMO.
@serathius let me know what you think, I addressed the suggestions, thanks. Will move some of unit tests cases if this is the approach you thought of. |
server/embed/config_tracing.go
Outdated
return exporter, options, err | ||
} | ||
|
||
func determineSampler(samplingRate int) (tracesdk.Sampler, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This always returns nil for error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice catch, removed error returns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good aside one small nit.
Thanks, separated the unit tests for new function and fixed the nit. Please have another look, thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
few optional notes, otherwise LGTM
4214fc8
to
88661d1
Compare
@hexfusion addressed your suggestions, please have another look, thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one very minor nit then lets merge 👍
ExperimentalDistributedTracingSamplingRatePerMillion is the number of samples to collect per million spans. Defaults to 0.
88661d1
to
810f489
Compare
Proxy failure appears to be a flake and passed on the previous run (txt change diff) |
In the initial PR we mentioned we should also add sampling rate, as this will allow users to control the overhead by being able to reduce the amount of traces sent.
This PR adds
ExperimentalDistributedTracingSamplingRatePerMillion
, which is the number of samples to collect per million spans.It defaults to 0, I wasn't sure if we should default to sample all instead of 0, but this is how the Kubernetes implementation did it so I tried to follow the same principles as there.