Can we ignore spans for specific operations? #814
Comments
Unfortunately, this isn't supported directly, but there are ways around it. You could add a new preProcessor (https://github.com/jaegertracing/jaeger/blob/master/cmd/collector/app/span_processor.go#L32) that filters those spans via string matching. The alternative is to wait for adaptive sampling, which will allow you to customize sampling rates per operation rather than per service. I'll try to get adaptive sampling open-sourced soon (maybe a month). |
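The string-matching filter suggested above can be sketched as follows. This is only an illustration of the logic: the Jaeger collector is written in Go, and the `Span` class, `dropIgnored` method, and the operation names used here are hypothetical, not part of any Jaeger API.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/** Illustration only: the real collector pre-processor is Go code, not this class. */
final class SpanFilterSketch {

    /** Hypothetical minimal span model; real spans carry many more fields. */
    static final class Span {
        final String service;
        final String operation;

        Span(String service, String operation) {
            this.service = service;
            this.operation = operation;
        }

        @Override
        public String toString() {
            return service + "::" + operation;
        }
    }

    /** Returns only the spans whose operation name is not in the ignore set. */
    static List<Span> dropIgnored(List<Span> spans, Set<String> ignoredOperations) {
        return spans.stream()
                .filter(span -> !ignoredOperations.contains(span.operation))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Span> batch = List.of(
                new Span("orders", "/external_ping"),
                new Span("orders", "GET /orders"),
                new Span("orders", "/health"));
        // Health-check spans are dropped before the batch would be stored.
        System.out.println(dropIgnored(batch, Set.of("/external_ping", "/health")));
    }
}
```

Exact matching on the operation name keeps the sketch simple; a prefix or regular-expression match would be an equally plausible choice for such a filter.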
The url |
@yurishkuro we are using an internal health check bundle with the customized path "/external_ping". When I change the health check to ping /health, it still creates a span (see image). Our services are Dropwizard, if that matters. The adaptive sampling per operation sounds like exactly what we want so we will just wait for that. Thanks |
So it's already possible to configure sampling strategy via static config for jaeger-query service and /health endpoint to be 0. But I think it's a roundabout way to go about it, we should simply fix the code to not enable tracing on /health in the first place. I am going to reopen this as an enhancement request. |
Wait so there's already a way to update our --sampling.strategies-file so that we can filter out a specific endpoint? Is there documentation for this? |
Correction: the static config is still per service level, but it shouldn't be hard to make it per operation. |
Hi, @yurishkuro @black-adder. I am trying to sample Jaeger at the collector level. I followed the information given in the Jaeger documentation: https://www.jaegertracing.io/docs/1.18/sampling/#collector-sampling-configuration
containers:
  - name: jaeger
    image: 'jaegertracing/all-in-one:1.19.2'
    args:
      - '--query.ui-config=/etc/config/ui.json'
      - '--sampling.strategies-file=/etc/jaeger/sampling/sampling.json'
Below is my sampling.json file:
{
"service_strategies": [
{
"service": "istio",
"type": "probabilistic",
"param": 0,
"operation_strategies": [
{
"operation": "postJson",
"type": "probabilistic",
"param": 0
},
{
"operation": "/api/topic/post/xml",
"type": "probabilistic",
"param": 0
}
]
},
{
"service": "bar",
"type": "ratelimiting",
"param": 5
}
],
"default_strategy": {
"type": "probabilistic",
"param": 0.5,
"operation_strategies": [
{
"operation": "/health",
"type": "probabilistic",
"param": 0.0
},
{
"operation": "/metrics",
"type": "probabilistic",
"param": 0.0
}
]
}
}
I tried different ways to restrict certain operations of the service "istio", like:
Below is how I am creating the Tracer Bean:
@Bean
public io.opentracing.Tracer jaegerTracer() throws OGSGeneralException
{
final Configuration.SamplerConfiguration samplerConfig = Configuration.SamplerConfiguration.fromEnv()
.withType("const").withParam(1);
final Configuration.ReporterConfiguration reporterConfig = Configuration.ReporterConfiguration.fromEnv()
.withLogSpans(true);
final Configuration config = new Configuration(applicationName).withSampler(samplerConfig)
.withReporter(reporterConfig);
return config.getTracer();
}
Could anyone please help me achieve adaptive sampling at the collector level? It is a very critical issue. |
@saivishalvangala there is no sampling at collector level. Sampling only happens in the SDKs. The sampling strategy configuration that you can pass to the collector is used to pass to SDKs only. I am not sure if Istio even supports Jaeger SDK - there were some attempts to link Jaeger C++ SDK, but I don't know what the current status is. Therefore, any configuration you provide to collectors will not have any effect if the traces are started by Istio. |
Hi @yurishkuro, thank you for your reply. Actually, "istio" is the name of a microservice that I configured; it has nothing to do with the Istio service mesh. |
That sampling file is used by the SDKs when the "remote" sampling strategy is used. It allows admins to control centrally the strategy for all clients at once. |
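A rough model of how a strategies file like the one posted earlier might be resolved for a given service and operation is sketched below. It assumes exact-name matching on the service first and the operation second, and it deliberately ignores how Jaeger merges the default operation strategies into service-level ones; the `Strategy` class and `resolve` method are hypothetical names, not Jaeger SDK types.

```java
import java.util.Map;

/** Simplified model of strategy resolution; not the actual SDK implementation. */
final class StrategyLookupSketch {

    /** Hypothetical probabilistic strategy with per-operation overrides. */
    static final class Strategy {
        final double param;
        final Map<String, Double> operationParams;

        Strategy(double param, Map<String, Double> operationParams) {
            this.param = param;
            this.operationParams = operationParams;
        }
    }

    /** Picks the probability by exact service name, then by exact operation name. */
    static double resolve(Map<String, Strategy> services, Strategy fallback,
                          String service, String operation) {
        Strategy strategy = services.getOrDefault(service, fallback);
        return strategy.operationParams.getOrDefault(operation, strategy.param);
    }

    public static void main(String[] args) {
        Map<String, Strategy> services = Map.of(
                "example-app", new Strategy(0.8, Map.of("getHi", 0.0)));
        Strategy fallback = new Strategy(0.5, Map.of("/health", 0.0, "/metrics", 0.0));

        // Matching service and operation: the per-operation override applies.
        System.out.println(resolve(services, fallback, "example-app", "getHi"));
        // Matching service, unknown operation: the service-level param applies.
        System.out.println(resolve(services, fallback, "example-app", "GET"));
        // Unknown service name: the default strategy applies.
        System.out.println(resolve(services, fallback, "example-app-jaeger", "getHi"));
    }
}
```

Under this model, a service name in the file that differs even slightly from the name the tracer was created with falls through to the default strategy.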
Just adding to Yuri and Juraci's comments:
From https://www.jaegertracing.io/docs/latest/sampling/#client-sampling-configuration:
*Remote* (sampler.type=remote, which is also the default) sampler
consults Jaeger agent for the appropriate sampling strategy to use in the
current service. This allows controlling the sampling strategies in the
services from a central configuration in Jaeger backend, or even
dynamically (see Adaptive Sampling).
This means that, with the right setup, when the "istio" service creates a
tracer, it will start polling the Jaeger agent for sampling strategies,
which the agent, in turn, obtains from the collector.
A few troubleshooting questions that come to mind are:
- Is there a jaeger-agent sidecar running on the same host as the
"istio" service?
- Has the "istio" service been explicitly configured with sampling
configuration? It should default to "remote" if not set.
- Can you see metrics from this agent that indicate "istio" has been
querying for sampling config?
- From jaeger-agent, they should resemble:
jaeger_agent_collector_proxy_total{endpoint="sampling",
job="jaeger-agent",
result="ok"}
- If "istio" is configured to emit metrics, you should see metrics
like: jaeger_tracer_sampler_queries_total{job="istio", result="ok"}
Hope that helps.
Albert
|
Hi @albertteoh , Thanks for explaining in such a detailed way.
I think those logs will be from jaeger-agent side car. Am I right? |
there is no Jaeger agent running as side-car on the same host as
micro-service "istio".
It looks like "istio" is sending spans directly to jaeger-collector, which
is possible but not the recommended deployment pattern. Please see:
https://www.jaegertracing.io/docs/1.20/faq/#do-i-need-to-run-jaeger-agent.
With jaeger-agent as a sidecar with "istio", you should be able to fetch
the sampling strategies from jaeger-collector successfully. This might be
useful:
https://www.jaegertracing.io/docs/1.19/operator/#auto-injecting-jaeger-agent-sidecars
If you *really* do not want to run an agent sidecar, setting
the JAEGER_SAMPLING_ENDPOINT to the collector's URL in your K8s setup
should work. For example, if I'm running my app and jaeger-collector
locally, it would be "http://localhost:14268/api/sampling".
If you're curious to understand how the remote sampling works under the
hood (which I recommend!), I suggest running jaeger-collector (and
jaeger-agent) locally and having a simple application emit some spans.
Tracegen <https://github.com/jaegertracing/jaeger/tree/master/cmd/tracegen> is
one such example application that you can play around with. You might need
to make some config changes in code to get remote sampling working; which I
can help with if you decide to try this out.
I am not understanding where to observe these logs, because Jaeger
instance is running in a pod and that has 4 services :- agent, collector,
collector-headless, query.
I was referring to the metrics, not the logs. The examples were Prometheus
queries:
jaeger_agent_collector_proxy_total{endpoint="sampling",
job="jaeger-agent", result="ok"}
jaeger_tracer_sampler_queries_total{job="istio", result="ok"}
Albert
…On Tue, Oct 6, 2020 at 12:35 AM saivishalvangala ***@***.***> wrote:
Hi @albertteoh <https://github.com/albertteoh> , Thanks for explaining in
such a detailed way.
Answers for your queries:
1. there is no Jaeger agent running as side-car on the same host as
micro-service "istio".
2. Yeah Tracer bean is configured with sampling type " remote".
3. I am not understanding where to observe these logs, because Jaeger
instance is running in a pod and that has 4 services :- agent, collector,
collector-headless, query.
|
Hi @albertteoh thank you for such valuable information. Firstly I will tell you the way which I tried:
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
deployment.kubernetes.io/revision: '1'
sidecar.jaegertracing.io/inject: with-sampling
("with-sampling" is the instance name of the Jaeger deployed in the "jaeger-operator" namespace.)
spec:
containers:
- env:
- name: JAEGER_ENDPOINT
value: 'http://with-sampling-collector.jaeger-operator:14268/api/traces'
{
"service_strategies": [
{
"service": "istio",
"type": "probabilistic",
"param": 0.0,
"operation_strategies": [
{
"operation": "/istio-arch-type/kafka/topic/post/json",
"type": "probabilistic",
"param": 0.0
},
{
"operation": "/istio-arch-type/kafka/topic/post/xml",
"type": "probabilistic",
"param": 0.0
}
]
},
{
"service": "example-app",
"type": "probabilistic",
"param": 0.8,
"operation_strategies": [
{
"operation": "getHi",
"type": "probabilistic",
"param": 0.0
}
]
}
],
"default_strategy": {
"type": "probabilistic",
"param": 0.5,
"operation_strategies": [
{
"operation": "/health",
"type": "probabilistic",
"param": 0.0
},
{
"operation": "/metrics",
"type": "probabilistic",
"param": 0.0
}
]
}
}
@Bean
public io.opentracing.Tracer jaegerTracer()
{
final Configuration.SamplerConfiguration samplerConfig = Configuration.SamplerConfiguration.fromEnv()
.withType("remote");
final Configuration.ReporterConfiguration reporterConfig = Configuration.ReporterConfiguration.fromEnv()
.withLogSpans(true);
final Configuration config = new Configuration("example-app").withSampler(samplerConfig)
.withReporter(reporterConfig);
return config.getTracer();
}
but traces are not showing up in the Jaeger UI. I have a few observations and doubts about my configuration:
Could you please correct me if any of my configuration is wrong?
|
Hi @albertteoh . |
You shouldn't need
Yes, that looks fine to me. Short of viewing the metrics I mentioned above, maybe you can try running |
I tried both running locally and deploying to a Kubernetes cluster, but no luck. The remote config is not being fetched from jaeger-agent. |
Okay, the next questions that come to mind are:
You mentioned you tried running this stack locally. Can you:
|
Hi @albertteoh, the good news is that a jaeger-agent is now injected as a sidecar into the application's pod, and the application is able to fetch the default sampling strategy from the "sampling.json" file configured in the collector. Thanks for your continued help in achieving this.
{
"service_strategies": [
{
"service": "example-app.jaeger-operator",
"type": "probabilistic",
"param": 0,
"operation_strategies": [{
"operation": "getHi",
"type": "probabilistic",
"param": 0
}]
}
]
}
I gave only service_strategies in sampling.json. As per the above "sampling.json", no endpoint of example-app-jaeger should be traced, but the service is picking up the default_strategy and all the traces are showing up in the Jaeger UI. I tried the service names below:
Configuration.SamplerConfiguration samplerConfig=Configuration.SamplerConfiguration.fromEnv().withType("remote");
final Configuration.ReporterConfiguration reporterConfig = Configuration.ReporterConfiguration.fromEnv().withLogSpans(true);
final Configuration config = new Configuration("example-app-jaeger").withSampler(samplerConfig)
.withReporter(reporterConfig);
This is the screenshot of a trace in Jaeger for one endpoint of example-app-jaeger:
This is the controller method:
@GetMapping("/hello")
@TraceIt
public ResponseEntity<String> getHeloo()
{
return new ResponseEntity<>("Hi", HttpStatus.OK);
} Do you find any issue in the trials that I did? Could you help me in enabling service_strategies and operation_strategies to respective services? Tried different operation names and service names, but no luck Regards, |
A service name of "example-app-jaeger" with operation "getHi" looks correct to me. Curious, how many calls to "getHi" are you making and how many traces are you seeing (I know you're expecting 0 traces)?
The reason for asking is that, when running this locally with a simple Go app (tracegen), with 0 probability of sampling at both the service level and operation level, interestingly, I'm seeing at least 1 trace come through.
The reason for this behaviour, when stepping through the code, is that the probabilistic sampling strategy has a lower-bound sampler that guarantees a minimum sampling rate. In jaeger-client-go, the per-operation sampler holds an instance of a RateLimiter that has an initial maxBalance of at least 1, even if the lowerBound is 0. After the first trace is emitted, the balance is reduced down to 0 and subsequent updates to add credit to the balance do nothing since
This behaviour is unexpected to me; but maybe my test is flawed. I wonder if anyone in the community can confirm if this is expected behaviour of the lowerbound sampler or if it's a bug? i.e. I would expect no traces at all for tracegen::lets-go given the following
|
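The lower-bound behaviour described in the comment above can be modelled with a small token bucket. This is a sketch written from that description, not a copy of the jaeger-client-go code: the class and method names are hypothetical, and it assumes the bucket starts full with a maximum balance of at least 1.

```java
/** Simplified model of a probabilistic sampler backed by a lower-bound rate limiter. */
final class LowerBoundSketch {

    /** Token bucket whose capacity never drops below 1, even for a rate of 0. */
    static final class RateLimiter {
        private final double creditsPerSecond;
        private final double maxBalance;
        private double balance;
        private long lastTickNanos;

        RateLimiter(double creditsPerSecond) {
            this.creditsPerSecond = creditsPerSecond;
            this.maxBalance = Math.max(creditsPerSecond, 1.0); // assumption: at least 1
            this.balance = maxBalance;                          // assumption: starts full
            this.lastTickNanos = System.nanoTime();
        }

        /** Refills by elapsed time, then spends the cost if enough credit is available. */
        boolean checkCredit(double cost) {
            long now = System.nanoTime();
            double elapsedSeconds = (now - lastTickNanos) / 1e9;
            lastTickNanos = now;
            balance = Math.min(maxBalance, balance + elapsedSeconds * creditsPerSecond);
            if (balance >= cost) {
                balance -= cost;
                return true;
            }
            return false;
        }
    }

    /** Counts how many of n root spans are sampled with probability 0 and the given lower bound. */
    static int sampledCount(int n, double lowerBound) {
        RateLimiter limiter = new RateLimiter(lowerBound);
        int sampled = 0;
        for (int i = 0; i < n; i++) {
            boolean probabilistic = false; // a probability of 0 never samples
            if (probabilistic || limiter.checkCredit(1.0)) {
                sampled++;
            }
        }
        return sampled;
    }

    public static void main(String[] args) {
        System.out.println(sampledCount(8, 0.0)); // prints 1
    }
}
```

With a refill rate of 0 the bucket never regains credit, so exactly one span gets through in this model, which matches the single trace reported for tracegen.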
Hi @albertteoh, I made around 8 calls and I got 8 traces in Jaeger. If I observe sampler.type and sampler.param in the span of each trace, I am seeing service_strategy sampling configurations but not of operation_strategy. I have loaded the following sampling configurations into the collector: {
"service_strategies": [{
"service": "example-app-example-app",
"type": "probabilistic",
"param": 0.8,
"operation_strategies": [{
"operation": "getHi",
"type": "probabilistic",
"param": 0
}]
}]
} I have one important observation. {
"service_strategies": [{
"service": "example-app-example-app",
"type": "probabilistic",
"param": 0.8,
"operation_strategies": [{
"operation": "GET",
"type": "probabilistic",
"param": 0.9
}]
}]
}
POST operations are sampled only if the operation name is given as "POST". This behavior is unexpected to me, but maybe my test is flawed. Could anyone in the community confirm whether this behavior is expected or if it's a bug? Regards, |
I would verify that the strategy returned by the agent (curl the corresponding endpoint) for your service matches your expectation. What you describe about GET/POST is very odd behavior. Sampling does not look at http methods, it only looks at the root span name as a proxy for the endpoint name. So unless you're putting GET as span name (which didn't look like from your example), the strategy with GET operation should never match. |
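One way to check what the agent returns for a service, as suggested above, is to request its sampling endpoint directly. The sketch below assumes the agent is reachable on localhost at its default configuration port 5778 and uses the service name "example-app-jaeger" from this thread; the class and method names are hypothetical.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/** Fetches the sampling strategy that a jaeger-agent serves for one service. */
final class SamplingStrategyCheck {

    /** Builds the sampling URL for a service; the base address is an assumption. */
    static URI samplingUri(String agentBase, String service) {
        String encoded = URLEncoder.encode(service, StandardCharsets.UTF_8);
        return URI.create(agentBase + "/sampling?service=" + encoded);
    }

    public static void main(String[] args) throws Exception {
        String service = args.length > 0 ? args[0] : "example-app-jaeger";
        URI uri = samplingUri("http://localhost:5778", service);

        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());

        // The body is the JSON strategy the SDK would receive for this service.
        System.out.println(response.statusCode());
        System.out.println(response.body());
    }
}
```

If no agent is running, the collector URL mentioned earlier in the thread (http://localhost:14268/api/sampling) could be queried in the same way, with the same `service` query parameter.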
Hi @yurishkuro, I was also surprised by this behavior of per-operation sampling. But I am pretty sure that this is how per-operation sampling is working, as I tested this two or three times before posting here. GET/POST are treated as the operation names of the respective HTTP methods, rather than the actual operation name of the trace. You can refer to my previous comments and the observations that I posted. Please correct me if my observations are wrong. Regards, |
@saivishalvangala what you're describing is physically impossible, Jaeger samplers do not even have access to tags on the span, they only have access to operation name. Consider providing a reliable reproducer if you want us to investigate. The example you linked (https://github.com/saivishalvangala/jaeger-sampling) does not build for me, and is using a |
Hi @yurishkuro, In the (https://github.com/saivishalvangala/jaeger-sampling) please use deployment.yaml which has sampling configurations as Regards, |
@saivishalvangala I don't need the deployment or a jar, I need to be able to compile and run the code from source, otherwise I cannot investigate. |
@yurishkuro please try to compile and run the code now. It will work for you. |
I am getting the same error:
|
Yes, we can do that with the help of tracing options: add the ignorepatterns to your trace.json and use the aspnetcorediagnostics options to filter these traces.
Hi, I tried to use the sampling configuration based on the Jaeger documentation, but I still get span metrics in Jaeger: https://www.jaegertracing.io/docs/1.38/sampling/#CollectorSamplingConfiguration |
I have the same question/issue as @albertteoh . For operations that we've set to
we are still seeing traces being sampled. Is there a way to completely ignore/not sample spans of specific operations? |
We do health checks on all our services by having Marathon make a request to GET /external_ping every 5 seconds. These health checks appear as spans in Jaeger and it's quite noisy. Is there a way to have the collector not store these?