Standard tag for service name #77

StephenWithPH · 2017-06-17T00:20:46Z

@tedsuo and @yurishkuro: this is a continuation of the conversation from #75.

This also has overlap with #58 from @ruinanchen

Referring back to that issue...several of the existing OT-compatible tracers make service name a first-class item. See:

https://github.com/openzipkin/zipkin-go-opentracing/blob/master/zipkin-recorder.go#L71
https://github.com/uber/jaeger-client-go/blob/master/tracer.go#L71
https://github.com/instana/golang-sensor/blob/master/options.go#L6
https://github.com/hawkular/hawkular-apm-opentracing-javascript/blob/master/lib/deployment-meta-data.js#L28

I believe guidance on a standard tag for service name would be useful. Yes, this need is largely driven by UI considerations, but "which service?" is a valid question.

Quoting @tedsuo

We set the lightstep.component tag to mean "service/tracer", which was then poached by the official component tag, which means something else. So we would also like an official tag for this. Otherwise we will change our tag to service only to have that get poached to mean something else.

Quoting @yurishkuro

We use the term "service name" to describe the process (aka microservice) that uses an instance of a tracer. It's not a span-level information, but process-level, so there is no need for a span tag, we capture it as a "tracer-level tag" if you will.

In our case, our "tracer" is just raw spans to stdout. That's why we want the span itself to reflect the service. We could certainly append something in the process that picks up stdout and ships, but we feel it's cleaner to just add a tag on the span.

Thoughts?

yurishkuro · 2017-06-17T15:14:19Z

@StephenWithPH

In our case, our "tracer" is just raw spans to stdout.

If you're implementing an OpenTracing API, you still have a Tracer object, don't you? So you can record the service name in that object and when instrumentation calls tracer.StartSpan() you can attach that as an attribute of the span.

The main point of OpenTracing is that people can reuse instrumentation of various frameworks, we already have plenty of those in opentracing-contrib. How would all those frameworks know what your service name is when they create new spans?
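
A minimal sketch of the suggestion above, assuming a hypothetical `StdoutTracer` and `Span` that are not part of any OpenTracing library: the tracer remembers the service name given at construction and stamps it onto every span it starts, so reusable instrumentation never needs to know it. The tag key `service` is an assumption taken from this thread.

```python
import json
import sys


class Span:
    """Hypothetical span: an operation name plus a dict of tags."""

    def __init__(self, operation_name, tags):
        self.operation_name = operation_name
        self.tags = dict(tags)

    def set_tag(self, key, value):
        self.tags[key] = value
        return self

    def finish(self, out=sys.stdout):
        # Emit the raw span as one JSON line, as a stdout-based tracer might.
        record = {"operation": self.operation_name, "tags": self.tags}
        out.write(json.dumps(record, sort_keys=True) + "\n")


class StdoutTracer:
    """Hypothetical tracer that knows its service name at construction."""

    def __init__(self, service_name):
        self.service_name = service_name

    def start_span(self, operation_name):
        # The tracer, not the instrumentation, attaches the service name.
        return Span(operation_name, {"service": self.service_name})
```

Instrumentation would only call `tracer.start_span("GET /cart")`; the service name travels with the tracer instance.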

StephenWithPH · 2017-06-19T20:01:19Z

@yurishkuro "you can attach that as an attribute of the span"... that's exactly what we're doing.

I wanted to (re)open the conversation about a standard, suggested tag for this. I believe @tedsuo expressed similar interest.

Thoughts on whether or not you'd like to see a PR to update the documentation with a standard tag (service) for this?

yurishkuro · 2017-06-19T20:50:47Z

Well, the standard tags exist for the purpose of letting instrumentation know which tags to use for which semantic data elements. In your case the tag is added by your tracer implementation, not instrumentation, so it does not need any standardization, you can name it anything you want, as long as your tracer and the tracing backend agree on what it is. Am I missing something?

StephenWithPH · 2017-06-19T22:48:38Z

Nope, not missing anything. This is more of a "people (n >= 3) keep asking for it because it seems like it should be there."

I'll wait for any input from @tedsuo; otherwise, I'll close this in a few days.

yurishkuro · 2017-06-19T23:02:03Z

We just need a clear use case that affects vendor-neutral instrumentation.

tedsuo · 2017-06-20T01:26:00Z

@yurishkuro we have some cases where instrumentations want to override the service tag we are setting by default, especially proxies and sidecar services that are reporting on behalf of, or multiplexing, other services. Right now these instrumentations must override lightstep.component_name to change this, which doesn't feel vendor-neutral; it feels like a leak. I would prefer they do the override with a standard tag like service instead, and hopefully other tracers will choose to support this behavior. Does that make sense?

tedsuo · 2017-06-20T01:41:47Z

To be clear: if tracers don't support the service tag as an override for their service-name concept, then fine. The instrumentation/tagging is still useful to users of that tracer, they can still do searches/filters using the service tag. The issue is if tracers want to support the overriding of the tracer/process-level service-name concept without a vendor-specific tag.

So right now, the problem is:
a) Telling instrumentors to use lightstep.component_name is weird.
b) If we instead use the service tag to mean lightstep.component_name without standardization, we risk having the meaning of service be defined later to be something else, which risks instrumentations setting it for that purpose and creating unintended behavior. This has already happened to us with the component tag and we are concerned about it happening again.

yurishkuro · 2017-06-20T02:06:16Z

@tedsuo to make sure I understand: a single process (e.g. a routing proxy like Envoy) creates spans on behalf of many different services, via a single instance of the Tracer, therefore you want the instrumentation to tell the Tracer on a per-Span basis the service name it represents.

If that's correct, then it's a much larger scope than simply defining a standard tag; it's also prescribing new behavior to the tracers. For example, if you use Jaeger and do span.SetTag("service", "whatever"), it will have no impact on the service name reported in the Jaeger backend, because a per-Span service name is simply not supported (the tag will be stored on the span, but not interpreted). It wouldn't be too difficult for us to support it, but we should be clear about the scope of this change.

tedsuo · 2017-06-20T03:08:41Z

Yes, that's exactly it! Only I think there's no problem if Jaeger does not support it. You can still search by tag:service="myService" in Jaeger or Zipkin and gain insight with these instrumentations. So it's opt-in. But it would be a problem if Jaeger and other tracers started interpreting service to mean something else. That's why we would like to reserve its meaning.

Specifically, we want to reserve the tag service to mean a "standalone" service, which by default matches the "tracer name" or equivalent field set on global tracer initialization. It's the right complement to the component tag that exists to identify packages, frameworks and modules such as gRPC and JDBI that are incorporated into services.

yurishkuro · 2017-06-20T05:03:51Z

I think there's no problem if Jaeger does not support it. You can still search ...

Yes, one can search by the tag, but all aggregations will be incorrect as the spans will be attributed to the service name given at Tracer construction, not according to the "service" tag.

I don't mind reserving the service tag with a description "reserved for future use". To define the tag with the actual semantics discussed above is a bit premature, imo, because we're saying it is a standard tag while no tracers actually support it yet.
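
To make the aggregation concern concrete, here is a small sketch. The span layout (`process_service`, `tags`) is hypothetical and not Jaeger's data model; it only contrasts grouping by the name fixed at Tracer construction with grouping by a per-span "service" tag.

```python
from collections import Counter

# Hypothetical spans emitted by one proxy process whose Tracer was
# constructed with the service name "proxy".
SPANS = [
    {"process_service": "proxy", "tags": {"service": "auth"}},
    {"process_service": "proxy", "tags": {"service": "auth"}},
    {"process_service": "proxy", "tags": {"service": "billing"}},
    {"process_service": "proxy", "tags": {}},
]


def count_by_process_service(spans):
    """Group spans by the service name fixed at Tracer construction."""
    return Counter(s["process_service"] for s in spans)


def count_by_service_tag(spans):
    """Group by the per-span "service" tag, falling back to the process name."""
    return Counter(s["tags"].get("service", s["process_service"]) for s in spans)
```

A backend that only does the first grouping attributes all four spans to "proxy", even though a tag search for "auth" would still find two of them.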

jpkrohling · 2017-06-20T05:59:39Z

We had something similar for Hawkular APM: during the "report" phase, we'd check if the application has set the service name. If not, we'd derive it from a configuration option, or from an env var:

https://github.com/hawkular/hawkular-apm/blob/master/client/opentracing/src/main/java/org/hawkular/apm/client/opentracing/EnvironmentAwareTraceRecorder.java#L43

I do think it makes sense to allow specific spans to set their service names, especially if they are dealing with out-of-band data or processing (like parts of a batch, queued items, ...).
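
The fallback order described above could look roughly like this. This is a sketch only, not the Hawkular APM code: the config key `service_name` and the environment variable `TRACER_SERVICE_NAME` are made-up names for illustration.

```python
import os


def resolve_service_name(span_tags, config, environ=os.environ):
    """Pick the service name at report time, in priority order."""
    # 1. A name set explicitly by the application on the span wins.
    if span_tags.get("service"):
        return span_tags["service"]
    # 2. Otherwise fall back to a configuration option (hypothetical key).
    if config.get("service_name"):
        return config["service_name"]
    # 3. Finally, fall back to an environment variable (hypothetical name).
    return environ.get("TRACER_SERVICE_NAME")
```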

tedsuo · 2017-06-20T19:15:24Z

Thanks @jpkrohling.

@yurishkuro maybe we are still lacking some process around standardizing tags. I don't know what it means to reserve a tag but not define its meaning. Are we missing an incubation step? It feels kind of chicken and egg: either we provide a vendor-neutral tag like service for tracers to support multiplexing and changing the service name, or each tracer has to make up a vendor-specific solution. The second is already happening; it's what I'm trying to get in front of right now.

I agree that we should be slow to add standard tags, while avoiding overly narrow solutions and non-existent problems. But I believe service qualifies for standardization:

The concept of a standalone service is nearly universal.
Virtually every tracer supports a "tracer name" that maps to this concept.
Multiplexed/out-of-band reporting is a real, established pattern, which needs a facility to identify which service it is reporting for.

To solve this issue without vendor-specific solutions, a service tag can be introduced with the following properties:

a) the service tag defines a span to be the start of a standalone service, rather than a component or library. Libraries and shared components should not set this tag unless they are multiplexing and reporting on behalf of other services.

b) if a tracer provides a "tracer name," "service name," or similar concept, and would like to allow OT instrumentation to override it, the service tag is recommended for that facility.

@yurishkuro, is your preference to standardize on a), and "incubate" b)?
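
Property b) could be sketched as an opt-in switch on the tracer. The `Tracer` class and its `honor_service_tag` option below are hypothetical and do not correspond to any existing tracer's API.

```python
class Tracer:
    """Hypothetical tracer with an opt-in override of its service name."""

    def __init__(self, service_name, honor_service_tag=False):
        self.service_name = service_name
        self.honor_service_tag = honor_service_tag

    def reported_service(self, span_tags):
        # A tracer that opts in lets the "service" tag override its default.
        if self.honor_service_tag and "service" in span_tags:
            return span_tags["service"]
        # Otherwise the tag is stored on the span but not interpreted.
        return self.service_name
```

A tracer that never opts in keeps working unchanged; the tag remains searchable but does not alter attribution.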

yurishkuro · 2017-06-20T21:00:56Z

maybe we are still lacking some process around standardizing tags.

Indeed. Partly because, until now, support for tags was entirely optional for tracer implementations, in the sense that almost none of them required any special behavior of the tracer aside from simply recording the tag on the span. The two minor exceptions are span.kind=server (only critical for Zipkin's single-span-per-RPC model) and sampling.priority (only relevant to tracers that use consistent sampling). The impact of the "service" tag, on the other hand, is major. The ability of a single tracer instance to represent multiple services wasn't a use case that we thought of when developing the OT API; if we had, it probably would have been a part of the API itself. For example, it's quite difficult to make any non-trivial upfront sampling decision if you don't even know the service name when starting the span.

So what I am really trying to do is to take into account (intensive) past criticism of introducing features into the API before they are widely supported by the existing implementations. If we define the tag with a) and/or b) definition, it sets the expectation that such behavior should be generally supported by many implementations, which in case of Jaeger is a non-trivial amount of work across multiple languages.

So I am not sure how to answer your question. I do think we need some incubation process for the new tags of such critical impact. One possible way to do that is to say "It's incubating with a) and b) semantics, but the semantics might change prior to graduation".

beberlei · 2017-07-07T16:18:32Z

Question about service: if you have multiple MySQL databases, would service be service=mysql, or would it be service=mysql://cluster or dbname?

tedsuo · 2017-07-14T19:11:53Z

@yurishkuro I think you are right that we should develop the usecase further. I disagree that this tag would force anyone to change or be "broken" in regard to OT: your tracer will work fine, there is just nothing special about the service tag. I'm interested in getting out-of-band reporting to work via a tag convention specifically because it would allow us to add this concept to OT without making breaking changes, such as a new StartSpanOption or other changes that would literally break the API for tracers, regardless of whether you support this usecase.

So really, this is about incubating an entire concept: out of band reporting. I will think more about how we should do this: working groups, etc, so that when things are finally standardized they have been reviewed properly. You may well be right that there is interest in supporting out-of-band reporting in the OT community, but it requires deeper changes that need review and new API surface area.

In the meantime... we're going to start setting the service tag to mean this in Envoy, NGINX, Varnish, and other load balancers, to try this out. This gets that usecase unblocked and in a state where other tracers can choose to adopt or experiment with it. And I am going to try to prevent the term "Service" from being associated with a different concept in OT so that tracers which respond to the service tag in this manner will not run into trouble, or create confusion. If we choose a different, more official mechanism later we will switch from tags to that. But I would still like this concept to be named "Service".

@beberlei I think service naming conventions are user preference, based on how you think of your system. If you were going to draw a boxes-and-arrows diagram of your system, "service" would be the name you wrote on each box. So for systems that are more complicated, "mysql-db" may not be enough. In general, db services I have seen tend to be named things like auth-db and image-cache.

As far as reserving the name "Service" for this concept in the OT universe: Tracers in general have a concept of "service name," often set on tracer initialization, that is a very important part of how they index things. What appears to be shaping up in OT is that "operations" exist in a "component" namespace, and components exist in a service namespace. So the "fully qualified span name" is often something like service:operation or service:component:operation. It's not too interesting to compare /user/account operations that are part of different services. So you end up wanting to look at app:/user/account in order to eliminate noise. I feel like this is fundamental and all tracers must deal with it somehow; in practice it's not possible to disambiguate all operation names in a distributed system without a namespace like this. But half of this concept lives outside OT at the moment. We should at least name it.
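
The namespacing idea can be illustrated with a tiny helper. `qualified_span_name` is a hypothetical function written for this thread, not part of OpenTracing.

```python
def qualified_span_name(operation, service, component=None):
    """Build service:operation or service:component:operation."""
    parts = [service]
    if component:
        parts.append(component)
    parts.append(operation)
    return ":".join(parts)
```

With it, the same /user/account operation in two services yields two distinct names, for example app:/user/account and admin:/user/account.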

tylerbenson · 2017-07-17T03:12:31Z

At Datadog we have internally added some tags, but are waiting to see how this issue resolves before making further decisions.
The tag names we're currently using are "span-type", "service-name", and "resource-name". Type is web, db, cache, etc. Service name is obviously synonymous with "service" here. Resource name is provided when an additional level of grouping makes sense, like table name or controller.

I guess my point is that I support better standardization of tags that would support finer grained grouping.

StephenWithPH · 2017-07-17T17:11:13Z

@tedsuo's entire paragraph from #77 (comment) ...

As far as reserving the name "Service" for this concept...

... perfectly summarizes what I was trying to articulate earlier on in this discussion.

yurishkuro · 2017-07-17T17:44:48Z

Is there anything blocking this from moving forward?

Does the following plan make sense?

add service tag
mark it as "incubating"; explain elsewhere that "incubating" means the meaning of the tag might (although it is unlikely to) change in the future if the current definition doesn't work out
describe it as a way to generate spans for different services from a single Tracer instance

wu-sheng · 2017-07-18T01:49:56Z

How do we define the difference between the peer.service and service tags?

This question is similar to @beberlei's. According to @tedsuo's explanation:

@beberlei I think service naming conventions are user preference, based on how you think of your system. If you were going to draw a boxes-and-arrows diagram of your system, "service" would be the name you wrote on each box. So for systems that are more complicated, "mysql-db" may not be enough. In general, db services I have seen tend to be named things like auth-db and image-cache.

It is up to the user; if so, it is hard to tell the difference.

yurishkuro · 2017-07-18T02:42:38Z

"peer service" means "the other service". I don't think its confusing. Both tags refer to the same domain of values - the names of the services in the architecture.

wu-sheng · 2017-07-18T02:44:34Z

IMO, if a span represents a client for calling remote service, the service tag for this span is also peer.service.

mabn · 2017-07-18T10:57:10Z

@wu-sheng When A calls B and they report spans on both sides of the RPC then:

A reports (the client-side of the RPC):

service: A
peer.service: B (because A called B)

B reports (the server-side of the RPC):

service: B
peer.service: A (because it was called by A)

peer.service may be unknown
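
The same A-calls-B example as a sketch. The helper functions are hypothetical; `span.kind` and `peer.service` are existing OpenTracing tags, while `service` is the tag proposed in this issue. The dicts show what ends up on the reported spans, regardless of whether the tracer or the instrumentation set each tag.

```python
def client_span_tags(local_service, remote_service=None):
    """Tags on the client-side span of an RPC."""
    tags = {"span.kind": "client", "service": local_service}
    if remote_service is not None:
        # The callee, when the client knows it.
        tags["peer.service"] = remote_service
    return tags


def server_span_tags(local_service, caller_service=None):
    """Tags on the server-side span of an RPC."""
    tags = {"span.kind": "server", "service": local_service}
    if caller_service is not None:
        # The caller; often unknown on the server side.
        tags["peer.service"] = caller_service
    return tags
```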

wu-sheng · 2017-07-18T13:56:13Z

@mabn If A is just a client, I don't think A would set itself as the service tag, and the same goes for B.

IMO, A and B are a pair for this RPC call, so they share the same service name. E.g. the service name /prod/order is correct for both the HTTP client and server sides. Btw, the operation names may differ, like apache/httpclient/post/order at the client and tomcat/http/post/order at the server.

tedsuo · 2017-07-19T17:00:05Z

@wu-sheng I know, this is hard, right? There are only so many words, we are going to use them up quickly. :)

In this case I agree with @mabn, service and peer.service are different. I understand what you are saying about the client being an embedded part of the peer service, and so that is the name of its service, but I think that is the purpose of peer.service: to allow the client a place to put that information. So client code sets the peer.service tag to the target service, and out-of-band reporting code would set service as the reporting service. RPC client code (and application code in general) should never set the service tag. Either it is set implicitly on the tracer, or explicitly by whatever mechanism is trying to report on behalf of multiple services, or in some other special case where you have multiple services but one tracer.

I'm glad to see that @StephenWithPH @tylerbenson and others have a similar concept for service. @wu-sheng are you satisfied with this reasoning enough to try it out? If so I will make a PR along the lines of what @yurishkuro has suggested.

wu-sheng · 2017-07-20T01:16:21Z

RPC client code (and application code in general) should never set the service tag - either it is set implicitly on the tracer, or explicitly by whatever mechanism is trying to report on behalf of multiple services, or some other special case where you have multiple services but one tracer.

If we have service and peer.service at the same time, we really should add explicit usages for them, like you said, @tedsuo. Otherwise, it is hard to use and support.

peer.service : Client side only. Remote service name (for some unspecified definition of "service"). E.g., "elasticsearch", "a_custom_microservice", "memcache"
service : Provided by the server side. E.g. http:/prod/order/ as an HTTP service.

tedsuo · 2017-07-21T16:01:11Z

Thanks @wu-sheng, I'll try to make it clear.

BTW @tylerbenson, your tag service-name matches service, and span-type sounds like it matches component. But there is not currently an OT equivalent of resource-name. Possibly this is because most of the tags to date come from instrumenting libraries, which tend to have a narrow focus. I can see service:component:resource:operation being useful in large applications and frameworks; I wonder if that's what people are searching for in this issue: #72.

yurishkuro · 2018-04-11T18:05:44Z

I was thinking more about this recently. In Jaeger we identify the service (originator of the span) via a Process object that contains not only service name, but also a collection of key/value tags that typically represent other metadata about the service, such as host/ip where the service is running, maybe software version, zone/datacenter, deployment group, etc. Simply setting the service name via tag doesn't allow for expressing this level of service identity & metadata. It's also going to be quite inefficient to do for every span since the tracer internally pre-processes the service metadata into a format ready to be sent over the wire (or even communicating it to the backend upon establishing the connection). So it seems like instantiating multiple tracers in the proxy service for this scenario would be a better approach.
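
A rough sketch of the multiple-tracers approach for a proxy. All names here (`Process`, `Tracer`, `TracerRegistry`) are hypothetical and do not mirror the Jaeger client API; the point is only that each service gets its own tracer with its own pre-built process metadata, created once and reused.

```python
class Process:
    """Service identity: a name plus metadata tags (host, version, zone, ...)."""

    def __init__(self, service_name, tags):
        self.service_name = service_name
        self.tags = dict(tags)


class Tracer:
    """Hypothetical tracer bound to exactly one Process."""

    def __init__(self, process):
        self.process = process


class TracerRegistry:
    """Lazily creates and caches one tracer per service name."""

    def __init__(self, shared_tags):
        self._shared_tags = dict(shared_tags)
        self._tracers = {}

    def tracer_for(self, service_name):
        if service_name not in self._tracers:
            # Process metadata is prepared once per service, not per span.
            process = Process(service_name, self._shared_tags)
            self._tracers[service_name] = Tracer(process)
        return self._tracers[service_name]
```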

tylerbenson · 2018-05-29T06:42:29Z

@yurishkuro in our case, we only set the service name tag on the top span of each name, then it implicitly cascades (until it is manually changed to something else). In my opinion, this is much easier to deal with than managing multiple different tracer instances.
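
One way to picture the cascading behavior, as a sketch only: this is not Datadog's implementation, and the `Span` class below is hypothetical. A span without an explicit service name inherits the nearest ancestor's.

```python
class Span:
    """Hypothetical span whose service name cascades from its ancestors."""

    def __init__(self, name, parent=None, service=None):
        self.name = name
        self.parent = parent
        self._service = service  # None means "inherit from parent"

    @property
    def service(self):
        # Walk up the parent chain until an explicit service name is found.
        span = self
        while span is not None:
            if span._service is not None:
                return span._service
            span = span.parent
        return None
```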

yurishkuro · 2018-05-29T23:30:06Z

I don't have a strong objection to introducing this tag.

tylerbenson · 2018-05-30T01:21:36Z

@yurishkuro @tedsuo Should I just submit a PR, or is there a different process for this?

yurishkuro · 2018-05-30T03:04:12Z

I think a PR is fine. As it's adding a new tag, it doesn't need to go through the full RFC cycle. The PR is just to add the tag to the data conventions. Its description should spell out what the tracers are expected to do with it if they support it.


StephenWithPH mentioned this issue Jun 21, 2017

Need serviceName as top-level field to distinguish among different services Nordstrom/ctrace#4

Closed

yurishkuro mentioned this issue Aug 12, 2017

Build a Span with a given SpanContext #81

Open

yurishkuro mentioned this issue Apr 11, 2018

Better support for multiple tracers/services jaegertracing/jaeger-client-python#149

Closed

tylerbenson added a commit that referenced this issue May 30, 2018

Add tag for service

e7d3740

Per discussion in #77, service name is deemed a widely enough used concept to warrant adoption by the community. (Closes #77)

tylerbenson mentioned this issue May 30, 2018

Add tag for service #119

Merged

tedsuo closed this as completed in #119 Jul 12, 2018

pavolloffay mentioned this issue Jul 13, 2018

Add service tag opentracing/opentracing-java#287

Merged

yurishkuro mentioned this issue Jul 14, 2018

Support "service" tag jaegertracing/jaeger#936

Closed

ab-pm mentioned this issue Nov 10, 2020

Use dd-trace to capture traces and send to jaeger DataDog/dd-trace-js#694

Closed
