This repository has been archived by the owner on May 23, 2023. It is now read-only.

Opt-in vs Opt-out & security on propagation at the system boundaries #68

Open
cwe1ss opened this issue May 1, 2017 · 3 comments

Comments

@cwe1ss
Member

cwe1ss commented May 1, 2017

What's the recommended practice for the following scenarios:

  • Someone makes a request to my public web server and includes fake trace headers. The request might even contain fake baggage that would be propagated within my system. I can't simply leave extract unimplemented, because I might, for example, use an OpenTracing-enabled load balancer in front of the server.
  • An internal application sends an HTTP request to some third-party system. How do I make sure I don't send my trace headers/baggage there?

In other words, who is responsible for validating headers at the system boundaries? AFAIK an application developer does not have a way to manipulate spans through the OpenTracing API. Also, hoping that a tracer might offer this functionality seems troublesome - especially when you want to switch tracers.

Or is this an additional out-of-scope hook that must be provided by the HTTP library / web framework? E.g. they would manage their own whitelist/blacklist and simply not call inject/extract when a request matches.
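A minimal sketch of such a library-level hook, assuming the HTTP client wraps whatever inject function the tracer provides. `PropagationGuard`, `fake_inject`, the header name, and the host names are all hypothetical and not part of the OpenTracing API.

```python
from typing import Callable, Dict, Iterable
from urllib.parse import urlsplit

Carrier = Dict[str, str]


class PropagationGuard:
    """Decides per destination host whether trace context may be injected."""

    def __init__(self, internal_hosts: Iterable[str]) -> None:
        # Deny by default: only hosts listed here receive trace headers.
        self._internal = {h.lower() for h in internal_hosts}

    def allows(self, url: str) -> bool:
        host = (urlsplit(url).hostname or "").lower()
        return host in self._internal

    def inject_if_allowed(
        self, url: str, headers: Carrier, inject: Callable[[Carrier], None]
    ) -> Carrier:
        # `inject` stands in for the tracer's own inject call.
        if self.allows(url):
            inject(headers)
        return headers


def fake_inject(headers: Carrier) -> None:
    # Hypothetical stand-in for a tracer writing its wire format.
    headers["x-trace-id"] = "abc123"


guard = PropagationGuard(["orders.internal"])
internal = guard.inject_if_allowed("http://orders.internal/api", {}, fake_inject)
external = guard.inject_if_allowed("https://payments.example.com/charge", {}, fake_inject)
```

With this shape, a call to a third-party host never reaches the tracer's inject at all, so the check does not depend on knowing the tracer's header names.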

@yurishkuro
Member

Very good questions. Not sure if there can be one-size-fits-all answers.

My feeling is that if the system has a known entry point for external traffic and is certain that it doesn't want to accept tracing/baggage info at those points, then that layer can be configured with a tracer that has a no-op extractor. If internal services make calls back into the same layer, that's probably a design issue and the internal/external endpoints should be separated.

But there is an even more complex case where sometimes you do want to accept tracing info from external requests, for example from your own mobile apps. In this case I would rather have the whole header handling centralized so that only requests with valid auth are allowed to keep the headers. This makes the whitelisting of headers slightly more complex given that OpenTracing does not prescribe the wire format, but in practice it won't be too burdensome since a given tracing implementation can always document its header format (and if we standardize on the wire format in the future, that makes it even easier).
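A minimal sketch of that centralized edge check, assuming the tracer documents its headers as sharing known prefixes. The prefixes below are hypothetical, and the authentication decision is assumed to be made elsewhere and passed in as a flag.

```python
from typing import Dict

# Hypothetical prefixes; a real tracer documents its own wire format.
TRACE_HEADER_PREFIXES = ("x-trace-", "x-baggage-")


def sanitize_inbound(headers: Dict[str, str], is_authenticated: bool) -> Dict[str, str]:
    """Keep tracing headers only on requests that passed authentication."""
    if is_authenticated:
        return dict(headers)
    # Unauthenticated traffic loses every tracing/baggage header.
    return {
        name: value
        for name, value in headers.items()
        if not name.lower().startswith(TRACE_HEADER_PREFIXES)
    }


incoming = {"X-Trace-Id": "1", "X-Baggage-User": "admin", "Accept": "*/*"}
anonymous = sanitize_inbound(incoming, is_authenticated=False)
trusted = sanitize_inbound(incoming, is_authenticated=True)
```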

Note that there is another dimension to the baggage headers - just because the request came from a trusted source doesn't always mean that source is allowed to set the baggage. For example, we are internally implementing a whitelist mechanism for baggage keys, both to avoid unsanctioned use of baggage and to control which services are allowed to set which baggage keys. One of the ways to solve this is by using signed baggage values.
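One way to picture a key whitelist combined with signed values is the sketch below. It is an illustration only, not the mechanism described above: the per-service table, the shared secret, and the use of HMAC-SHA256 are all assumptions.

```python
import hashlib
import hmac
from typing import Dict, Optional, Set

# Hypothetical policy: which service may set which baggage keys.
ALLOWED_KEYS: Dict[str, Set[str]] = {
    "frontend": {"tenant", "experiment"},
    "checkout": {"tenant"},
}


def sign_baggage(secret: bytes, service: str, key: str, value: str) -> Optional[str]:
    """Return a signature if `service` may set `key`, else None."""
    if key not in ALLOWED_KEYS.get(service, set()):
        return None
    # Assumes the three fields never contain a newline.
    message = f"{service}\n{key}\n{value}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_baggage(secret: bytes, service: str, key: str, value: str, signature: str) -> bool:
    """Check both the whitelist and the signature in constant time."""
    expected = sign_baggage(secret, service, key, value)
    if expected is None:
        return False
    return hmac.compare_digest(expected.encode(), signature.encode())


secret = b"demo-secret"  # placeholder; a real key would come from a secret store
sig = sign_baggage(secret, "frontend", "experiment", "b")
```

A downstream service that verifies the signature can drop any baggage item whose key was not sanctioned for the service that claims to have set it.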

Lastly, about the outbound requests, again I would expect the sanitizing of the headers to happen at some centralized place like a proxy that all services must use for externally facing requests, as otherwise it's almost impossible to enforce that all services do the right thing. The safest approach when it comes to data security is to deny-all, allow-some.
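A deny-all, allow-some filter at an egress proxy could look roughly like this. The allowed header set is a hypothetical example; because everything not listed is dropped, the filter needs no knowledge of the tracer's header names.

```python
from typing import Dict, FrozenSet

# Hypothetical allowlist; anything not named here is dropped.
ALLOWED_OUTBOUND: FrozenSet[str] = frozenset(
    {"accept", "authorization", "content-type", "user-agent"}
)


def filter_outbound(
    headers: Dict[str, str], allowed: FrozenSet[str] = ALLOWED_OUTBOUND
) -> Dict[str, str]:
    """Forward only explicitly allowed headers to external systems."""
    return {name: value for name, value in headers.items() if name.lower() in allowed}


outgoing = filter_outbound(
    {"Content-Type": "application/json", "x-trace-id": "abc", "x-baggage-user": "42"}
)
```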

@bhs
Contributor

bhs commented May 27, 2017

@cwe1ss the baggage mechanism is not intended to cross the trusted/untrusted boundary. Core tracing ids are also pretty difficult, as is the sampling bit... The workarounds I'm aware of generate ids and sampling decisions in the server on behalf of the client, then do some light math to verify that none of the above were tampered with (this is approximately what Google did for client tracing IIRC).
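The "light math" is not specified in this thread. One possible reading, sketched here with an HMAC as a swapped-in technique, is that the server issues the trace id and sampling bit together with a tag and later rejects any token whose tag no longer matches. The token layout and function names are hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import Optional, Tuple


def _tag(secret: bytes, trace_id: str, flag: str) -> str:
    return hmac.new(secret, f"{trace_id}:{flag}".encode(), hashlib.sha256).hexdigest()


def issue_context(secret: bytes, sampled: bool) -> str:
    """Server side: create an id and sampling bit on behalf of the client."""
    trace_id = secrets.token_hex(8)
    flag = "1" if sampled else "0"
    return f"{trace_id}:{flag}:{_tag(secret, trace_id, flag)}"


def verify_context(secret: bytes, token: str) -> Optional[Tuple[str, bool]]:
    """Return (trace_id, sampled) if the token is untampered, else None."""
    parts = token.split(":")
    if len(parts) != 3:
        return None
    trace_id, flag, tag = parts
    if flag not in ("0", "1"):
        return None
    expected = _tag(secret, trace_id, flag)
    if not hmac.compare_digest(expected.encode(), tag.encode()):
        return None
    return trace_id, flag == "1"


secret = b"demo-secret"  # placeholder key for illustration
token = issue_context(secret, sampled=False)
trace_id, _, tag = token.split(":")
forced_sampling = f"{trace_id}:1:{tag}"  # client flips the sampling bit
```

Here `verify_context(secret, token)` yields the original id and sampling decision, while the `forced_sampling` token fails verification because the tag was computed over the unsampled flag.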

My general sense, though, is that we should assume that peers are acting in good faith about tracing ids and that tracing systems should have circuit-breakers to prevent instrumentation overload, even if the tracing data quality degrades as a result.

@cwe1ss
Member Author

cwe1ss commented Feb 1, 2018

Thank you for your responses and sorry for never replying - I have been off the grid for quite some time.

While I definitely agree that a central proxy / service mesh could/should solve this, I'm wondering if there should be guidance or other solutions for people who don't have such a complex system and just want to use OpenTracing for a simpler application.
Example: Let's say I have a small monolithic webshop that uses an external payment API for payment processing and I want to make sure my OT headers are not sent to that API.

Do you think this issue is still relevant in terms of the spec? I don't want this issue to stay open if this has already been discussed in some way, so feel free to close it in that case!
