-
Notifications
You must be signed in to change notification settings - Fork 335
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix overridden Jaeger span duration #195
Conversation
CLA Assistant Lite bot All contributors have signed the CLA ✍️ ✅ |
I have read the CLA Document and I hereby sign the CLA |
Added the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some nits; thanks for fixing this!
src/workerd/io/trace.c++
Outdated
@@ -253,9 +253,10 @@ Trace::Exception::Exception(rpc::Trace::Exception::Reader reader) | |||
name(kj::str(reader.getName())), | |||
message(kj::str(reader.getMessage())) {} | |||
|
|||
Tracer::Span::Span(kj::Own<Tracer> tracer, kj::Maybe<Jaeger::SpanData> spanData) | |||
Tracer::Span::Span(kj::Own<Tracer> tracer, kj::Maybe<Jaeger::SpanData> spanData, | |||
kj::TimePoint durationStartTime) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I think we typically use 4-space indentation for line continuations, even in constructors (which looks a bit funny since it aligns with the :
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, I think I copy-pasted the definition from the header and neglected to tidy up the spacing.
: tracer(kj::mv(tracer)), spanData(kj::mv(spanData)), | ||
durationStartTime(kj::systemPreciseMonotonicClock().now()) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why this doesn't need/have an isPredictableModeForTest()
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I wondered about that, too... and probably wrote the original code. :I It's because the measurement is only used for calculating the duration, and the duration calculation itself is covered by isPredictableModeForTest()
.
I guess it wouldn't hurt much to just use isPredictableModeForTest()
everywhere, or define a common function that does that. We already have several places that are defining something like getTimeForJaegerSpan()
locally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, a comment would probably suffice.
Edit: Probably doesn't matter.
@@ -494,7 +494,8 @@ Tracer::Span Tracer::makeSpan(kj::StringPtr operationName, | |||
} | |||
kj::TimePoint durationStartTime = kj::origin<kj::TimePoint>(); | |||
if (parentSpanContext != nullptr || overrideParent != nullptr) { | |||
durationStartTime = kj::systemPreciseMonotonicClock().now(); | |||
auto delta = kj::systemPreciseCalendarClock().now() - overrideStartTime; | |||
durationStartTime = kj::systemPreciseMonotonicClock().now() - delta; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it'd be good to have an explanatory comment and/or a TODO to require both start time overrides in the future.
7f95471
to
51ede46
Compare
51ede46
to
113d1bf
Compare
Fixed indentation, added TODO for duration override, added internal PR, and rebased. |
I'm not sure what the overhead of clock reads is, but presumably it would be good to avoid the overhead when not tracing, which should be the common case.
… time Fixes incorrect durations seen on overridden spans.
113d1bf
to
750864b
Compare
When starting to record a Jaeger span, we do two different clock reads -- we read the calendar clock for the timestamp, and we read from the monotonic clock for the start of the span duration. We also allow callers to override the timestamp, to record a span that started earlier. But this override was only affecting the timestamp, not the start of the span duration, so durations for overridden spans were incorrect.
Ideally, we'd want callers to provide both the monotonic and calendar clock measurements for their span overrides, but as a stopgap we can compute the elapsed calendar time at the time of the override and subtract it from the current monotonic clock time, to estimate the intended duration start time.
(The official Jaeger C++ library has a more complex system, allowing users to specify both clock overrides, or just one or the other -- in which case the missing value is estimated using a similar calculation:
https://github.com/jaegertracing/jaeger-client-cpp/blob/e56b1bdb/src/jaegertracing/Tracer.cpp#L41
https://github.com/opentracing/opentracing-cpp/blob/06b57f48/include/opentracing/util.h#L40-L63
It appears the rust opentelemetry library doesn't have this problem, since it just uses monotonic clocks for everything... which presumably has a different set of problems.)