tracing: support span linking for spans shared by multiple traces #70864

Open
erikgrinaker opened this issue Sep 29, 2021 · 3 comments

Labels
A-kv-observability · C-enhancement · T-kv (KV Team)

Comments

@erikgrinaker
Contributor

erikgrinaker commented Sep 29, 2021

During query processing, we frequently wait for asynchronous operations that are shared by multiple callers. These operations are currently absent from traces. For example:

  • Raft processing: a single round of Raft can include log entries for many commands coming from many queries and clients.

  • Asynchronous transaction writes: most transaction writes are pipelined, i.e. sent off asynchronously.

  • Range descriptor lookups: if a descriptor lookup is already in flight, other callers will coalesce their lookups, i.e. wait for the result of the in-flight request.

In these cases (and probably many others), query tracing goes dark once it hits these operations, which is unfortunate -- especially in the case of Raft, where it would be very useful to follow e.g. the replication and fsync latencies of each query.

What we probably need to do here is start a new span for these operations and return it to all callers waiting on their results, so that they can link the shared span into their own traces. Note that callers may well arrive after the operation has started, as in the case of range descriptor lookup coalescing (i.e. a query can find an in-flight lookup that started 100ms ago and wait for that to complete).

This may also require support in the trace data formats. @andreimatei mentioned in #70741 (review) that there is some support for this in OpenTelemetry, but that it's still somewhat in flux.
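
A rough sketch, in Go, of what the caller side could look like with OpenTelemetry-style span links. This uses the public go.opentelemetry.io/otel API rather than CockroachDB's internal tracing package; the function, the sharedOpSpanCtx parameter, and the operation names are hypothetical illustrations, not existing code:

package sharedops

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/trace"
)

// waitForSharedOp sketches a caller joining an already in-flight shared
// operation (e.g. a coalesced range descriptor lookup). The caller starts a
// span in its own trace and attaches a link to the shared operation's span,
// rather than parenting under it, since that operation is shared by many
// traces.
func waitForSharedOp(ctx context.Context, sharedOpSpanCtx trace.SpanContext, done <-chan struct{}) {
	tracer := otel.Tracer("kv.sharedops") // tracer name is illustrative

	_, span := tracer.Start(ctx, "wait-for-shared-operation",
		trace.WithLinks(trace.Link{SpanContext: sharedOpSpanCtx}))
	defer span.End()

	// Block until the shared operation completes; the link lets a trace
	// viewer jump from this wait span to the shared operation's span.
	<-done
}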

Jira issue: CRDB-10256

erikgrinaker added the C-bug, A-kv-observability, and T-kv labels on Sep 29, 2021
@erikgrinaker
Contributor Author

@ajwerner notes that there is some support for this in Raft already, but it requires an increased log level as it has some overhead:

// traceMessageSends records the provided event for all proposals contained
// in entries contained in msgs. The vmodule level for raft must be at
// least 1.
func (r *Replica) traceMessageSends(msgs []raftpb.Message, event string) {
	if log.V(1) || r.store.TestingKnobs().TraceAllRaftEvents {
		var ids []kvserverbase.CmdIDKey
		for _, m := range msgs {
			ids = extractIDs(ids, m.Entries)
		}
		traceProposals(r, ids, event)
	}
}

erikgrinaker added the C-enhancement label and removed the C-bug label on Oct 5, 2021
@erikgrinaker
Contributor Author

@andreimatei says regarding the above Raft tracing:

re raft application tracing needing a high-vmodule setting - this might be as easy to fix as a change from if log.V(2) to if log.ExpensiveLogsEnabled(2); perhaps check if that's the case

Tried this; it didn't seem to work: no Raft details showed up in the session traces.
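
To make the suggestion concrete, here is a rough sketch of the kind of change being discussed, applied to the traceMessageSends function quoted above. The log.ExpensiveLogEnabled name/signature and the extra ctx plumbing are assumptions about pkg/util/log, not a tested patch:

// Hypothetical variant: gate on log.ExpensiveLogEnabled instead of log.V, so
// the tracing work also runs when the context carries a verbose/recording
// span, without raising the vmodule level globally. (Assumes a ctx is
// plumbed in and that ExpensiveLogEnabled(ctx, level) is the right helper.)
func (r *Replica) traceMessageSends(ctx context.Context, msgs []raftpb.Message, event string) {
	if log.ExpensiveLogEnabled(ctx, 1) || r.store.TestingKnobs().TraceAllRaftEvents {
		var ids []kvserverbase.CmdIDKey
		for _, m := range msgs {
			ids = extractIDs(ids, m.Entries)
		}
		traceProposals(r, ids, event)
	}
}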

andreimatei added a commit to andreimatei/cockroach that referenced this issue Nov 15, 2021
Before this patch, the decoding and application of a Raft command was
not included in the recording of the request that generated the
respective command, even in the case where consensus is synchronous with
respect to the request (i.e. non-AsyncConsensus requests). This was
because, although we plumb tracing information below Raft, the span
under which Raft processing was occurring was Fork()ed from the parent
span (i.e. the request evaluation's span). The reasons for this were not
good:

1) forking (as opposed to creating a regular child) was introduced in
   cockroachdb#39425. I'm not sure whether there was a particular reason for this
   decision. Perhaps there was fear at the time about the child
   outliving the parent - although generally it doesn't because,
   in the case of async consensus, we fork a parent which I think will
   outlive the child:
   https://github.com/cockroachdb/cockroach/blame/13669f9c9bd92a4c3b0378a558d7735f122c4e72/pkg/kv/kvserver/replica_raft.go#L157

   In case of sync consensus requests, it's possible for the Raft application
   span to outlive the evaluation span in cases when the application part
   (as opposed to the consensus part) can be done async (see
   CanAckBeforeApplication). Still, regardless of the exact details of the
   lifetimes, with time it has become clear that it's appropriate to create
   a child when it's expected that it will usually not outlive the parent
   even if, on occasion, it can outlive it.

2) forked spans used to be included in the parent's recording until cockroachdb#59815.
   This explains why we regressed here - Raft application used to be
   included in recordings properly.

This patch makes the application span a regular child, so the Raft
application span is included in the request trace.

Touches cockroachdb#70864. That issue asks for a new tracing feature but I think
what motivated it is the application info missing from query traces.
This patch is sufficient for that.

Release note (general change): Tracing transaction commits now includes
details about replication.
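
To illustrate the shape of the change (not the actual diff), here is a sketch contrasting the two span-creation patterns using the pkg/util/tracing helpers as I understand them; treat the exact helper names and signatures as assumptions:

package raftspans

import (
	"context"

	"github.com/cockroachdb/cockroach/pkg/util/tracing"
)

// forkedApplicationSpan is (roughly) the old pattern: the below-Raft span was
// forked from the evaluation span, and since cockroachdb#59815 a forked span's
// recording is no longer collected into the parent's recording, which is why
// Raft application disappeared from request traces.
func forkedApplicationSpan(ctx context.Context) (context.Context, *tracing.Span) {
	return tracing.ForkSpan(ctx, "raft application")
}

// childApplicationSpan is the new pattern: a regular child span, whose
// recording is included in the request's trace, on the expectation that it
// usually finishes before its parent.
func childApplicationSpan(ctx context.Context) (context.Context, *tracing.Span) {
	return tracing.ChildSpan(ctx, "raft application")
}
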
andreimatei added a commit to andreimatei/cockroach that referenced this issue Nov 18, 2021
(Same commit message as the Nov 15 commit above.)
craig bot pushed a commit that referenced this issue Nov 18, 2021
72642: democluster: improve logging/tracing r=otan a=knz

All commits but the last 3 from #72644
(Reviewers: only review last 3 commits)

Informs #58938

72727: tracing: misc cleanups r=andreimatei a=andreimatei

See individual commits.

72738: kvserver: include Raft application on leaseholder in request trace  r=andreimatei a=andreimatei

(This PR carries the same commit message as the commit quoted above.)

72896: server: don't mix Tracers in test tenant r=andreimatei a=andreimatei

Before this patch, TestServer.StartTenant() would share the Stopper
between the server and the tenant, and it would also proceed to give the
Tenant a new Tracer. The stopper in question has the server's Tracer in
it, so this sharing is problematic: the tenant ends up with two different
Tracers and uses either of them fairly arbitrarily at different points.
Intermingling two Tracers like this was never intended, and attempting to
create a child span with a different tracer than the parent will become
illegal shortly. This shows that sharing stoppers between different
"servers" is a bad idea. This patch stops the practice.

The patch also removes the tracer config option for test tenants because it
was unused, and supporting it in combination with the stopper configuration
would require more sanity checks.

Release note: None

Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: Andrei Matei <[email protected]>
@github-actions

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it in
10 days to keep the issue queue tidy. Thank you for your contribution
to CockroachDB!
