
util/tracing: optimize grpc span creation for Batch and DistSQL #76199

Merged

Conversation

andreimatei
Contributor

@andreimatei andreimatei commented Feb 7, 2022

The WithRemoteParent(SpanMeta) span creation option allocates.
This patch adds a non-allocating flavor that takes in a
*tracingpb.TraceInfo, which is the structure passed in some RPC requests
(KV and DistSQL). The TraceInfo is already on the heap, so we can take
advantage of that. This saves one allocation in these RPCs.

```
name                        old time/op    new time/op    delta
SetupSpanForIncomingRPC-32     442ns ± 0%     308ns ± 1%  -30.24%  (p=0.008 n=5+5)
name                        old alloc/op   new alloc/op   delta
SetupSpanForIncomingRPC-32      145B ± 0%       49B ± 0%  -66.21%  (p=0.008 n=5+5)
name                        old allocs/op  new allocs/op  delta
SetupSpanForIncomingRPC-32      2.00 ± 0%      1.00 ± 0%  -50.00%  (p=0.008 n=5+5)
```

Release note: None
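
The shape of the optimization can be sketched with a toy version of the options API (the names below are illustrative stand-ins, not the actual CockroachDB code). The trick is to give the option's apply method a pointer receiver and convert the *TraceInfo directly into the option type: a pointer fits in the interface word, so wrapping a TraceInfo that already lives on the heap allocates nothing new.

```go
package main

import "fmt"

// Toy stand-ins for the real tracing types; names are illustrative.
type TraceInfo struct {
	TraceID uint64
	SpanID  uint64
}

type spanOptions struct {
	parentTraceID uint64
	parentSpanID  uint64
}

// SpanOption mirrors the functional-options pattern used by StartSpan.
type SpanOption interface {
	apply(spanOptions) spanOptions
}

// traceInfoOption makes *TraceInfo usable as a SpanOption. Converting a
// *TraceInfo to *traceInfoOption is a pointer conversion, so wrapping a
// TraceInfo that is already heap-allocated (e.g. inside a deserialized
// RPC request proto) performs no new allocation.
type traceInfoOption TraceInfo

func (o *traceInfoOption) apply(opts spanOptions) spanOptions {
	opts.parentTraceID = o.TraceID
	opts.parentSpanID = o.SpanID
	return opts
}

func WithRemoteParentFromTraceInfo(ti *TraceInfo) SpanOption {
	return (*traceInfoOption)(ti)
}

func startSpan(opts ...SpanOption) spanOptions {
	var so spanOptions
	for _, o := range opts {
		so = o.apply(so)
	}
	return so
}

func main() {
	req := &TraceInfo{TraceID: 1, SpanID: 42} // as if part of an RPC request
	so := startSpan(WithRemoteParentFromTraceInfo(req))
	fmt.Println(so.parentSpanID)
}
```

By contrast, an option that copies a SpanMeta by value must box the copy when it is stored in the SpanOption interface, which is the allocation this patch eliminates.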

@andreimatei andreimatei requested a review from a team as a code owner February 7, 2022 21:31
@cockroach-teamcity
Member

This change is Reviewable

@andreimatei
Contributor Author

First two commits are #76174. Included here because they add a benchmark that demonstrates the benefit.

@andreimatei andreimatei force-pushed the tracing.WithRemoteParent-no-alloc branch from 58c1b48 to 94d1492 Compare February 7, 2022 21:37
@andreimatei andreimatei force-pushed the tracing.WithRemoteParent-no-alloc branch from 94d1492 to 8710f9e Compare February 10, 2022 18:46
andreimatei added a commit to andreimatei/cockroach that referenced this pull request Feb 10, 2022
This patch defaults the COCKROACH_REAL_SPANS and
COCKROACH_REUSE_TRACING_SPANS env vars to true. This causes tracing
spans to be created even for operations that are not explicitly traced
(i.e. operations for which a "trace recording" has not been requested).
This was already the case in unit tests. Creating the tracing spans is
nice because, while a span is alive, it is present in the active spans
registry, which can be queried in various ways (e.g. through the
crdb_internal.node_inflight_trace_spans virtual table, and through the
/debug/tracez console page). I've got big plans for exposing this
registry.

I believe this change to have almost no measurable performance impact.
Some unittest benchmarks:
```
make bench PKG=./pkg/sql/tests BENCHES='BenchmarkKV///rows=1$$'

name                        old time/op  new time/op  delta
KV/Insert/Native/rows=1-32  92.2µs ±11%  88.0µs ± 6%    ~     (p=0.421 n=5+5)
KV/Insert/SQL/rows=1-32      322µs ± 6%   312µs ± 4%    ~     (p=0.151 n=5+5)
KV/Update/Native/rows=1-32   114µs ± 0%   118µs ± 3%  +3.64%  (p=0.016 n=4+5)
KV/Update/SQL/rows=1-32      398µs ± 2%   402µs ± 5%    ~     (p=0.690 n=5+5)
KV/Delete/Native/rows=1-32  94.3µs ± 2%  91.0µs ± 5%    ~     (p=0.056 n=5+5)
KV/Delete/SQL/rows=1-32      266µs ±10%   260µs ±17%    ~     (p=0.690 n=5+5)
KV/Scan/Native/rows=1-32    19.4µs ± 3%  21.0µs ± 5%  +8.01%  (p=0.008 n=5+5)
KV/Scan/SQL/rows=1-32        194µs ± 8%   201µs ± 5%    ~     (p=0.421 n=5+5)
```

And 30 runs of kv95 and kv0 roachtests with results processed through benchstat:
```
kv95:
name                    old ops/sec  new ops/sec  delta
kv95/enc=false/nodes=3   29.6k ± 4%   29.2k ± 8%  -1.33%  (p=0.018 n=30+30)

kv0:
name                   old ops/sec  new ops/sec  delta
kv0/enc=false/nodes=3   13.6k ± 4%   13.5k ± 3%    ~     (p=0.217 n=30+27)
```

These measurements were taken with cockroachdb#76199 - the last pending
performance optimization for the tracing library in a long chain of them.

It's possible that this change causes some roachtests to fail with span
use-after-finish assertions; roachtests are configured to crash on these
assertions, unlike production. But I don't expect that to happen (at
least, not much), since unit tests have been running like this for a
while. I've also kicked off a roachtest run; it's far from done, but so
far everything looks good.

Release note: The crdb_internal.node_inflight_trace_spans virtual table
will now present traces for all operations ongoing on the respective
node. Before, the table was reflecting a small percentage of ongoing
operations unless tracing was explicitly enabled.
Contributor

@aayushshah15 aayushshah15 left a comment


LGTM, but maybe @nvanbenschoten should also have a look?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @nvanbenschoten)


-- commits, line 9 at r5:
imported?


pkg/server/node.go, line 1137 at r6 (raw file):

```go
			}
			ctx, newSpan = tr.StartSpanCtx(ctx, tracing.BatchMethodName,
				tracing.WithRemoteParent(remoteParent),
```

For my understanding, where else are we not able to use the FromTraceInfo variant? It was a little hard for me to trace this out, so do you think we can have a comment over WithRemoteParent that outlines the scenario(s) it's now used in?

@andreimatei andreimatei force-pushed the tracing.WithRemoteParent-no-alloc branch from 8710f9e to 56f78fc Compare February 11, 2022 21:08
Contributor Author

@andreimatei andreimatei left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15 and @nvanbenschoten)


-- commits, line 9 at r5:

Previously, aayushshah15 (Aayush Shah) wrote…

imported?

indeed. done


pkg/server/node.go, line 1137 at r6 (raw file):

Previously, aayushshah15 (Aayush Shah) wrote…

For my understanding, where else are we not able to use the FromTraceInfo variant? It was a little hard for me to trace this out, so do you think we can have a comment over WithRemoteParent that outlines the scenario(s) it's now used in?

I've updated the comment on WithRemoteParent.

We can/should only use WithRemoteParentFromTraceInfo when we have a TraceInfo that already lives on the heap (i.e. one can always create a TraceInfo, but doing so only to pass it to WithRemoteParentFromTraceInfo defeats the purpose, since it'll escape). There are two situations where we have TraceInfos on the heap: in the Batch RPC and in the SetupFlow RPC; the TraceInfo is part of the RPC request proto, and the proto is naturally on the heap.

Besides these two cases, there are a couple of other cases where we want to use WithRemoteParent* and do not have a TraceInfo. In all these cases, we have a SpanMeta synthesized from gRPC metadata. All these places allocate.

  1. in the gRPC interceptor that handles tracing for most RPCs (but not for Batch and SetupFlow).
  2. in the RPC handlers for Batch (i.e. right here) and SetupFlow, in the backwards compatibility case.
  3. in the Raft application path, where we marshal the trace info in a very inefficient way. I'll probably do something about this. The respective code is only enabled for verbose tracing, at the moment.

Contributor

@aayushshah15 aayushshah15 left a comment


:lgtm_strong:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15, @andreimatei, and @nvanbenschoten)


pkg/server/node.go, line 1137 at r6 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

I've updated the comment on WithRemoteParent.

We can/should only use WithRemoteParentFromTraceInfo when we have a TraceInfo that already lives on the heap (i.e. one can always create a TraceInfo, but doing so only to pass it to WithRemoteParentFromTraceInfo defeats the purpose, since it'll escape). There are two situations where we have TraceInfos on the heap: in the Batch RPC and in the SetupFlow RPC; the TraceInfo is part of the RPC request proto, and the proto is naturally on the heap.

Besides these two cases, there are a couple of other cases where we want to use WithRemoteParent* and do not have a TraceInfo. In all these cases, we have a SpanMeta synthesized from gRPC metadata. All these places allocate.

  1. in the gRPC interceptor that handles tracing for most RPCs (but not for Batch and SetupFlow).
  2. in the RPC handlers for Batch (i.e. right here) and SetupFlow, in the backwards compatibility case.
  3. in the Raft application path, where we marshal the trace info in a very inefficient way. I'll probably do something about this. The respective code is only enabled for verbose tracing, at the moment.

👍 thanks

Member

@nvanbenschoten nvanbenschoten left a comment


Reviewed 10 of 10 files at r7, 10 of 10 files at r8, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15, @andreimatei, and @nvanbenschoten)


pkg/util/tracing/span_options.go, line 252 at r8 (raw file):

```go
// be added to the StartSpan invocation.
//
// If you're in possession of a TraceInfo instead of a SpanMeta, prefer using
```

If WithRemoteParentFromTraceInfo is now the preferred method of setting a RemoteParent, do you think it's worth renaming this instead to indicate that it should be avoided when possible? Is this one still more common?


pkg/util/tracing/span_options.go, line 277 at r8 (raw file):

```go
}

// WithRemoteParentFromTraceInfo is like WithRemoteParent, except the remote
```

nit: the construct usually precedes the apply method in this file. Consider reordering for consistency.


pkg/util/tracing/tracer.go, line 1125 at r7 (raw file):

```go
		// the moment, but let's not rely on that.
		if !opts.Parent.empty() && opts.Parent.i.crdb != nil {
			s.i.crdb.parentSpanID = opts.parentSpanID()
```

I don't understand this code, but will we enter this branch if !opts.RemoteParent.Empty()?

We weren't setting the parent id on spans with remote parents. This
patch fixes that, because it's the right thing to do and also because
there were comments implying that we did.
There were no major consequences of not setting it, because the
importing of a remote recording was overwriting the parent id of the
first imported span.

The next patch has a test for this.

Release note: None
@andreimatei andreimatei force-pushed the tracing.WithRemoteParent-no-alloc branch from 56f78fc to dce4cec Compare February 11, 2022 22:34
Contributor Author

@andreimatei andreimatei left a comment


Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15, @andreimatei, and @nvanbenschoten)


pkg/util/tracing/span_options.go, line 252 at r8 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

If WithRemoteParentFromTraceInfo is now the preferred method of setting a RemoteParent, do you think it's worth renaming this instead to indicate that it should be avoided when possible? Is this one still more common?

This one is still more common, counting by number of callers. There are more ways to get your hands on a SpanMeta.
I've renamed it to WithRemoteParentFromSpanMeta, to avoid the appearance of favoritism.
I will try to merge SpanMeta with TraceInfo in the future. I haven't made up my mind exactly; there are minor differences between the two.


pkg/util/tracing/span_options.go, line 277 at r8 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: the construct usually precedes the apply method in this file. Consider reordering for consistency.

done


pkg/util/tracing/tracer.go, line 1125 at r7 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

I don't understand this code, but will we enter this branch if !opts.RemoteParent.Empty()?

This line moves out of the if in the next commit. I've fixed this commit now.

Member

@nvanbenschoten nvanbenschoten left a comment


:lgtm:

Reviewed 1 of 10 files at r9, 14 of 14 files at r10, all commit messages.
Reviewable status: :shipit: complete! 2 of 0 LGTMs obtained (waiting on @aayushshah15, @andreimatei, and @nvanbenschoten)

andreimatei added a commit to andreimatei/cockroach that referenced this pull request Feb 11, 2022
@andreimatei
Contributor Author

bors r+

@craig
Contributor

craig bot commented Feb 13, 2022

Build succeeded:

@craig craig bot merged commit c2eec06 into cockroachdb:master Feb 13, 2022
@andreimatei andreimatei deleted the tracing.WithRemoteParent-no-alloc branch February 13, 2022 23:05
craig bot pushed a commit that referenced this pull request Feb 13, 2022
76403: util/tracing: enable the trace registry beyond tests r=andreimatei a=andreimatei


Co-authored-by: Andrei Matei <[email protected]>
RajivTS pushed a commit to RajivTS/cockroach that referenced this pull request Mar 6, 2022