Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql,tracing: introduce crdb_internal.cluster_inflight_traces #66679

Merged
merged 1 commit into from
Jun 28, 2021

Conversation

adityamaru
Copy link
Contributor

@adityamaru adityamaru commented Jun 21, 2021

This change adds a new indexed, virtual table
crdb_internal.cluster_inflight_traces. This table surfaces
cluster-wide inflight traces for the trace_id specified via
an index constraint.

Each row in the virtual table corresponds to a
tracing.Recording on a particular node for the given
trace ID. A tracing.Recording is the trace of a single
operation rooted at a root span on that node. Under the hood,
the virtual table contacts all "live" nodes in the cluster
via the trace collector which streams back a recording at a
time.

The table has 3 additional columns that surface the raw JSON,
string, and JaegarJSON representation of the recording. These
formats are what we dump in a stmt bundle as well, and have
been considered the best way to consume traces. This table
is not meant to be consumed directly via the SQL shell but
will have CLI wrapper built on top of it that will assimilate
and write the traces to files. Similar to how we dump a stmt
bundle.

This change also tweaks some of the recording->string methods
to include StructuredRecords.

Informs: #64992

Release note (sql change): adds a virtual table
crdb_internal.cluster_inflight_traces which surfaces
cluster-wide inflight traces for the trace_id specified via
an index constraint. The output of this table is not appropriate
to consume over a SQL connection; follow up changes will add
CLI wrappers to make the interaction more user-friendly.

@adityamaru adityamaru requested review from knz, abarganier and a team June 21, 2021 18:31
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@adityamaru
Copy link
Contributor Author

First commit is from #66678, will rebase once that is merged!

Copy link
Contributor

@abarganier abarganier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 16 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru and @knz)


pkg/sql/crdb_internal.go, line 1261 at r2 (raw file):

return false, nil

Should we be returning true if any rows were generated?


pkg/sql/crdb_internal_test.go, line 897 at r2 (raw file):

			[][]string{{strconv.Itoa(int(traceID)), `1`}, {strconv.Itoa(int(traceID)), `1`},
				{strconv.Itoa(int(traceID)), `2`}})

Is it worth the trouble to make assertions against the recording columns as well?


pkg/sql/explain_bundle.go, line 312 at r2 (raw file):

select JSON File

nit: "select the JSON file"


pkg/util/tracing/recording.go, line 220 at r2 (raw file):

Quoted 17 lines of code…
	// If the span was verbose then the Structured events would have been
	// stringified and included in the Logs above. If the span was not verbose
	// we should add the Structured events now.
	if !isVerbose(sp) {
		sp.Structured(func(sr *types.Any, t time.Time) {
			lr := opentracing.LogRecord{
				Timestamp: t,
			}
			str, err := MessageToJSONString(sr, true /* emitDefaults */)
			if err != nil {
				return
			}
			lr.Fields = append(lr.Fields, otlog.String("structured", str))
			lastLog := ownLogs[len(ownLogs)-1]
			ownLogs = append(ownLogs, conv(lr, lastLog.Timestamp))
		})
	}

I'm surprised we weren't including Structured events to some degree before in this non-verbose case, both here and in ToJaegerJSON(). Seems like verbosity doesn't impact the recording of structured events though, so it should be ok? Just checking.


pkg/util/tracing/utils.go, line 1 at r2 (raw file):

// Copyright 2021 The Cockroach Authors.

nit: rename file to something more explicit, e.g. json_conv.go

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No objection on the overall idea, but I'm not seeing any memory monitoring for the trace collector. What mechanism ensures that a node won't explode in RAM usage if there is a lot of trace data to collect?

Reviewed 14 of 16 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru)

@adityamaru
Copy link
Contributor Author

No objection on the overall idea, but I'm not seeing any memory monitoring for the trace collector. What mechanism ensures that a node won't explode in RAM usage if there is a lot of trace data to collect?

@knz the trace collector is an iterator that only buffers a single node's recordings at a time. It serves from these buffered recordings until there are no more, and only then contacts the next node for its recordings. If I understand the vtable infrastructure correctly, it is also lazy in the way it pushes rows. i.e it will only buffer the datums of a single row (in our case a single trace recording at a time), until the consumer consumes the row. So at most we should be buffering the O(inflight trace recordings of a single node) at a time. I agree that memory monitoring even within this buffering is advisable but just wanted to clarify the current behavior.

@knz
Copy link
Contributor

knz commented Jun 24, 2021

Maybe you could sprinkle the code with comments in various places to explain memory consumption.

@adityamaru adityamaru force-pushed the stmt-bundle-prototype branch from 1e7f7e0 to 2a9a5ee Compare June 24, 2021 15:59
@adityamaru adityamaru requested a review from a team as a code owner June 24, 2021 15:59
Copy link
Contributor Author

@adityamaru adityamaru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @abarganier and @knz)


pkg/sql/crdb_internal.go, line 1261 at r2 (raw file):

Previously, abarganier (Alex Barganier) wrote…
return false, nil

Should we be returning true if any rows were generated?

nice catch, done.


pkg/sql/crdb_internal_test.go, line 897 at r2 (raw file):

Previously, abarganier (Alex Barganier) wrote…
			[][]string{{strconv.Itoa(int(traceID)), `1`}, {strconv.Itoa(int(traceID)), `1`},
				{strconv.Itoa(int(traceID)), `2`}})

Is it worth the trouble to make assertions against the recording columns as well?

We do have tests at the trace collector level to check that we are fetching the correct spans, so I'm inclined to avoid making assertions to check the json conversion/string conversion. In other words we know with a high degree of confidence that nodeID, recording returned on each iteration is correct. Testing the util methods involves a lot of stripping and regexing because of the variable timestamps. Taking a page from explain_bundle_test.go we should check that the size of the column is not zero.


pkg/util/tracing/recording.go, line 220 at r2 (raw file):

Previously, abarganier (Alex Barganier) wrote…
	// If the span was verbose then the Structured events would have been
	// stringified and included in the Logs above. If the span was not verbose
	// we should add the Structured events now.
	if !isVerbose(sp) {
		sp.Structured(func(sr *types.Any, t time.Time) {
			lr := opentracing.LogRecord{
				Timestamp: t,
			}
			str, err := MessageToJSONString(sr, true /* emitDefaults */)
			if err != nil {
				return
			}
			lr.Fields = append(lr.Fields, otlog.String("structured", str))
			lastLog := ownLogs[len(ownLogs)-1]
			ownLogs = append(ownLogs, conv(lr, lastLog.Timestamp))
		})
	}

I'm surprised we weren't including Structured events to some degree before in this non-verbose case, both here and in ToJaegerJSON(). Seems like verbosity doesn't impact the recording of structured events though, so it should be ok? Just checking.

If the span is verbose, which is the only time a statement bundle would be useful, then all structured events are stringified and added to the log records. Now we have a use case where we might want to pull traces from a non-verbose span, that still records structured events, but does not record log events.


pkg/util/tracing/utils.go, line 1 at r2 (raw file):

Previously, abarganier (Alex Barganier) wrote…

nit: rename file to something more explicit, e.g. json_conv.go

we have several other places in the code where we have utils files that seem to become home to general utility methods that don't really have another place to live in the surrounding infrastructure. I don't feel strongly about it either way, but might be good to have a more free-form place for future tracing work? Happy to go either way!

@adityamaru
Copy link
Contributor Author

adityamaru commented Jun 24, 2021

Maybe you could sprinkle the code with comments in various places to explain memory consumption.

Yup, added a comment above the virtual table definition! I do think that this might need to be revisited in the future, if we start seeing a lot of unfinished "root spans" for a given trace_id on each node during bulk job execution. Off the top of my head though, this should not be an issue.

@adityamaru adityamaru requested review from abarganier and knz June 24, 2021 16:07
@knz
Copy link
Contributor

knz commented Jun 25, 2021

no further comments from me

Copy link
Contributor

@abarganier abarganier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @abarganier, @adityamaru, and @knz)


pkg/sql/crdb_internal.go, line 1261 at r2 (raw file):

Previously, adityamaru (Aditya Maru) wrote…

nice catch, done.

Since we're adding rows in a loop, what if addRow is successful on the 1st iteration, but fails on the 2nd? Is this a transactional operation? If not, then would we still want to return true, since the 1st iteration succeeded?


pkg/sql/crdb_internal_test.go, line 897 at r2 (raw file):

Previously, adityamaru (Aditya Maru) wrote…

We do have tests at the trace collector level to check that we are fetching the correct spans, so I'm inclined to avoid making assertions to check the json conversion/string conversion. In other words we know with a high degree of confidence that nodeID, recording returned on each iteration is correct. Testing the util methods involves a lot of stripping and regexing because of the variable timestamps. Taking a page from explain_bundle_test.go we should check that the size of the column is not zero.

NotEmpty checks SGTM! Thanks.


pkg/util/tracing/recording.go, line 220 at r2 (raw file):

Previously, adityamaru (Aditya Maru) wrote…

If the span is verbose, which is the only time a statement bundle would be useful, then all structured events are stringified and added to the log records. Now we have a use case where we might want to pull traces from a non-verbose span, that still records structured events, but does not record log events.

👍 thanks for clarifying


pkg/util/tracing/utils.go, line 1 at r2 (raw file):

Previously, adityamaru (Aditya Maru) wrote…

we have several other places in the code where we have utils files that seem to become home to general utility methods that don't really have another place to live in the surrounding infrastructure. I don't feel strongly about it either way, but might be good to have a more free-form place for future tracing work? Happy to go either way!

If there's precedent elsewhere then I don't feel too strongly about it, thanks for clarifying!


pkg/util/tracing/collector/collector.go, line 40 at r3 (raw file):

mon          *mon.BytesMonitor

unused?

@adityamaru adityamaru force-pushed the stmt-bundle-prototype branch 2 times, most recently from 0b0d59a to 93048fb Compare June 26, 2021 18:40
This change adds a new indexed, virtual table
`crdb_internal.cluster_inflight_traces`. This table surfaces
cluster wide inflight traces for the trace_id specified via
an index constraint.

Each row in the virtual table corresponds to a
`tracing.Recording` on a particular node for the given
trace ID. A `tracing.Recording` is the trace of a single
operation rooted at a root span on that node. Under the hood,
the virtual table contacts all "live" nodes in the cluster
via the trace collector which streams back a recording at a
time.

The table has 3 additional columns that surface the raw JSON,
string, and JaegarJSON represenation of the recording. These
formats are what we dump in a stmt bundle as well, and have
been considered the best way to consume traces. This table
is not meant to be consumed directly via the SQL shell but
will have CLI wrapper built on top of it that will assimilate
and write the traces to files. Simliar to how we dump a stmt
bundle.

This change also tweaks some of the recording->string methods
to include StructuredRecords.

Release note (sql change): adds a virtual table
`crdb_internal.cluster_inflight_traces` which surfaces
cluster wide inflight traces for the trace_id specified via
an index constraint. The output of this table is not appropriate
to consume over a SQL connection; follow up changes will add
CLI wrappers to make the interaction more user friendly.
@adityamaru adityamaru force-pushed the stmt-bundle-prototype branch from 93048fb to 3d5c000 Compare June 27, 2021 22:06
Copy link
Contributor Author

@adityamaru adityamaru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @abarganier, @adityamaru, and @knz)


pkg/sql/crdb_internal.go, line 1261 at r2 (raw file):

Previously, abarganier (Alex Barganier) wrote…

Since we're adding rows in a loop, what if addRow is successful on the 1st iteration, but fails on the 2nd? Is this a transactional operation? If not, then would we still want to return true, since the 1st iteration succeeded?

The vtable generator returns a row as and when a consumer consumes it. There isn't any transactional guarantee, so if the second addRow were to return an error, the consumer would see that error being bubbled up and would hopefully deal with it with an appropriate if err != nil.


pkg/sql/crdb_internal_test.go, line 897 at r2 (raw file):

Previously, abarganier (Alex Barganier) wrote…

NotEmpty checks SGTM! Thanks.

Done.


pkg/util/tracing/collector/collector.go, line 40 at r3 (raw file):

Previously, abarganier (Alex Barganier) wrote…
mon          *mon.BytesMonitor

unused?

yup, deleted.

@adityamaru adityamaru requested a review from abarganier June 28, 2021 04:50
Copy link
Contributor

@abarganier abarganier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: nice work!

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @abarganier, @adityamaru, and @knz)


pkg/sql/crdb_internal.go, line 1261 at r2 (raw file):

Previously, adityamaru (Aditya Maru) wrote…

The vtable generator returns a row as and when a consumer consumes it. There isn't any transactional guarantee, so if the second addRow were to return an error, the consumer would see that error being bubbled up and would hopefully deal with it with an appropriate if err != nil.

Ack

@adityamaru
Copy link
Contributor Author

TFTR!

bors r=abarganier

@craig
Copy link
Contributor

craig bot commented Jun 28, 2021

Build succeeded:

@craig craig bot merged commit 53dcc3a into cockroachdb:master Jun 28, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants