Update collector code to use [Trace|Metrics]Data. #431

bogdandrutu · 2019-02-22T02:46:27Z

No description provided.

sjkaris · 2019-02-22T02:50:27Z

I think we should change data.TraceData to contain an ExportTraceServiceRequest instead of the objects it contains, so that we don't have to change all these files in the event that something is added to the proto request object.

pjanotti · 2019-02-22T15:49:34Z

cmd/occollector/app/sender/jaeger_thrift_http_sender.go

@@ -80,15 +80,11 @@ func NewJaegerThriftHTTPSender(
 }

 // ProcessSpans sends the received data to the configured Jaeger Thrift end-point.
-func (s *JaegerThriftHTTPSender) ProcessSpans(batch *agenttracepb.ExportTraceServiceRequest, spanFormat string) (uint64, error) {
+func (s *JaegerThriftHTTPSender) ProcessSpans(td data.TraceData, spanFormat string) (uint64, error) {


q.: why passing td by value? in the bigger scheme of things this shouldn't make much difference either way but I am to know what it the reasoning behind the choice.

Multiple reasons to use value:

Size is small because we embed pointers.

Avoid heap allocation of this object

Passing by value means that the next processor cannot modify it, for example if you have a multi processor than fans this out you don't want the first processor to modify the object.

I don't know if I really see the benefit of point 3. Since we are carrying pointers to the underlying objects, if those objects are changed, those changes will be reflected in all other processors, so the first processor could effectively modify the object. td's pointers could not be changed to point at new objects i guess, but i think its very unlikely someone tries to do that instead of just modifying the objects it points to.

That said, I do not have a problem with passing td by value since its a very small object as you said, and it does avoid the heap allocation of a very pointer-filled object.

Yeah, I'm not worrying about #1 the size of the structure versus the pointer on the call (it is a small structure), this only becomes a concern on rare cases.

For #2 when we receive the large chunk of the data is on the heap but more to the point passing the pointer down the pipe shouldn't cause extra allocation, anyway, probably the difference here is marginal either way.

For #3 some processors can change the data, e.g.: adding global-tags, since the data contain maps that's not really avoidable without deep copies. I think that having it as a pointer make clear that changes will be seen by others. I see the problem with parallel stages on the pipe modifying the data but I don't think that this protects us against that.

We can always change back to pointer if we determine any problem, but this is also compatible with the https://github.com/census-instrumentation/opencensus-service/blob/master/processor/processor.go so I want to make your processor be the same as that one and simply remove your own processor.

Agreed, we can change it later if we decide to do so.

Avoid heap allocation of this object

The heap vs stack allocation argument in other languages doesn't work for Go. A value could be allocated on the heap regardless of if it is a pointer or not, and it is only dependent upon escape analysis that can be determined at compile time. For example, given a non-pointer value used in two different contexts

package main import "fmt" func foo() { x := "abcdef" fmt.Println(x) } func bar() { x := "abcdef" println(x) }

when compiled with

go tool compile -m main.go

produces

main.go:5:6: can inline foo main.go:7:13: inlining call to fmt.Println main.go:10:6: can inline bar main.go:7:13: x escapes to heap main.go:7:13: io.Writer(os.Stdout) escapes to heap main.go:7:13: foo []interface {} literal does not escape <autogenerated>:1: os.(*File).close .this does not escape

The only way to safely ensure it doesn't escape is to use the //go:noescape pragma directive but that's quite unnecessary and will riddle this code with too much magic.

Passing by value means that the next processor cannot modify it

Passing by value is going to always create a copy of the variable, so let's just be ware of that for the future about how much memory is going to be copied around. I think let's ensure that no processor starts modifying its arguments when passing by pointers. I'd be shocked if any of the Go code in this repository (or OpenCensus repositories) modifies arguments.

I personally wasn't enthusiastic about the change to use MetricData and TraceData as non-pointers but I let it slide as I found the change was already merged but that's alright as it was minimal.

Overall though, sure I think this change should help unify things and should go in. We can switch to using pointers later on if need but I just wanted to clarify that we should be careful about the assumptions on benefits of pointer vs non-pointer as those don't apply necessarily in Go. Anyways since no one seems to be importing these packages, we can switch back as warranted.

Thanks @odeke-em for the explanation. I agree that we can change this since we don't make any guarantee on the interfaces stability for the moment.

bogdandrutu · 2019-02-23T00:27:03Z

@sjkaris I disagree with the statement. If you add a new field in the proto you still need to change all the files to use the new field if necessary, if that field is not important for all the processors you change only the once that need to use the new field. The extra change that we need right now without embedding the main proto is that we have to add the new field in the TraceData.

sjkaris · 2019-02-23T00:54:03Z

@bogdandrutu Hmm, if a field is added to the proto request, and then the proto-lib is updated, we will pass the new proto value along for free if we embed the request struct, with no code changes needed (for the oc to oc flow). If we don't do this, it is required that we update our data model to include this new field. On the flip, what is the downside of using the full object rather than just its fields?

bogdandrutu · 2019-02-23T01:08:53Z

@sjkaris who is populated that field? If the proto comes from the wire then we need to also update the OCTrace receiver, I agree.

The downside is that the request proto is design to be "network friendly" for example the Resource is not present if it is the same as previous message, or things like that, which I think we already take care in the receiver. Probably it makes sense in the OCTrace receiver to actually populate the Resource in each span then TraceData will be only the Spans.

bogdandrutu · 2019-02-23T02:05:40Z

Ready for final review.

codecov · 2019-02-23T03:20:33Z

Codecov Report

Merging #431 into master will decrease coverage by 0.19%.
The diff coverage is 65.06%.

@@           Coverage Diff            @@
##           master    #431     +/-   ##
========================================
- Coverage    63.6%   63.4%   -0.2%     
========================================
  Files          41      42      +1     
  Lines        3363    3375     +12     
========================================
+ Hits         2139    2140      +1     
- Misses       1069    1081     +12     
+ Partials      155     154      -1

Impacted Files	Coverage Δ
internal/collector/processor/exporter_processor.go	`0% <0%> (ø)`	⬆️
processor/processortest/concurrent_sinks.go	`0% <0%> (ø)`
internal/collector/processor/processor_to_sink.go	`0% <0%> (ø)`	⬆️
internal/collector/processor/processor.go	`0% <0%> (ø)`	⬆️
internal/collector/processor/multi_processor.go	`81.63% <100%> (ø)`	⬆️
...anslator/trace/jaeger/protospan_to_jaegerthrift.go	`82.08% <100%> (+0.6%)`	⬆️
translator/trace/zipkin/zipkinv1_to_protospan.go	`89.16% <100%> (ø)`	⬆️
receiver/jaegerreceiver/trace_receiver.go	`80.74% <100%> (ø)`	⬆️
...lator/trace/zipkin/zipkinv1_thrift_to_protospan.go	`74.32% <100%> (ø)`	⬆️
...al/collector/processor/nodebatcher/node_batcher.go	`78.18% <100%> (+0.45%)`	⬆️
... and 7 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bad985a...75f7dbc. Read the comment docs.

…tion#431)

bogdandrutu requested review from sjkaris and pjanotti February 22, 2019 02:46

bogdandrutu requested a review from a team as a code owner February 22, 2019 02:46

googlebot added the cla: yes label Feb 22, 2019

bogdandrutu requested a review from songy23 February 22, 2019 02:46

flands assigned bogdandrutu Feb 22, 2019

flands added this to the 0.2.0 milestone Feb 22, 2019

pjanotti reviewed Feb 22, 2019

View reviewed changes

bogdandrutu force-pushed the updatecollector branch from 1c5e141 to d34eb7a Compare February 22, 2019 22:29

bogdandrutu force-pushed the updatecollector branch from d34eb7a to 8387c2a Compare February 23, 2019 01:23

Update collector code to use [Trace|Metrics]Data.

f76032b

bogdandrutu force-pushed the updatecollector branch from 8387c2a to f76032b Compare February 23, 2019 01:58

Merge branch 'master' into updatecollector

75f7dbc

pjanotti approved these changes Feb 23, 2019

View reviewed changes

bogdandrutu merged commit c54ee82 into census-instrumentation:master Feb 25, 2019

bogdandrutu deleted the updatecollector branch February 25, 2019 01:02

fivesheep pushed a commit to fivesheep/opencensus-service that referenced this pull request Jun 12, 2019

Update collector code to use [Trace|Metrics]Data. (census-instrumenta…

0f5f625

…tion#431)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update collector code to use [Trace|Metrics]Data. #431

Update collector code to use [Trace|Metrics]Data. #431

bogdandrutu commented Feb 22, 2019 •

edited

Loading

sjkaris commented Feb 22, 2019

pjanotti Feb 22, 2019

bogdandrutu Feb 23, 2019

sjkaris Feb 23, 2019

pjanotti Feb 23, 2019

bogdandrutu Feb 23, 2019

pjanotti Feb 23, 2019

odeke-em Feb 24, 2019

bogdandrutu Feb 25, 2019

bogdandrutu commented Feb 23, 2019

sjkaris commented Feb 23, 2019

bogdandrutu commented Feb 23, 2019

bogdandrutu commented Feb 23, 2019

codecov bot commented Feb 23, 2019

Update collector code to use [Trace|Metrics]Data. #431

Update collector code to use [Trace|Metrics]Data. #431

Conversation

bogdandrutu commented Feb 22, 2019 • edited Loading

sjkaris commented Feb 22, 2019

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

bogdandrutu commented Feb 23, 2019

sjkaris commented Feb 23, 2019

bogdandrutu commented Feb 23, 2019

bogdandrutu commented Feb 23, 2019

codecov bot commented Feb 23, 2019

Codecov Report

bogdandrutu commented Feb 22, 2019 •

edited

Loading