[processor/deltatocumulative] partial linear pipeline #35048
Conversation
It looks good to me at first glance, but it would be awesome if we had some kind of end-to-end test with generated testdata, similar to what we have with intervalprocessor.
Not sure if I'm missing something, but I don't see a test that creates a new deltatocumulative processor through the Factory, calls ConsumeMetrics, and checks the result. I see one for the original processor, but not for linear. I can also see the Chain object you created to call two processors together, but I'm not understanding where exactly it's used xD. In summary, I think you covered the e2e tests with linear but I'm struggling to understand how exactly.
It's there, in the recently merged opentelemetry-collector-contrib/processor/deltatocumulativeprocessor/processor_test.go, lines 39 to 44 in 52937cf.
But if I'm reading things correctly, we're only calling the first processor of the Chain.
@RichieSams can you take a look?
Yes. Apologies. I've been meaning to take a look at this for a while now, but kept getting waylaid. I'll review this afternoon.
func (stale Tracker) Collect(max time.Duration) []identity.Stream {
Minor: Can we rename this to CollectStale()?
In processor.go I named the field `stale staleness.Tracker`, so using it reads as `p.stale.Collect()`. Renaming this to `CollectStale()` gives `p.stale.CollectStale()`, which I very slightly like less because it stutters. No strong opinion.
linear := newLinear(pcfg, ltel, proc)

return Chain{linear, proc}, nil
Do we still need to chain them? What isn't yet implemented in Linear? IMO it would be much simpler (for metrics, file structure, etc) to just switch wholesale, rather than trying to keep both around.
The `deltatocumulative-linear` branch didn't have Chain, so I was confused at first when reviewing the PR.
Linear only does sums on this branch.
Making linear do everything involves some fairly advanced generics usage, which I think deserves to be reviewed properly and probably separately :)
Given the code already exists on the non-partial branch, I expect to send the next patch right after merging this one.
We should take care to merge so that we only release after merging both.
Overall, I quite like the code. I personally would vote to do the change wholesale (which I believe is what the
Force-pushed from 84a4210 to 79a3988:
Adds a staleness.Tracker type to `internal/exp/metrics`, which does the same as `staleness.Staleness` but in a less coupled way.
Adds metrics for tracking operations of the linear pipeline.
Introduces a highly decoupled, linear processing pipeline. Instead of overloading `Map.Store()` to do aggregation, staleness, and limiting, this functionality is now explicitly handled in `ConsumeMetrics`. This greatly aids readability and makes understanding this processor a lot easier, as less mental context needs to be kept.
Datapoints are first processed by the linear pipeline, and then forwarded to the traditional one for anything not yet implemented.
Force-pushed from 79a3988 to e51c2f3.
I like this refactor!
…#35048)

**Description:** Partially introduces a highly decoupled, linear processing pipeline. Implemented as a standalone struct to make review easier; will refactor this later. Instead of overloading `Map.Store()` to do aggregation, staleness, and limiting, this functionality is now explicitly handled in `ConsumeMetrics`. This greatly aids readability and makes understanding this processor a lot easier, as less mental context needs to be kept.

*Notes to reviewer*: See [`68dc901`](open-telemetry@68dc901) for the main added logic. Compare `processor.go` (old, nested) to `linear.go` (new, linear). Replaces open-telemetry#34757

**Link to tracking Issue:** none

**Testing:** This is a refactor. Existing tests were not modified and still pass.

**Documentation:** not needed
#### Description
As an oversight, #35048 creates two `metadata.TelemetryBuilder` instances. It also introduces an async metric, but one `TelemetryBuilder` sets no callback for it, leading to a panic on `Collect()`. Fixes that by using the same `TelemetryBuilder` for both, properly setting the callback.
#### Testing
A test was added in the first commit; it passes after adding the second commit.
#### Description
The max_streams default value was changed in #35048 but it was not updated in the readme.
#### Description
Finishes work started in #35048. That PR only partially introduced a less complex processor architecture, by only using it for Sums. Back then I was not sure of the best way to do it for multiple datatypes, as generics seemed to introduce a lot of complexity regardless of usage. Since then I did a lot of perf analysis and, due to the way Go works (see gcshapes), we do not really gain anything at runtime from using generics, given method calls are still dynamic. This implementation uses regular Go interfaces and a good old type switch in the hot path (ConsumeMetrics), which lowers mental complexity quite a lot imo. The value of the new architecture is backed up by the following benchmark:
```
goos: linux
goarch: arm64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/processor/deltatocumulativeprocessor

                 │ sums.nested  │ sums.linear │
                 │    sec/op    │    sec/op     vs base               │
Processor/sums-8    56.35µ ± 1%    39.99µ ± 1%  -29.04% (p=0.000 n=10)

                 │ sums.nested  │ sums.linear │
                 │     B/op     │     B/op      vs base               │
Processor/sums-8   11.520Ki ± 0%  3.683Ki ± 0%  -68.03% (p=0.000 n=10)

                 │ sums.nested  │ sums.linear │
                 │  allocs/op   │  allocs/op    vs base               │
Processor/sums-8     365.0 ± 0%    260.0 ± 0%  -28.77% (p=0.000 n=10)
```
#### Testing
This is a refactor; existing tests pass unaltered.
#### Documentation
Not needed.
#### Description
Removes the nested (aka overloading `streams.Map`) implementation. This has been entirely replaced by a leaner, "linear" implementation:
- #35048
- #36486
#### Testing
Existing tests continue to pass unaltered.
#### Documentation
Not needed.