Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Histogram Aggregation - core SDK implementation #2581

Closed
wants to merge 18 commits into from
Closed

Histogram Aggregation - core SDK implementation #2581

wants to merge 18 commits into from

Conversation

beanliu
Copy link
Contributor

@beanliu beanliu commented Jan 25, 2021

the original PR #2575 was divided up to 3 PR following the advice from the comments.
they are not fully independent of each other and share some common commits, here's a simple description of their relationships in commits:

------>(add histogram to MetricData)------->(implement histogram aggregator, #2581 )
                                    ------->(OTLP exporter, #2582)
                                    ------->(Prometheus exporter, #2583 )

links of follow up PRs: #2582 #2583


Support fixed bucket histogram aggregation and exporting them with OTLP/Prometheus exporter.

Since the View API is not part of the public Metric API yet (optentelemetry-specification#466), there's no universal way to configure custom aggregators for different instruments. However, the SdkMeterProvider provides a registerView method which allows developers to specify aggregations for certain instruments. This suggests that we could do the following to achieve histogram aggregations:

private static DoubleValueRecorder tryNewHistogramValueRecorder(
    MeterProvider meterProvider, Meter meter, String name, double[] boundaries) {
  if (meterProvider.getClass() == SdkMeterProvider.class) {
    SdkMeterProvider sdkMeterProvider = (SdkMeterProvider) meterProvider;

    sdkMeterProvider.registerView(
        InstrumentSelector.builder()
            .setInstrumentNameRegex(name)
            .setInstrumentType(InstrumentType.VALUE_RECORDER)
            .build(),
        AggregatorFactory.histogram(
            ImmutableDoubleArray.copyOf(boundaries), /* stateful= */ false));
  }

  return meter.doubleValueRecorderBuilder(name).build();
}

// in the application/instrumentation code
DoubleValueRecorder mRequestLatency = tryNewHistogramValueRecorder(meterProvider, meter,
        "example_request_latency", new double[] {50, 200, 300});
DoubleValueRecorder mResponseSize = tryNewHistogramValueRecorder(meterProvider, meter,
    "example_response_size", new double[] {1_000, 5_000, 20_000});

And we could upgrade the example to use View API once they are ready.

@codecov
Copy link

codecov bot commented Jan 25, 2021

Codecov Report

Merging #2581 (7b8cebd) into main (9e1c76b) will increase coverage by 87.12%.
The diff coverage is 91.73%.

Impacted file tree graph

@@             Coverage Diff             @@
##             main    #2581       +/-   ##
===========================================
+ Coverage        0   87.12%   +87.12%     
- Complexity      0     2666     +2666     
===========================================
  Files           0      323      +323     
  Lines           0     8908     +8908     
  Branches        0      904      +904     
===========================================
+ Hits            0     7761     +7761     
- Misses          0      856      +856     
- Partials        0      291      +291     
Impacted Files Coverage Δ Complexity Δ
...entelemetry/exporter/prometheus/MetricAdapter.java 89.61% <0.00%> (ø) 25.00 <0.00> (?)
...telemetry/sdk/extension/otproto/MetricAdapter.java 95.55% <ø> (ø) 28.00 <0.00> (?)
...metrics/aggregator/HistogramAggregatorFactory.java 80.00% <80.00%> (ø) 6.00 <6.00> (?)
.../metrics/aggregator/DoubleHistogramAggregator.java 93.58% <93.58%> (ø) 13.00 <13.00> (?)
...etry/sdk/metrics/aggregator/AggregatorFactory.java 100.00% <100.00%> (ø) 5.00 <1.00> (?)
.../sdk/metrics/aggregator/HistogramAccumulation.java 100.00% <100.00%> (ø) 3.00 <3.00> (?)
...emetry/sdk/metrics/aggregator/MetricDataUtils.java 100.00% <100.00%> (ø) 8.00 <2.00> (?)
...elemetry/sdk/metrics/data/DoubleHistogramData.java 100.00% <100.00%> (ø) 2.00 <2.00> (?)
...try/sdk/metrics/data/DoubleHistogramPointData.java 100.00% <100.00%> (ø) 4.00 <4.00> (?)
.../io/opentelemetry/sdk/metrics/data/MetricData.java 94.59% <100.00%> (ø) 20.00 <3.00> (?)
... and 324 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9e1c76b...34ef27b. Read the comment docs.

Base automatically changed from master to main January 26, 2021 00:26
Copy link
Member

@bogdandrutu bogdandrutu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused about the series of PRs sent. Can we start with a minimal PR that just adds the data type, then one PR that adds the aggregator, then PRs to plug it to the exporters

@jkwatson
Copy link
Contributor

I am confused about the series of PRs sent. Can we start with a minimal PR that just adds the data type, then one PR that adds the aggregator, then PRs to plug it to the exporters

heh. this was my request to break up the original into 3 PRs. this is the first one in the series at the moment.

@beanliu
Copy link
Contributor Author

beanliu commented Jan 27, 2021

I am confused about the series of PRs sent. Can we start with a minimal PR that just adds the data type, then one PR that adds the aggregator, then PRs to plug it to the exporters

sorry for the confusion. the original PR #2575 was divided up to 3 PR following the advice from the comments.
they are not fully independent of each other and share some common commits, here's a simple description of their relationships in commits:

------>(add histogram to MetricData)------->(implement histogram aggregator, #2581 )
                                    ------->(OTLP exporter, #2582)
                                    ------->(Prometheus exporter, #2583 )

will update the description to make it more clear. if you think the (add histogram to MetricData) part is worth another separate PR to have a better review process please let me know.

@@ -29,6 +29,8 @@
/* isMonotonic= */ false, AggregationTemporality.CUMULATIVE, Collections.emptyList());
private static final DoubleSummaryData DEFAULT_DOUBLE_SUMMARY_DATA =
DoubleSummaryData.create(Collections.emptyList());
private static final DoubleHistogramData DEFAULT_DOUBLE_HISTOGRAM_DATA =
DoubleHistogramData.create(AggregationTemporality.CUMULATIVE, Collections.emptyList());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we really hard-code CUMULATIVE here? Shouldn't it depend on how the aggregation is configured?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's a default & fallback value when we try to get the histogram data from an instance of MetricData whose type is not HISTOGRAM. I would assume something is wrong if this happens.

@jkwatson
Copy link
Contributor

This seems pretty close to me, but I'd like our local metrics expert, @bogdandrutu to review before we merge.

Also, it's worth noting that the OTel metrics data model is about to get a re-think, which could lead to needing to update things in here, depending on the outcome of those discussions.

@beanliu
Copy link
Contributor Author

beanliu commented Jan 29, 2021

This seems pretty close to me, but I'd like our local metrics expert, @bogdandrutu to review before we merge.
Also, it's worth noting that the OTel metrics data model is about to get a re-think, which could lead to needing to update things in here, depending on the outcome of those discussions.

Thanks for the review and all the comments. as of the data model re-thinking, is there any reference that I could follow up?

@jkwatson
Copy link
Contributor

Thanks for the review and all the comments. as of the data model re-thinking, is there any reference that I could follow up?

I think there are meetings being put on the public calendar where this is being discussed. I'd start by looking at recent metrics SIG meeting notes.

@bogdandrutu
Copy link
Member

@beanliu sorry for taking too long to look into this. We just finished the tracing review, and we can now look into metrics again :)

@bogdandrutu
Copy link
Member

@beanliu @jkwatson I do understand that this PR was initially split, but I still see 2 different PRs in this one:

  • One that adds the "data" (changes in the data package)
  • One that adds the "Aggregator" (changes in the aggregator package)

@beanliu
Copy link
Contributor Author

beanliu commented Feb 23, 2021

but I still see 2 different PRs in this one: ...

@bogdandrutu thanks for the comments. not sure if I understand correctly, given the review is in progress, are u suggesting splitting this PR again?

@bogdandrutu
Copy link
Member

Unfortunately I have some time sensitive tasks at work for the next 2 days. I will come back to you asap.

Regarding splitting. I know I review but was a very high level review to be honest because there are a lot of changes in this PR. But I think if you were to split probably would be way faster to merge this.

@beanliu
Copy link
Contributor Author

beanliu commented Feb 24, 2021

this PR is split into #2923 (changes in the data package) and #2924 (changes in the aggregator package). will create follow-up PR to enable histogram exporters once the prerequisite PRs got merged. closing this off.

@beanliu beanliu closed this Feb 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants