Decouple Number Type from Point Kind in OTLP protocol #264
This would require at least one data point to arrive in order for exporters to create metric descriptors in the backend. What are the use cases where the current data model is not working for you? |
I've proposed similar, #172, but at this point I'm happy with what we have. |
@victlu the problem with this approach is that systems that treat the data point kind as part of a metric's identity can no longer do so. For example, Stackdriver in GCP does this: it considers an int sum different from a double sum, and it needs to know which of the two the metric will have. |
@bogdandrutu, in your case, I would think it serves them better to name their metrics explicitly (as they want two time series) rather than distinguish them by data type. I think this question is related to #1366: how do we consider the "name" to be unique in identifying a time series? |
@rakyll Why can't metric descriptors be created (without data type)? Or are you saying a descriptor MUST include the data type? In which case, that's the argument I'm making here... IMHO, metric backend systems ultimately show/graph/alert on/store/etc... numeric Time-Series. And for performance and efficiency, they want to use the most efficient representation of the "number". Thus, promotion of data types is common, especially for storage considerations. The important thing is that it's just a number, regardless of whether it's transmitted as a byte, short, ushort, int, uint, long, ulong, float, double, etc... |
If the type is included in the metric descriptor/name, what is the benefit of doing what you suggested? Also, your suggestion may add an extra check: you need to verify that the type in the descriptor matches the type in the points. On top of that, it adds extra allocations on serialization/deserialization from proto bytes. |
@victlu Some backends like Stackdriver see the type as an integral part of the metric. See the value_type at https://cloud.google.com/monitoring/api/ref_v3/rpc/google.api#google.api.MetricDescriptor. This model gives us an easy, cost-effective way of identifying the data type at metric creation. These backends will reject a floating-point number sent to an integer metric, so your proposal would require validation of the data points in the exporter, as @bogdandrutu suggests. |
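To make the exporter-side validation mentioned above concrete, here is a minimal sketch. `ValueType`, `MetricDescriptor`, and `check_point` are hypothetical names invented for illustration; they are not taken from OTLP or from any backend's client library.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Union


class ValueType(Enum):
    """Hypothetical stand-in for the value type a backend fixes per metric."""
    INT64 = "INT64"
    DOUBLE = "DOUBLE"


@dataclass(frozen=True)
class MetricDescriptor:
    """Hypothetical descriptor: the value type is declared at creation time."""
    name: str
    value_type: ValueType


def check_point(descriptor: MetricDescriptor, value: Union[int, float]) -> None:
    """Raise TypeError if the value's number type does not match the descriptor."""
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(value, bool):
        raise TypeError(f"{descriptor.name}: bool is not a metric value")
    if isinstance(value, int):
        actual = ValueType.INT64
    elif isinstance(value, float):
        actual = ValueType.DOUBLE
    else:
        raise TypeError(f"{descriptor.name}: unsupported value {value!r}")
    if actual is not descriptor.value_type:
        raise TypeError(
            f"{descriptor.name}: expected {descriptor.value_type.value}, "
            f"got {actual.value}"
        )
```

With the number type carried per data point, a check like this would have to run for every point before export, which is the extra cost being discussed.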
I am trying to understand, but I have a few questions... From the last few responses, it seems we are specializing the OTLP protocol based on a specific vendor's backend implementation? I assume we would do this for performance? We should call out which backend system we are optimizing for in this case. What happens if a vendor does not need to separate ints/doubles? Would it then need to merge ints and doubles together? That does not seem performant to me either. I would think the Spec is more "higher-level", conveying the meaning and semantics of metrics. The data type of data points seems more like a detail. Looking at a few existing client APIs (e.g. Prometheus, Micrometer), I don't see Counters/Gauge/Histogram/etc... decorated with a data type (i.e. IntCounter/DoubleCounter/etc...). Or they just take double to cover both int/double. What should we do here? How will this work if I'm using a type-less language (e.g. JavaScript)? I assume I will need to separate out ints/doubles/etc... into different metrics? I'm also not suggesting every data point needs to be type-less. I just don't see the data type being at the topmost level with semantics like Sum/Gauge/Histogram/etc... Maybe we have oneof IntTimeSeries/DoubleTimeSeries under the oneof Sum/Gauge/Histogram/etc. |
I propose we change to following...
|
@victlu I kind of like your proposal, but I would like to understand how you see this as fixing an issue. I think this is a nice cosmetic improvement, but I am not clear whether it solves a real issue or is just a cosmetic improvement :) Update: |
@jmacd what do you think about this improvement in readability #264 (comment) ? |
The change I'm trying to effect is to decouple the semantics of an instrument (i.e. Counter vs Gauge vs Histogram) from the details of the data type (bytes/int/long/double/float/etc). I think it may have these beneficial consequences.
|
Separately, I think we should group by Labels first before we have DataPoints to minimize on buffer size. I'll file a separate issue for this. |
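As a rough illustration of the "group by Labels first" idea, the sketch below collects flat points into one entry per label set, so each label set is stored once rather than repeated on every point. The tuple layout and the name `group_by_labels` are assumptions made for this example, not part of OTLP.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical shapes: a label set is a sorted tuple of (key, value) pairs,
# and a flat point is (labels, timestamp, value).
Labels = Tuple[Tuple[str, str], ...]
FlatPoint = Tuple[Labels, int, float]


def group_by_labels(points: List[FlatPoint]) -> Dict[Labels, List[Tuple[int, float]]]:
    """Group points so each distinct label set appears exactly once."""
    grouped: Dict[Labels, List[Tuple[int, float]]] = defaultdict(list)
    for labels, timestamp, value in points:
        grouped[labels].append((timestamp, value))
    return dict(grouped)
```

Whether this saves space depends on how many points share a label set within one export.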
I like it, but it raises the question you asked.
That's a tough one. We discussed in yesterday's data model meeting that SDKs should not generate conflicting types (loosely defined), and we discussed that we would like the collector to pass-through conflicting types (again loosely defined) as if they were separate metrics. This appearance of both double and integer repeated fields makes it possible to have a single Metric that looks "not separate", in the sense that it contains multiple number types. I think if we allow what @victlu proposes, we should firmly state that OTel welcomes you to mix integer and double metrics--that this is not a semantic conflict--just that we would like you not to do so in the same SDK. Naturally we could use a Even when we allow mixing repeated integer- with repeated double-values, it will be difficult to work with this mixture of values without combining and interleaving them (sort by time) into an intermediate representation, which probably brings up the problem discussed next.
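For illustration only, the "combine and interleave (sort by time)" step could look like the sketch below. It assumes each point is a `(timestamp, value)` tuple and that each input list is already sorted by timestamp; `interleave_by_time` is a hypothetical name.

```python
import heapq
from typing import List, Tuple, Union

# Hypothetical point shapes: (timestamp, value).
IntPoint = Tuple[int, int]
DoublePoint = Tuple[int, float]


def interleave_by_time(
    int_points: List[IntPoint],
    double_points: List[DoublePoint],
) -> List[Tuple[int, Union[int, float]]]:
    """Merge two time-sorted point lists into one time-sorted list.

    heapq.merge is stable, so on equal timestamps the integer point
    comes first because int_points is passed first.
    """
    return list(heapq.merge(int_points, double_points, key=lambda p: p[0]))
```

The merged list is the kind of intermediate representation a consumer would need before it could treat the mixed values as a single series.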
Yes, let's keep that separate. I recall a historical conversation, but would prefer not to dig up old threads. @bogdandrutu do you have thoughts? I believe we were avoiding additional message layers where possible. The best argument in favor of the current design that I remember is that an SDK should never output more than one point per metric and label set per collection period. So, under a simple collection model, the extra layer for the label set will always have one entry in the encoding @victlu proposes. I haven't thought through the implications of this in the collector. I think the greater win would be to move labels and label sets into separate regions in the Export request, with a more compact encoding. |
@victlu When you first mentioned this I was skeptical, but I do like the simplicity proposed here. My only concern is the amount of churn this could cause right now. Ideally this should be done via a "deprecation" and "transition" period, as we DO have implementations that folks are trying to use with metrics. Also to @jmacd and @bogdandrutu: IIUC, "oneof" is slow in the Go implementation, right? That's not a fundamental limitation in protocol buffers AFAIK; it's basically just the same as "optional" but with some runtime checks. |
Like @jsuereth I'm worried about churn. I'd like to connect the question posed here, about reducing the number of types inside the On the one hand, we can have just the four semantically distinct types (Counter, Gauge, Histogram, Summary) and push all their variation into the types themselves. This is Victor's proposal for Gauge/Sum, and this is my proposal for Histogram in #272. On the other hand, we can go with a distinct type for each variation in the |
Agree w/ @jmacd on consistency between histogram + metric types. If we go with #272 then we should also take this proposal/direction (albeit I'd like to see a smooth transition accompany the proposal). @victlu If you want to take a crack at what migration from existing OTLP to this proposal looks like, that'd be good. This will disrupt a lot of the ecosystem in OTel and I know there are already some users of metrics whom such a large change could break if it's not done smoothly. I realize metrics isn't marked "stable" but users are users. If you want help thinking through a smooth-transition plan, I'm happy to take time to brainstorm with you. |
I'll take a crack at it. @jsuereth I'll reach out on slack as I am sure I'm not up-to-speed on all the intricacies of this task. |
I'd agree with an approach similar to #272: add the new form with simpler names. Edit: |
Proposal for smooth transition for this (and other) protocol changes.
Migrating OTLP protocol
Strategy
To deliver OTLP protocol changes, the following proposed steps should be followed
For issue #264
Mark old messages as "deprecated" and add new messages...
oneof data {
// DEPRECATED BY 4/15/2021
IntGauge int_gauge = 4;
DoubleGauge double_gauge = 5;
IntSum int_sum = 6;
DoubleSum double_sum = 7;
IntHistogram int_histogram = 8;
DoubleHistogram double_histogram = 9;
DoubleSummary double_summary = 11;
// New messages
Gauge gauge = 12;
Sum sum = 13;
Summary summary = 14;
Histogram histogram = 15;
}
Add new messages...
message Gauge {
repeated IntDataPoint int_data_points = 1;
repeated DoubleDataPoint double_data_points = 2;
}
message Sum {
// aggregation_temporality describes if the aggregator reports delta changes
// since last report time, or cumulative changes since a fixed start time.
AggregationTemporality aggregation_temporality = 2;
// If "true" means that the sum is monotonic.
bool is_monotonic = 3;
repeated IntDataPoint int_data_points = 4;
repeated DoubleDataPoint double_data_points = 5;
}
message Histogram {
// aggregation_temporality describes if the aggregator reports delta changes
// since last report time, or cumulative changes since a fixed start time.
AggregationTemporality aggregation_temporality = 2;
repeated IntDataPoint int_data_points = 3;
repeated DoubleDataPoint double_data_points = 4;
}
message Summary {
repeated DoubleDataPoint double_data_points = 1;
} |
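One way a consumer might cope with both shapes during the transition is to normalize the deprecated fields into the new one on receipt. The sketch below is not generated protobuf code: the dataclasses only mirror the field names above, data points are simplified to bare numbers, and `normalize_gauge` is a hypothetical helper.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Gauge:
    """Simplified mirror of the proposed Gauge message."""
    int_data_points: List[int] = field(default_factory=list)
    double_data_points: List[float] = field(default_factory=list)


@dataclass
class Metric:
    """Simplified mirror of Metric; at most one field is set, as in a oneof."""
    name: str
    int_gauge: Optional[List[int]] = None       # deprecated form
    double_gauge: Optional[List[float]] = None  # deprecated form
    gauge: Optional[Gauge] = None               # new form


def normalize_gauge(metric: Metric) -> Optional[Gauge]:
    """Return the metric's gauge data in the new shape, or None if it has none."""
    if metric.gauge is not None:
        return metric.gauge
    if metric.int_gauge is not None:
        return Gauge(int_data_points=list(metric.int_gauge))
    if metric.double_gauge is not None:
        return Gauge(double_data_points=list(metric.double_gauge))
    return None
```

A receiver that normalizes like this can accept old and new producers at the same time, which matters if both are in flight during the deprecation window.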
@victlu I think the transition is a bit more dramatic here. Remember that OTLP is an in-flight protocol and deprecations have a slow rollout between producers and consumers, so you need to call that out in the transition. I.e., what you define may be a little too aggressive/churny for end users. Here's an abridged update to what you had. I think maybe this deserves a full OTEP, but for the purposes of your change we just need to agree on the general outline.
Migrating OTLP protocol
Strategy
To deliver OTLP protocol changes, the following proposed steps should be followed
|
Please review #278. |
Let's leave this open until the PR merges 😀 |
The OTLP message Metric has the following definition for instrument types...
spec
I propose we only define Instrument Types (i.e. Gauge, Sum, etc...) at the message Metric level.
The data type (i.e. int, double, etc...) should be lower level, maybe at the message DataPoint level.
i.e.