-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RTC-93] Add some RTP metrics #142
Conversation
lib/membrane/rtp/metrics.ex
Outdated
"inbound-rtp.ssrc", | ||
event_name: [Membrane.RTP, :inbound_track, :new], | ||
measurement: :ssrc | ||
), | ||
Metrics.counter( | ||
"inbound-rtp.nack", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be sure, I'd include _sent
in the name to leave no questions
"inbound-rtp.nack", | |
"inbound-rtp.nack_sent", |
lib/membrane/rtcp/parser.ex
Outdated
@@ -58,6 +80,14 @@ defmodule Membrane.RTCP.Parser do | |||
@impl true | |||
def handle_event(:output, %RTCPEvent{} = event, _ctx, state) do | |||
buffer = %Buffer{payload: RTCP.Packet.serialize(event.rtcp)} | |||
|
|||
Membrane.TelemetryMetrics.execute( | |||
@rtcp_sent_telemetry_event, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be a bit problematic, as it includes FIRs, PLIs, NACKs, REMB (when/if implemented) etc in addition to receiver reports.
- Do you thing such aggregated statistic is helpful?
- I guess the number of SRs can be estimated by substraction
RTCP - FIR - NACK
, but maybe we should have a separate metric for that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I think this comment answers it pretty well: [RTC-93] Add some RTP metrics #142 (comment)
@@ -65,6 +72,8 @@ defmodule Membrane.RTP.OutboundTrackingSerializer do | |||
|
|||
@impl true | |||
def handle_init(_ctx, options) do | |||
Membrane.TelemetryMetrics.register(@frame_sent_telemetry_event, options.telemetry_label) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are ignoring RTCP packets sent by this element, a metric for that is missing
lib/membrane/rtp/metrics.ex
Outdated
Metrics.counter( | ||
"inbound-rtp.nack", | ||
event_name: [Membrane.RTP, :rtcp, :nack, :sent] | ||
), | ||
Metrics.counter( | ||
"inbound-rtcp.packets", | ||
event_name: [Membrane.RTP, :rtcp, :arrival] | ||
), | ||
Metrics.sum( | ||
"inbound-rtcp.bytes_received", | ||
event_name: [Membrane.RTP, :rtcp, :arrival], | ||
measurement: :bytes | ||
), | ||
Metrics.counter( | ||
"outbound-rtcp.packets", | ||
event_name: [Membrane.RTP, :rtcp, :sent] | ||
), | ||
Metrics.sum( | ||
"outbound-rtcp.bytes_sent", | ||
event_name: [Membrane.RTP, :rtcp, :sent], | ||
measurement: :bytes | ||
), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The metrics you've added are on a receiver (inbound) end, so I would rename them, as they might be confused with RTCP sent as sender feedback (mostly in OutboundTrackingSerializer for now). I think the pattern should rather be inbound-rtp.rtcp_sent
or inbound.rtcp_sent
. This resembles stats in wertc-internals
where such info is aggregated under inbount-rtp
and remote-inbound-rtp
(we may also follow that convention).
+ label RTCP by the track type (outbound/inbound) instead of RTCP direction + Add sender report sent counter metric
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please upload some example report with new metrics?
event_name: [Membrane.RTP, :packet, :arrival] | ||
), | ||
Telemetry.Metrics.sum( | ||
Metrics.sum( | ||
"inbound-rtp.bytes_received", | ||
event_name: [Membrane.RTP, :packet, :arrival], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be consistent with other events I would change this one to Membrane.RTP, :inbound, :packet, :arrival
"inbound-rtp.bytes_received", | ||
event_name: [Membrane.RTP, :packet, :arrival], | ||
measurement: :bytes | ||
), | ||
Telemetry.Metrics.last_value( | ||
Metrics.last_value( | ||
"inbound-rtp.encoding", | ||
event_name: [Membrane.RTP, :inbound_track, :new], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about [Membrane.RTP, :inbound, :new]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seemed a little weird to me as well, but I have no idea what this will break.
Marker doesn't necessarily indicate a frame boundary. The fact that the frame could be damaged but we would still count it would also cause problems. `markers` is therefore more explicit about what this metric represents.
After some discussions, I ended up completely reworking all RTCP metrics. They are now grouped per track and they include: NACK sent/received, PLI received (I don't think we're sending any), FIR sent/received, Receiver Reports sent/received and Sender Reports sent/received. I also had to make a choice of where to count the specific feedback messages and I decided to count them wherever they are being handled. Eg, FIR is counted in OutboundTrackingSerializer which also sends a Keyframe Request upstream as a result of receiving FIR. The other option was to count them in one central location, which is where the overall counter is incremented, but it occurred to me that if they are counted wherever they are handled, the metric would indicate that we're loosing the packet somewhere, eg, if we have 100 NACK sent by the browser and 0 received. Example report generated from this version:
Also, making note of some bugs:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The report starts looking really nice! Also, great work with RTCP metrics. Now we get so much useful info 👀
A couple of notes:
- Looking at RTC-93 some metrics are still missing:
- RTX has been merged so would be nice to add metric about number of retransmitted packets
- outgoing RTP packets
- outgoing RTP bytes
- incoming RTCP bytes
- outgoing RTCP bytes
- incoming padding packets
- when some event doesn't occur its metric is not present in the report at all. Would be nice to initilize metrics with default values
- maybe instead of identifying metrics as inbound or outbound we should identify track as inbound or outbound and keep metric name generic when possible. This way we would have
{:track_id, id} => {
type: :outbound,
ssrc: 1234
packets: 1000,
fir: 100
}
We might consider getting rid of RTCP bytes as we have detailed numbers about specific RTCP messages but I belive that such a metric can still be useful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more concern. Currently total RTCP packets are counted in multiple places e.g. in SSRCRouter and OutboundTrackingSerializer. Both processes register as owners of the same event and I am not sure if that's correct.
If that's correct, then I think we should go further and increment total RTCP packets in every place that some specific RTCP messag is sent or received.
With current RTP plugin complexity it's very easy either not to count in some metric in total statistics or count it in two or more times
This metric may fall victim to the architecture of the RTP Plugin. This value is known in RTCP Parser, but it has no way of knowing to which track this particular RTCP belongs. SSRC Router can know that information, but it receives the RTCP in a parsed form, so it doesn't know the size 🤷 For symmetry, I didn't include outgoing RTCP bytes
I agree, but this is how the library works, but that's the wrong PR, this needs to be done in
I was thinking about that as well, but I came to a conclusion that this is outside the scope of this Pull Request |
It's fine, we do that in a lot of places. We can only have one owner process of the |
Ok, then let's do those two things (metric default value and some renamings) in different PRs. Could you please continue this topic? |
28defd2
to
9c66d57
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about other metrics I mentiond in the previous CR?
7f693ce
to
706a5ab
Compare
links_generator = fn twcc? -> | ||
links_generator = fn twcc?, ctx -> | ||
ssrc_router_pad_options = [ | ||
encoding: :rtx, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that needed? It looks like SSRC router doesn't make any use of encoding
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does. Only to emit new_inbound_track_event
but it does
lib/membrane/rtp/ssrc_router.ex
Outdated
if packet_size == 254, | ||
do: Membrane.TelemetryMetrics.execute(@padding_packet_arrival_event, %{}, %{}, label) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC this is not the correct way to detect padding packets, it will only work for encrypted packets, regular RTP will have the padding stripped
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, it will work a little too well 🤦 It will also count non-padding packets that happen to have that exact size
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should drop this metric for now and get this Pull Request through without it.
Given that the size is not enough to determine if the packet is a padding packet AND the padding also seems to be encrypted, we need something that lives after the decryptor to detect padding packets.
I'm also not convinced it would be a very useful metric on its own in the RTP plugin outside of WebRTC, so I'm considering implementing it in the RTC Engine. Buffers that pass through RTC Engine are decrypted and they have their headers removed, so it should be a simple matter of byte_size(packet) == 0 || packet == <<0::binary-size(254), 255::8>>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also not convinced it would be a very useful metric on its own in the RTP plugin outside of WebRTC, so I'm considering implementing it in the RTC Engine.
This sounds like a good idea, padding packets seem specific to WebRTC, but probably not possible without refactoring - after inbound RTX has been added, the padding packets are sent with retransmission SSRC and RTXParser drops them (they cannot be parsed as they do not contain a payload with the original sequence number)
Co-authored-by: Bartosz Błaszków <[email protected]>
lib/membrane/rtcp/receiver.ex
Outdated
%{}, | ||
state.telemetry_label | ||
) | ||
emit_sender_report_sent_telemetry_event(state) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- This function name is invalid (it sends
@sender_report_received_telemetry_event
) - I rather had this in mind:
emit_sender_report_sent_telemetry_event(state) | |
emit_telemetry_event(@sender_report_received, state) |
This way the private function could extract the common part. Now you still have duplication, although grouped. However, I won't insist on that change
This PR adds a few of the metrics described in RTC-93: