End-to-End Latency Analysis per Message #34

0xB4demeister · 2025-01-27T16:59:27Z

Hey,
First I wanted to say thanks for your nice tracing framework and analysis library.
I am currently adjusting it a bit to evaluate the impact of hardware acceleration in ROS 2.
I wanted to evaluate memory, cpu usages,.. but also ros-related overhead.
As far as I can tell, your library provides the callback duration out of the box, which is most likely the time RCLCPP_HANDLE takes to handle a callback (using the term from this paper Latency Analysis of ROS2 Multi-Node Systems - https://arxiv.org/pdf/2101.02074)

The callback is a nice way to inspect the execution time of a kernel, but I wish to further investigate the latency per topic and end-to-end latency of the system to approximate the "communication overhead".
(my communication system is rather trivial and is a basically a chain of publications/callbacks with only minor topics being consumed even multiple times)

My question was if the current data structure already supports the linking of publish_handle and subscription_handle based on a topic such that once can unambiguously link a published message start time to a taken message start time.
The second link I wanted to get by connecting a subscription to a callback which gives me a rough time when the publication should start.
This second link I have also still not fully figured out, as I cannot get something other than None being returned from get_subscription_reference_info .

Could you perhaps give me some insights into the data structure and if I need to create/append further information to the traces or simply expose further information from the internal data structures?

The text was updated successfully, but these errors were encountered:

christophebedard · 2025-01-29T02:15:07Z

Hey,
First I wanted to say thanks for your nice tracing framework and analysis library.

Happy to hear you like it!

As far as I can tell, your library provides the callback duration out of the box, which is most likely the time RCLCPP_HANDLE takes to handle a callback (using the term from this paper Latency Analysis of ROS2 Multi-Node Systems - https://arxiv.org/pdf/2101.02074)

Yeah, the callback duration is simply the time difference between just before the user callback is called and just after it returns. It's not quite straightforward looking at the code, but it's the difference between the ros2:callback_start and ros2:callback_end tracepoints: https://github.com/ros2/rclcpp/blob/e7afaa636a97dc3908d25d367652b58ecda95273/rclcpp/include/rclcpp/any_subscription_callback.hpp#L472-L563.

My question was if the current data structure already supports the linking of publish_handle and subscription_handle based on a topic such that once can unambiguously link a published message start time to a taken message start time.

This is indeed possible using the tracing instrumentation. See Section IV-A (Transport Links) in this paper: https://arxiv.org/pdf/2204.10208.

The second link I wanted to get by connecting a subscription to a callback which gives me a rough time when the publication should start.
This second link I have also still not fully figured out, as I cannot get something other than None being returned from get_subscription_reference_info .

You're talking about linking a message being received (and handled by a callback) to a message being published by that callback? See Section IV-B (Causal Message Links), specifically the "direct case," in the paper I linked to above.

All of this allows you to get the end-to-end latency of a processing pipeline and visualize it. I'd recommend reading the full paper! This is all implemented in the Eclipse Trace Compass Incubator (not in tracetools_analysis), see these instructions: https://github.com/christophebedard/ros2-message-flow-analysis#analysis.

Using this information, you can split the end-to-end latency time into ROS 2 internal latency, network latency, etc. However, it could be broken down further (see Section VIII in the paper, for example). One latency component that's especially interesting and which I haven't looked into yet is the delay between the message reception inside the middleware and the start of the callback execution, because this depends on the executor. The executor of course has a big influence of the performance of a ROS 2 application.

Let me know if you have any questions.

0xB4demeister · 2025-01-29T09:40:21Z

Hi, thanks for the extensive explanations!

Yeah, the callback duration is simply the time difference between just before the user callback is called and just after it returns. It's not quite straightforward looking at the code, but it's the difference between the ros2:callback_start and ros2:callback_end tracepoints: https://github.com/ros2/rclcpp/blob/e7afaa636a97dc3908d25d367652b58ecda95273/rclcpp/include/rclcpp/any_subscription_callback.hpp#L472-L563.

Ah great, yeah I was lacking a bit the information about what each of the attributes inside the python library really represent.
I was also wondering a bit about the rmw_handle's and subcription_handle's as some tables I was able to connect but others not. Why is the subscription_id (index) of the callback objects not related to the subscription id of take_instances['rmw_subscription_handle']? I was able to relate callback objects to take_instances via the node_id but it felt a bit strange..
As I want to measure basically the same as Victor in the HAWG,
I want to collect as accurate as possible with minor effort the overhead of ROS 2 and relate that to the execution time of the kernels.
For this, my current plan was to combine the (take_timestamp - callback_start) + message transfer duration (DDS) + (callback_end - Publish_timestamp) as approximation - do you think this works?

Yes, I saw this paper as well and just recognized that it explains what the gid and source_timestamp represent.
But I think I will try for now to implement a "hacky" solution simply based on matching source_timestamp to callback_end and topics as my frequency will not exceed 30Hz because I would need to patch otherwise the DDS layers FastDDS and CyclonDDS myself, right? or was this upstreamed in humble?

All of this allows you to get the end-to-end latency of a processing pipeline and visualize it. I'd recommend reading the full paper! This is all implemented in the Eclipse Trace Compass Incubator (not in tracetools_analysis), see these instructions: https://github.com/christophebedard/ros2-message-flow-analysis#analysis.

I tried also the tool but had a bit problems understanding what I actually see 😁 As I was unable to find instructions/explanations about the different views and how to tune it (e.g. which exact timestamps are visualized, how to visualize callback duration,..) and had the feeling using this library might be better in the long run, especially when I want to add custom trace points for the kernel execution time

christophebedard added the question Further information is requested label Jan 29, 2025

christophebedard self-assigned this Jan 29, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

End-to-End Latency Analysis per Message #34

End-to-End Latency Analysis per Message #34

0xB4demeister commented Jan 27, 2025

christophebedard commented Jan 29, 2025 •

edited

Loading

0xB4demeister commented Jan 29, 2025

End-to-End Latency Analysis per Message #34

End-to-End Latency Analysis per Message #34

Comments

0xB4demeister commented Jan 27, 2025

christophebedard commented Jan 29, 2025 • edited Loading

0xB4demeister commented Jan 29, 2025

christophebedard commented Jan 29, 2025 •

edited

Loading