ROS2 image topic publisher freezes when a subscriber is initiated at a remote PC #460
Comments
Hi @atb033, I have tried to reproduce your issue following your steps without success. The following configurations were used:
Could you please provide us with some more information so we can continue to study your issue?
Hi @JLBuenoLopez-eProsima, One point that I forgot to mention here is that my host PC is actually running natively on Ubuntu 16.04. The |
@JLBuenoLopez-eProsima I can reproduce this issue with,
Thanks @fujitatomoya, @atb033, I will keep trying. Could you please tell me which OS is running on each of the two machines? Yesterday I tried with:
I am installing a virtual machine with Ubuntu 16.04 to run the Docker container (following the last comment from @atb033) and try again, but I am unsure about the Raspi4 setup that you are using. Thanks for your help!
I use an Ubuntu 20.04 container (--net=host) on a physical Ubuntu 20.04 environment, without virtual machines, for both the remote and host PC.
Thanks again for the information, @fujitatomoya! I have tried that same configuration with Foxy binaries and I have been unable to reproduce the issue. I am trying with sources just in case. Could you please provide a network traffic capture of the RTPS packets at both ends so we can try to analyze what is happening? Thanks again!
Since this environment is secured, I am afraid I cannot do this, sorry. @atb033 how about you?
@JLBuenoLopez-eProsima The following is my setup:
I was able to replicate the bug with this setup again, and I am attaching the network traffic capture here:
Thanks, @atb033, for sending us the traffic capture! It seems that the network is being overloaded, and the reason is the ICMP Destination unreachable (Port unreachable) packet that is received after the DataWriter is matched with the DataReader and the images start to be sent. This network overload also seems to be causing the write operation from the DataWriter to enter a deadlock, which matches the description of your issue.
This could be explained as follows: by default ROS 2 configures the DataWriter publish mode as ASYNCHRONOUS as explained here. Consequently, the asynchronous thread will wait until the write operation is finished. On the other hand, the sending buffer could be completely filled, and the write operation is probably waiting for the buffer to be free to write the new data.
Therefore, could you first try to set the DataWriter as SYNCHRONOUS and tell us if this is enough to fix your issue? Without an asynchronous thread that could be deadlocked with the write operation, the DataWriter should not stop publishing even though the network is overloaded. If this is not enough, we advise you to set the Finally, you may consider decreasing the Please let us know if this is enough to solve your issue.
@JLBuenoLopez-eProsima Thanks for the input. I can't implement these immediately as I am caught up with some other work at the moment. I'll get back to you soon after testing these out.
Hey @JLBuenoLopez-eProsima, I tested all the approaches that you recommended and the problem still persists. The following are the settings that I used. Can you please go through them and tell me if I have done it correctly?
<!-- SYNCHRONOUS mode -->
<dds xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
    <profiles>
        <publisher profile_name="publisher profile" is_default_profile="true">
            <qos>
                <publishMode>
                    <kind>SYNCHRONOUS</kind>
                </publishMode>
            </qos>
            <historyMemoryPolicy>PREALLOCATED_WITH_REALLOC</historyMemoryPolicy>
        </publisher>
        <subscriber profile_name="subscriber profile" is_default_profile="true">
            <historyMemoryPolicy>PREALLOCATED_WITH_REALLOC</historyMemoryPolicy>
        </subscriber>
    </profiles>
</dds>
<!-- non-blocking true mode -->
<dds xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
    <profiles>
        <transport_descriptors>
            <transport_descriptor>
                <transport_id>test</transport_id>
                <type>UDPv4</type>
                <non_blocking_send>true</non_blocking_send>
            </transport_descriptor>
        </transport_descriptors>
        <publisher profile_name="publisher profile" is_default_profile="true">
            <historyMemoryPolicy>PREALLOCATED_WITH_REALLOC</historyMemoryPolicy>
        </publisher>
        <subscriber profile_name="subscriber profile" is_default_profile="true">
            <historyMemoryPolicy>PREALLOCATED_WITH_REALLOC</historyMemoryPolicy>
        </subscriber>
    </profiles>
</dds>
<!-- Reduce maxMessageSize -->
<dds xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
    <profiles>
        <transport_descriptors>
            <transport_descriptor>
                <transport_id>test</transport_id>
                <type>UDPv4</type>
                <maxMessageSize>5500</maxMessageSize>
            </transport_descriptor>
        </transport_descriptors>
        <publisher profile_name="publisher profile" is_default_profile="true">
            <historyMemoryPolicy>PREALLOCATED_WITH_REALLOC</historyMemoryPolicy>
        </publisher>
        <subscriber profile_name="subscriber profile" is_default_profile="true">
            <historyMemoryPolicy>PREALLOCATED_WITH_REALLOC</historyMemoryPolicy>
        </subscriber>
    </profiles>
</dds>
Thanks @atb033, First, I would like to know if you set the environment variable Therefore, be sure that you run the following commands:
NOTE: On the other hand, even though you are setting the options for a new transport, you are not linking this custom transport to your participant. You can use the following XML file, where all three suggested options have been included:
I hope that this solves your issue. Finally, if you do not mind, it would be helpful if you can try each option sequentially and tell us if it is enough to solve the issue:
This will provide us with more information about your issue, as we have been unable to reproduce it.
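The combined XML file referred to in this comment is not preserved in this copy of the thread. As a rough sketch only, and not the maintainer's original file, a profile that links a custom UDPv4 transport to the participant and combines the three suggested options might look like the following. The transport ID `udp_transport` and the profile names are hypothetical placeholders, the `maxMessageSize` value is simply the one tried earlier in the thread, and the sketch assumes the Fast DDS XML profile elements `userTransports` and `useBuiltinTransports` as documented for the Foxy-era releases.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<!-- Sketch only: names below are illustrative assumptions, not taken from the thread. -->
<dds xmlns="http://www.eprosima.com/XMLSchemas/fastRTPS_Profiles">
    <profiles>
        <transport_descriptors>
            <transport_descriptor>
                <!-- Hypothetical ID; it must match the ID referenced in the participant below. -->
                <transport_id>udp_transport</transport_id>
                <type>UDPv4</type>
                <!-- Value reused from the earlier attempt in this thread. -->
                <maxMessageSize>5500</maxMessageSize>
                <non_blocking_send>true</non_blocking_send>
            </transport_descriptor>
        </transport_descriptors>

        <!-- Linking step: without this, the descriptor above is defined but never used. -->
        <participant profile_name="participant_profile" is_default_profile="true">
            <rtps>
                <userTransports>
                    <transport_id>udp_transport</transport_id>
                </userTransports>
                <!-- Assumption: disable the built-in transports so only the custom one is used. -->
                <useBuiltinTransports>false</useBuiltinTransports>
            </rtps>
        </participant>

        <publisher profile_name="publisher_profile" is_default_profile="true">
            <qos>
                <publishMode>
                    <kind>SYNCHRONOUS</kind>
                </publishMode>
            </qos>
            <historyMemoryPolicy>PREALLOCATED_WITH_REALLOC</historyMemoryPolicy>
        </publisher>

        <subscriber profile_name="subscriber_profile" is_default_profile="true">
            <historyMemoryPolicy>PREALLOCATED_WITH_REALLOC</historyMemoryPolicy>
        </subscriber>
    </profiles>
</dds>
```

To test each option in isolation, as requested above, one could remove the `non_blocking_send` and `maxMessageSize` elements first and reintroduce them one at a time. A file like this is typically picked up through the `FASTRTPS_DEFAULT_PROFILES_FILE` environment variable. Whether the publish mode from XML is honored can depend on the rmw_fastrtps version and settings, so it is worth confirming against the documentation for the installed release.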
It's been 3 years, so I checked whether this is still reproducible with current container images.
Test Platform
Result
Subscription running on Raspi4 keeps receiving image data, and publisher does not freeze when the subscriber is initiated.
Console Output
root@tomoyafujita:/# ros2 run image_tools cam2image --ros-args -p burger_mode:=true -p frequency:=10. -p reliability:=best_effort
...<snip>
[INFO] [1694125748.789014474] [cam2image]: Publishing image #3647
[INFO] [1694125748.889001996] [cam2image]: Publishing image #3648
[INFO] [1694125748.989015319] [cam2image]: Publishing image #3649
[INFO] [1694125749.089007292] [cam2image]: Publishing image #3650
[INFO] [1694125749.189011144] [cam2image]: Publishing image #3651
^C[INFO] [1694125749.261280833] [rclcpp]: signal_handler(signum=2)
root@raspi4-1:/# ros2 run image_tools showimage --ros-args -p show_image:=false -p reliability:=best_effort
...<snip>
[INFO] [1694125747.596249367] [showimage]: Received image #camera_frame
Received image #camera_frame
[INFO] [1694125747.696080180] [showimage]: Received image #camera_frame
Received image #camera_frame
[INFO] [1694125747.795356557] [showimage]: Received image #camera_frame
Received image #camera_frame
[INFO] [1694125747.896058764] [showimage]: Received image #camera_frame
Received image #camera_frame
[INFO] [1694125747.996081833] [showimage]: Received image #camera_frame
Received image #camera_frame
It has been quite a while, that's right, but unfortunately the issue persists. Publishing (ROS2 Foxy/Iron same thing) Just creating a subscription on another PC (Matlab 2024a - ROS Humble) located in the same local network does in fact slow down the publication rate, which can again be verified using the same two methods: the frequency drops to about 1.5 Hz and there is a significant drop in smoothness. Deleting the subscriber restores the smoothness and the original frequency. The QoS reliability used in both publisher and subscriber is set to best effort; however, I also checked another option where the publisher is configured using rclcpp::SensorQoS, where the queue size could be set to 1 or 100 on both ends without any effect.
Bug report
My aim was to send a video stream as a ros-topic to a remote PC (Raspberry Pi 4). To do this, I published an image stream using the cam2image node of the image_tools package.
At the remote PC (which was connected using WiFi), when I initiated the subscriber using the showimage node of the same package, the cam2image node froze as shown in this video.
This issue was posted here before and from there I learned that this behavior doesn't occur with cyclonedds. Hence posting this here.
Required Info:
(ros:foxy-ros-base)
Steps to reproduce issue
Install image_tools in both host and remote PC
In the host PC
In the remote PC
Expected behavior
The host PC publishes at 10 Hz, and the remote PC subscribes at the same frequency.
Actual behavior
The publisher node freezes when the subscriber is initiated.
Additional information
Also, the publisher node continues to operate normally once the subscriber node is killed