-
Notifications
You must be signed in to change notification settings - Fork 120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
always end up coredump with rolling #659
Comments
@clalancette do you see this issue? |
Just want to know that this is happening only with my environment or not. |
Other Tier I RMW implementation does not have this problem. root@tomoyafujita:~/ros2_ws/colcon_ws# RMW_IMPLEMENTATION=rmw_cyclonedds_cpp ros2 topic list
/parameter_events
/rosout
root@tomoyafujita:~/ros2_ws/colcon_ws# RMW_IMPLEMENTATION=rmw_connextdds ros2 topic list
/parameter_events
/rosout |
Yes, this is happening on all jobs on CI right now. It just started today, as the overnight CI jobs were OK. |
No i rolled back them, but it does not change the result. i guess something in Fast-DDS, working on core stack trace. |
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `./build/demo_nodes_cpp/talker'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 eprosima::fastdds::dds::detail::ReadTakeCommand::generate_info (info=..., instance=..., item=@0x55a0569247d0: 0x0)
at /root/ros2_ws/colcon_ws/src/eProsima/Fast-DDS/src/cpp/fastdds/subscriber/DataReaderImpl/ReadTakeCommand.hpp:229
229 info.sample_state = item->isRead ? READ_SAMPLE_STATE : NOT_READ_SAMPLE_STATE;
[Current thread is 1 (Thread 0x7f3971f5a640 (LWP 127542))]
(gdb) bt
#0 eprosima::fastdds::dds::detail::ReadTakeCommand::generate_info (info=..., instance=..., item=@0x55a0569247d0: 0x0)
at /root/ros2_ws/colcon_ws/src/eProsima/Fast-DDS/src/cpp/fastdds/subscriber/DataReaderImpl/ReadTakeCommand.hpp:229
#1 0x00007f39788b8a73 in eprosima::fastdds::dds::detail::DataReaderHistory::get_first_untaken_info (this=0x55a056923d20, info=...)
at /root/ros2_ws/colcon_ws/src/eProsima/Fast-DDS/src/cpp/fastdds/subscriber/history/DataReaderHistory.cpp:361
#2 0x00007f397889c360 in eprosima::fastdds::dds::DataReaderImpl::get_first_untaken_info (this=0x55a056923860, info=0x7f3971f593f0)
at /root/ros2_ws/colcon_ws/src/eProsima/Fast-DDS/src/cpp/fastdds/subscriber/DataReaderImpl.cpp:762
#3 0x00007f3978896993 in eprosima::fastdds::dds::DataReader::get_first_untaken_info (this=0x55a0569223d0, info=0x7f3971f593f0)
at /root/ros2_ws/colcon_ws/src/eProsima/Fast-DDS/src/cpp/fastdds/subscriber/DataReader.cpp:278
#4 0x00007f397932b648 in rmw_fastrtps_shared_cpp::__rmw_wait (identifier=0x7f3979400246 "rmw_fastrtps_cpp", subscriptions=0x7f3971f594e0,
guard_conditions=0x7f3971f594f0, services=0x0, clients=0x0, events=0x0, wait_set=0x7f395c000b70, wait_timeout=0x0)
at /root/ros2_ws/colcon_ws/src/ros2/rmw_fastrtps/rmw_fastrtps_shared_cpp/src/rmw_wait.cpp:68
#5 0x00007f39792ce1b7 in node_listener (context=0x55a0567c7260)
at /root/ros2_ws/colcon_ws/src/ros2/rmw_fastrtps/rmw_fastrtps_shared_cpp/src/listener_thread.cpp:142
#6 0x00007f39792cf0ec in std::__invoke_impl<void, void (*)(rmw_context_s*), rmw_context_s*> (
__f=@0x55a056939cd0: 0x7f39792cdf7b <node_listener(rmw_context_t*)>) at /usr/include/c++/11/bits/invoke.h:61
#7 0x00007f39792cf04d in std::__invoke<void (*)(rmw_context_s*), rmw_context_s*> (
__fn=@0x55a056939cd0: 0x7f39792cdf7b <node_listener(rmw_context_t*)>) at /usr/include/c++/11/bits/invoke.h:96
#8 0x00007f39792cefad in std::thread::_Invoker<std::tuple<void (*)(rmw_context_s*), rmw_context_s*> >::_M_invoke<0ul, 1ul> (this=0x55a056939cc8)
at /usr/include/c++/11/bits/std_thread.h:253
#9 0x00007f39792cef62 in std::thread::_Invoker<std::tuple<void (*)(rmw_context_s*), rmw_context_s*> >::operator() (this=0x55a056939cc8)
at /usr/include/c++/11/bits/std_thread.h:260
#10 0x00007f39792cef42 in std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)(rmw_context_s*), rmw_context_s*> > >::_M_run (
this=0x55a056939cc0) at /usr/include/c++/11/bits/std_thread.h:211
#11 0x00007f3979a692b3 in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#12 0x00007f39797d7b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#13 0x00007f3979869a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
|
segmentation fault happens in Fast-DDS as described above via rmw_wait. checking Fast-DDS commit history. |
seems to be related to eProsima/Fast-DDS@de5cd9c, checking with rolling back the Fast-DDS master branch commit. there are 3 possible commits from yesterday. https://github.com/eProsima/Fast-DDS/commits/master |
Great, thanks. If we can't get eProsima to fix this today, I'll propose that we update https://github.com/ros2/ros2/blob/rolling/ros2.repos#L37 to pin the version to the last known good hash as a workaround. |
okay confirmed that eProsima/Fast-DDS@de5cd9c is the commit to leads to the coredump. (rolling back to eProsima/Fast-DDS@f6670e3, it does not generate the Segmentation Fault.) |
@fujitatomoya would you please open a PR in Fast DDS that reverts the offending commit? |
revert PR is eProsima/Fast-DDS#3221 |
Stack Trace Information: (gdb) frame
#0 eprosima::fastdds::dds::detail::ReadTakeCommand::generate_info (info=..., instance=..., item=@0x55db75811ad0: 0x0)
at /root/ros2_ws/colcon_ws/src/eProsima/Fast-DDS/src/cpp/fastdds/subscriber/DataReaderImpl/ReadTakeCommand.hpp:229
229 info.sample_state = item->isRead ? READ_SAMPLE_STATE : NOT_READ_SAMPLE_STATE;
(gdb) print info
$1 = (eprosima::fastdds::dds::SampleInfo &) @0x7fe8893f43f0: {sample_state = 0, view_state = 0, instance_state = 0, disposed_generation_count = 0,
no_writers_generation_count = 0, sample_rank = 0, generation_rank = 0, absolute_generation_rank = 0, source_timestamp = {seconds_ = 0,
fraction_ = 0, nanosec_ = 0}, reception_timestamp = {seconds_ = 0, fraction_ = 0, nanosec_ = 0}, instance_handle = {value = {value_ = {
_M_elems = '\000' <repeats 15 times>}, has_been_set_ = false}}, publication_handle = {value = {value_ = {
_M_elems = '\000' <repeats 15 times>}, has_been_set_ = false}}, valid_data = false, sample_identity = {writer_guid_ = {guidPrefix = {
static size = 12, value = '\000' <repeats 11 times>}, entityId = {static size = 4, value = "\000\000\000"}}, sequence_number_ = {high = -1,
low = 0}}, related_sample_identity = {writer_guid_ = {guidPrefix = {static size = 12, value = '\000' <repeats 11 times>}, entityId = {
static size = 4, value = "\000\000\000"}}, sequence_number_ = {high = -1, low = 0}}}
(gdb) print item
$2 = (const eprosima::fastdds::dds::detail::DataReaderCacheChange &) @0x55db75811ad0: 0x0
(gdb) whatis item
type = const eprosima::fastdds::dds::detail::DataReaderCacheChange &
(gdb) frame
#1 0x00007fe88fd53a73 in eprosima::fastdds::dds::detail::DataReaderHistory::get_first_untaken_info (this=0x55db75811020, info=...)
at /root/ros2_ws/colcon_ws/src/eProsima/Fast-DDS/src/cpp/fastdds/subscriber/history/DataReaderHistory.cpp:361
361 ReadTakeCommand::generate_info(info, *(it->second), *item);
(gdb) print item
$13 = 0x0
(gdb) print instance_changes
$14 = (eprosima::fastrtps::ResourceLimitedVector<eprosima::fastrtps::rtps::CacheChange_t*, std::integral_constant<bool, true>, eprosima::fastrtps::ResourceLimitedContainerConfig, std::allocator<eprosima::fastrtps::rtps::CacheChange_t*>, std::vector<eprosima::fastrtps::rtps::CacheChange_t*, std::allocator<eprosima::fastrtps::rtps::CacheChange_t*> > > &) @0x55db75811a20: {
_vptr.ResourceLimitedVector = 0x7fe8905cd318 <vtable for eprosima::fastrtps::ResourceLimitedVector<eprosima::fastrtps::rtps::CacheChange_t*, std::integral_constant<bool, true>, eprosima::fastrtps::ResourceLimitedContainerConfig, std::allocator<eprosima::fastrtps::rtps::CacheChange_t*>, std::vector<eprosima::fastrtps::rtps::CacheChange_t*, std::allocator<eprosima::fastrtps::rtps::CacheChange_t*> > >+16>, configuration_ = {initial = 100,
maximum = 5000, increment = 1}, collection_ = std::vector of length 0, capacity 100}
(gdb) print data_available_instances_
$15 = std::map with 1 element = {[{value = {value_ = {_M_elems = '\000' <repeats 15 times>},
has_been_set_ = false}}] = std::shared_ptr<eprosima::fastdds::dds::detail::DataReaderInstance> (use count 2, weak count 0) = {
get() = 0x55db75811a20}} the following patch can fix the problem, but re-commit the fix as complete patch would be the best. diff --git a/src/cpp/fastdds/subscriber/history/DataReaderHistory.cpp b/src/cpp/fastdds/subscriber/history/DataReaderHistory.cpp
index e9d612669..2d258a433 100644
--- a/src/cpp/fastdds/subscriber/history/DataReaderHistory.cpp
+++ b/src/cpp/fastdds/subscriber/history/DataReaderHistory.cpp
@@ -357,9 +357,12 @@ bool DataReaderHistory::get_first_untaken_info(
{
auto it = data_available_instances_.begin();
auto& instance_changes = it->second->cache_changes;
- auto item = instance_changes.cbegin();
- ReadTakeCommand::generate_info(info, *(it->second), *item);
- return true;
+ if (!instance_changes.empty())
+ {
+ auto item = instance_changes.cbegin();
+ ReadTakeCommand::generate_info(info, *(it->second), *item);
+ return true;
+ }
}
return false; |
@clalancette @MiguelCompany now it is all green with rmw_fastrtps full CI, i will go ahead to close this one. |
Bug report
Required Info:
Steps to reproduce issue
root@tomoyafujita:~/ros2_ws/colcon_ws# ros2 topic list Segmentation fault (core dumped)
Or any other command line is issued, that leads to core dump.
Expected behavior
No coredump observed.
The text was updated successfully, but these errors were encountered: