rclcpp_lifecycle::LifecycleNode::get_current_state is not thread safe #1746
Comments
This is true. The current state is queried from the impl, stored in a local variable, and then returned. rclcpp/rclcpp_lifecycle/src/lifecycle_node_interface_impl.hpp Lines 335 to 340 in b918bd4
Solution (if it is safe to query the internal state from multiple threads): do not set the local variable; just return the constructed value. However, since the state may change concurrently (outside the service call, through some callback) while other threads are querying it, locks are probably needed as well.
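The idea of returning a constructed value under a lock could look roughly like the sketch below. It is a standalone illustration, not the rclcpp code: `State`, `StateHolder`, `set_state`, and `run_concurrent_readers` are hypothetical names, and the real implementation wraps an rcl state handle rather than a plain struct.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a lifecycle state; NOT rclcpp_lifecycle::State.
struct State
{
  std::uint8_t id;
  std::string label;
};

// Hypothetical holder that illustrates the proposal; NOT the rclcpp implementation.
class StateHolder
{
public:
  // Returns a copy made while the lock is held, so callers never share
  // a member object that another thread may be rewriting.
  State get_current_state() const
  {
    std::lock_guard<std::mutex> lock(mutex_);
    return state_;
  }

  // Writers take the same lock, so readers see either the old or the new state.
  void set_state(State next)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    state_ = std::move(next);
  }

private:
  mutable std::mutex mutex_;
  State state_{1, "unconfigured"};
};

// Runs one writer and several readers; returns true if every snapshot
// a reader obtained was internally consistent (id matches label).
bool run_concurrent_readers(int reader_count, int iterations)
{
  StateHolder holder;
  std::vector<char> ok(static_cast<std::size_t>(reader_count), 1);
  std::thread writer([&holder, iterations] {
      for (int i = 0; i < iterations; ++i) {
        holder.set_state(i % 2 ? State{1, "unconfigured"} : State{2, "inactive"});
      }
    });
  std::vector<std::thread> readers;
  for (int r = 0; r < reader_count; ++r) {
    readers.emplace_back([&holder, &ok, r, iterations] {
        for (int i = 0; i < iterations; ++i) {
          const State s = holder.get_current_state();
          const bool consistent =
            (s.id == 1 && s.label == "unconfigured") ||
            (s.id == 2 && s.label == "inactive");
          if (!consistent) {
            ok[static_cast<std::size_t>(r)] = 0;  // each reader writes only its own slot
          }
        }
      });
  }
  writer.join();
  for (auto & t : readers) {
    t.join();
  }
  for (char c : ok) {
    if (!c) {
      return false;
    }
  }
  return true;
}
```

Returning a copy costs one extra object per call, but it removes the shared mutable member that the concurrent callers were racing on.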
This is a bug; I confirmed that the crash (core dump) happens on mainline.
To do this, I think we need to change the API to return the object.
right. I believe that |
ros2/rclcpp#1746 Signed-off-by: Tomoya Fujita <[email protected]>
Unlocking while the user callback runs seems reasonable. It requires that pre, cb, and post do not have to form a single atomic operation; if that is the case, the lock can be released. Otherwise, if no other modifying calls from other threads are allowed between releasing the lock after pre and reacquiring it for cb or post, we can only unlock if the other writing API calls recognize an ongoing atomic transaction (started by the code in pre). Such a call would then abort with an "ongoing transaction" error, or possibly wait on a condition variable. The other option is to provide the callback with an API implementation that does not need to lock, but I guess this would be a breaking change.
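A minimal sketch of the "recognize an ongoing transaction" variant is shown below, assuming a simple flag guarded by the mutex. All names (`TransitionGuardedNode`, `change_state`, `get_state`, `Result`) are hypothetical and do not correspond to the rclcpp API; the point is only that the lock is dropped before the user callback runs and that a second transition attempted in the meantime is rejected.

```cpp
#include <cassert>
#include <functional>
#include <mutex>

// Hypothetical sketch; NOT the rclcpp_lifecycle implementation.
class TransitionGuardedNode
{
public:
  enum class Result { Ok, Busy, CallbackFailed };

  Result change_state(int goal, const std::function<bool()> & user_cb)
  {
    {
      // "pre": mark the transaction as started while holding the lock.
      std::lock_guard<std::mutex> lock(mutex_);
      if (in_transition_) {
        return Result::Busy;  // another transition is still running
      }
      in_transition_ = true;
    }

    // "cb": the lock is released here, so the callback may call
    // get_state() (or even change_state()) without deadlocking.
    bool ok = false;
    try {
      ok = user_cb();
    } catch (...) {
      ok = false;
    }

    {
      // "post": commit the result and clear the flag under the lock.
      std::lock_guard<std::mutex> lock(mutex_);
      if (ok) {
        state_ = goal;
      }
      in_transition_ = false;
    }
    return ok ? Result::Ok : Result::CallbackFailed;
  }

  int get_state() const
  {
    std::lock_guard<std::mutex> lock(mutex_);
    return state_;
  }

private:
  mutable std::mutex mutex_;
  bool in_transition_ = false;
  int state_ = 1;
};
```

Waiting on a `std::condition_variable` instead of returning `Busy` would be the other behaviour mentioned above; it is left out here to keep the sketch short.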
(Rephrase precisely)
This is still correct, but it is not exactly the direct reason for the core dump; we can see the following backtrace.
protecting |
address racy condition with #1756
Besides, we probably want to do this for the user experience? Any opinions?
+1 on this. I also noticed the crash when concurrently invoking a lifecycle node state transition while the node was checking the current state.
453bfa8 resolves this race condition, as confirmed with https://github.com/fujitatomoya/ros2_test_prover/blob/master/prover_rclcpp/src/rclcpp_1746.cpp
Bug report
Required Info:
Steps to reproduce issue
In the code below, get_current_state is called from 2 threads, and after a random number of iterations
Error in state! Internal state_handle is NULL.
is thrown when calling id() on the state. The problem disappears if only one thread calls the get_current_state function.