This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
If loan message is enabled, subscriber will send ack to publisher in take function instead of return loan function, data corruption may happen [13672] #2080
Comments
@crazyhank True, this is how the mechanism has been designed. Even sending the ack when returning the loan may produce data corruption, as the same samples could be loaned more than once, if the This is why the is_sample_valid method exists: it allows checking whether the data was overwritten after being processed. Some could argue that having a subscriber twice as slow as the publisher means your system is badly designed, but to help support a use case like the one you describe here, we could implement an application-level ACK in the future.
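To make the idea behind a validity check concrete, here is a minimal single-threaded sketch, assuming a pool whose slots are tagged with the sequence number of the sample they currently hold. The names `Slot`, `Loan`, `LoanPool`, and `still_valid` are hypothetical and are not part of the Fast DDS API; this only illustrates how comparing sequence numbers can reveal that a loaned sample was overwritten, and it is not how the library is actually implemented.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical shared slot: tagged with the sequence number of its current sample.
struct Slot {
    std::uint64_t seq = 0;  // 0 means "never written"
    int payload = 0;
};

// Hypothetical zero-copy loan: a pointer into the pool plus the sequence
// number the reader believes it is looking at.
struct Loan {
    const Slot* slot;
    std::uint64_t seq;
};

template <std::size_t N>
class LoanPool {
public:
    // Writer side: slots are reused round-robin, regardless of outstanding loans.
    std::uint64_t write(int value) {
        const std::uint64_t seq = ++last_seq_;
        Slot& s = slots_[(seq - 1) % N];
        s.payload = value;
        s.seq = seq;
        return seq;
    }

    // Reader side: hand out a pointer to the slot instead of copying the payload.
    Loan take(std::uint64_t seq) const {
        return Loan{&slots_[(seq - 1) % N], seq};
    }

private:
    std::array<Slot, N> slots_{};
    std::uint64_t last_seq_ = 0;
};

// The loan is still valid only if the slot has not been rewritten since the take.
inline bool still_valid(const Loan& loan) {
    return loan.slot->seq == loan.seq;
}
```

With a pool of two slots, a loan on the first sample survives one further write but not two, because the third write lands on the same slot. A real implementation shared between processes would also need atomic accesses and a check both before and after reading the payload.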
Unfortunately, we are programming against the ROS 2 API, and it seems that is_sample_valid is not exposed there. Correct me if I am wrong, or give me some advice on how to deal with this kind of problem for now. Many thanks!
@MiguelCompany, in the ROS 2 Galactic release, it seems that data sharing is disabled by default, so this problem will never happen, but we will not get the benefit of data sharing either. Do you have any comment on this?
@crazyhank Data sharing is ready to use on ROS 2 rolling. We are backporting it to galactic here, and are still preparing some documentation on how to use it. Regarding your suggestion of having a counter of subscribers, it is not as simple as it may seem. New subscribers could join just after the sample was written. A subscriber could crash, and then never decrement the counter. The mechanism we have is similar, but it is based on the sequence number of the samples. That said, I think there could be a way to achieve what you intend, but it may be a bit cumbersome.
True, a subscriber may crash. I think we can add a timeout mechanism: if the timeout expires, we can remove this subscriber from the current subscriber list and get the payload back.
@crazyhank The mechanism you mention is already in place with the participant lease duration.
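The timeout idea discussed above can be sketched roughly as follows, assuming each reader periodically asserts that it is alive and is dropped once it stays silent for longer than a lease. `ReaderRegistry`, `heartbeat`, and `expire` are hypothetical names chosen for illustration, time is modelled as plain integer ticks, and nothing here reflects how the participant lease duration is actually implemented in Fast DDS.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical registry of readers that may be holding payloads.
class ReaderRegistry {
public:
    explicit ReaderRegistry(std::uint64_t lease_ticks) : lease_(lease_ticks) {}

    // A reader asserts its liveliness at time `now` (in ticks).
    void heartbeat(const std::string& reader, std::uint64_t now) {
        last_seen_[reader] = now;
    }

    // Drop every reader that has been silent for longer than the lease.
    // Assumes `now` is never earlier than any recorded heartbeat.
    // Returns how many readers were removed.
    std::size_t expire(std::uint64_t now) {
        std::size_t removed = 0;
        for (auto it = last_seen_.begin(); it != last_seen_.end();) {
            if (now - it->second > lease_) {
                it = last_seen_.erase(it);  // payloads held by this reader can be reclaimed
                ++removed;
            } else {
                ++it;
            }
        }
        return removed;
    }

    bool alive(const std::string& reader) const {
        return last_seen_.count(reader) != 0;
    }

private:
    std::uint64_t lease_;
    std::map<std::string, std::uint64_t> last_seen_;
};
```

A crashed reader simply stops calling `heartbeat`, so it is removed by the next `expire` call after its lease runs out, while readers that keep asserting liveliness stay registered.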
Consider the following case:
The publisher sends messages to subscribers at 10 Hz, but one of the subscribers processes the received messages at 5 Hz; in other words, the subscriber is slower than the publisher. If the message type is plain, the subscriber will use shared memory without copying data. I checked the source code: the ack from the subscriber is sent during the take function. That is fine for non-plain data types, but for plain data types we want to keep using this shared memory until the application's callback is done; otherwise the data may be corrupted!
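The timing in this scenario can be worked through with a small simulation. It rests on a simplified model that is an assumption, not taken from the Fast DDS sources: the publisher writes one sample every `pub_period_ms` into a ring of `depth` slots, the subscriber takes the first sample at t = 0 and holds it without copying for `hold_ms`, and a write that happens strictly before the hold ends and lands on the same slot counts as an overwrite. The function name `overwritten_during_hold` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if the sample taken at t = 0 is overwritten while it is still held.
// Model assumptions: sample 1 sits in slot 0; the k-th later write happens at
// t = k * pub_period_ms and goes to slot k % depth.
inline bool overwritten_during_hold(std::uint64_t depth,
                                    std::uint64_t pub_period_ms,
                                    std::uint64_t hold_ms) {
    for (std::uint64_t k = 1; k * pub_period_ms < hold_ms; ++k) {
        if (k % depth == 0) {
            return true;  // this write reuses the slot the subscriber is reading
        }
    }
    return false;
}
```

For a 10 Hz publisher (100 ms period) and a callback that takes 200 ms, a single slot is overwritten while the callback is still running, whereas two slots are just enough under this model; any longer callback needs more slots or a copy of the data before processing.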