[Bug] Observing data can crash the Technic Hub #1454
Comments
The reason I'm not entirely sure about this is that, unlike past endurance tests, I now also have constantly changing data, such as a value from 0 to 1000, which is sometimes packed as one byte and sometimes as two. So it would be good to try to reproduce with either the older build with variable data size or the newer build with constant data size, just to be sure. |
Here is a simpler way to crash the observer. Sender must also be Technic Hub. The observing Technic Hub does not crash if the sender is a Prime Hub, suggesting that sending Technic Hub is sending bad data, though it would be nice if that didn't crash the receiver. A motor is added so we can send the angle to set the hue, as a visual cue that something goes over the air.

Observer:

from pybricks.hubs import TechnicHub
from pybricks.parameters import Color
from pybricks.tools import wait

# Set up all devices.
hub = TechnicHub(observe_channels=[78])

# The main program starts here.
while True:
    drive, steer, rotate = hub.ble.observe(78) or [0] * 3
    hub.light.on(Color(rotate))
    wait(10)

Sender:

from pybricks.hubs import TechnicHub
from pybricks.parameters import Port
from pybricks.pupdevices import Motor
from pybricks.tools import wait
from urandom import randint

# Set up all devices.
hub = TechnicHub(broadcast_channel=78)
motor = Motor(Port.B)

# The main program starts here.
while True:
    a = randint(-1000, 1000)
    b = randint(-1000, 1000)
    c = motor.angle()
    hub.ble.broadcast([a, b, c])
    # wait(100)  # observer does not crash
    # wait(10)  # observer crashes fairly soon
    wait(1)  # observer crashes almost instantly
|
The observer does seem to error gracefully in these cases, not reboot. It's just more likely to happen when disconnected, because then observing is faster. We get errors such as:
Which seems to confirm the bad data theory. A SPIKE observer will also crash the same way, which further supports the bad Technic Hub sender idea. |
Sometimes you get bad data which, depending on how it is used, does not lead to a crash:
|
It does indeed seem to correlate with data variability: With this as the sender, the observer does not crash:

from pybricks.hubs import TechnicHub
from pybricks.parameters import Port, Color
from pybricks.pupdevices import Motor
from pybricks.tools import wait
from urandom import randint

# Set up all devices.
hub = TechnicHub(broadcast_channel=78)
motor = Motor(Port.B)

# The main program starts here.
while True:
    a = randint(0, 127)
    b = randint(0, 127)
    c = motor.angle()
    hub.light.on(Color(c))
    hub.ble.broadcast([a, b, c])
    wait(1)  # no crash

Changing the randints in the sender to:

    a = randint(0, 511)
    b = randint(0, 511)

crashes the observer right away. |
So, fast data updates in a tight loop seem fine, as long as the message format stays the same. If the size changes, the data gets corrupted. Hmm. |
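To make the size argument concrete, here is a minimal sketch that runs on a regular desktop Python, not on the hub. It assumes, as suggested earlier in this thread, that each integer is packed into the smallest signed width that fits (1, 2 or 4 bytes). `packed_int_size` and `message_size` are hypothetical helpers for illustration, not part of the Pybricks API, and the real encoding may add header bytes that are ignored here.

```python
def packed_int_size(value):
    """Return the assumed number of payload bytes for one integer.

    Assumption: the smallest signed width that fits is used (1, 2 or 4 bytes).
    """
    if -128 <= value <= 127:
        return 1
    if -32768 <= value <= 32767:
        return 2
    if -2147483648 <= value <= 2147483647:
        return 4
    raise ValueError("value does not fit in 32 bits")


def message_size(values):
    """Sum the assumed payload size of all integers in one message."""
    return sum(packed_int_size(v) for v in values)


# Extremes of randint(0, 127): every combination has the same size.
stable = {message_size([a, b]) for a in (0, 127) for b in (0, 127)}

# Samples from randint(0, 511): the size depends on the values drawn.
samples = (0, 127, 128, 511)
varying = {message_size([a, b]) for a in samples for b in samples}

print(sorted(stable))
print(sorted(varying))
```

Under that assumption, the non-crashing sender always produces messages of one size, while the crashing sender switches between several sizes from one broadcast to the next.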
Further simplifying sender to make the raw data trivial to read.

from pybricks.hubs import ThisHub
from pybricks.parameters import Port, Color
from pybricks.pupdevices import Motor
from pybricks.tools import wait
from urandom import randint

# Set up all devices.
hub = ThisHub(broadcast_channel=78)
motor = Motor(Port.B)

# The main program starts here.
while True:
    a = bytes([i for i in range(randint(4, 5))])
    hub.ble.broadcast(["ABC", a])
    wait(1)

If we log just before sending, in

Observer:

from pybricks.hubs import ThisHub
from pybricks.parameters import Color
from pybricks.tools import wait

# Set up all devices.
hub = ThisHub(observe_channels=[78])

# The main program starts here.
while True:
    data = hub.ble.observe(78)
    print(data)
    wait(10)

Receiving on Prime Hub. Which is now mixed up. It only crashes near the end because it doesn't consider everything invalid on unpacking, but there are multiple things wrong here.
|
@dlech - Bug in the CC2640 or am I missing something here? It's also possible that mixups happen all the time, but some stale data is just not crashing on unpack. (But could give bad results for any value > 1 byte if only a few bytes are stale). |
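The remark about stale bytes can be illustrated on a desktop Python with `struct`. This is only a sketch: it assumes a two-byte value is stored little-endian and that one byte of the previous message survives in the buffer. `mix_bytes` is a hypothetical helper, and nothing here describes what the firmware or the CC2640 actually does.

```python
import struct


def mix_bytes(old, new, fresh_count):
    """Simulate a partly updated buffer: leading bytes are new, the rest stale."""
    return new[:fresh_count] + old[fresh_count:]


# Two-byte case: one stale byte produces a value that was never sent.
old = struct.pack("<h", 300)   # bytes 2C 01
new = struct.pack("<h", 1000)  # bytes E8 03
mixed = mix_bytes(old, new, 1)  # bytes E8 01
(value,) = struct.unpack("<h", mixed)
print(value)

# One-byte case: any mix is either the old or the new value.
old1 = struct.pack("<b", 30)
new1 = struct.pack("<b", 100)
one_byte_results = {mix_bytes(old1, new1, n) for n in (0, 1)}
print(one_byte_results == {old1, new1})
```

In the two-byte case the unpacked result is neither 300 nor 1000, and the receiver has no way to tell that it is wrong. With one-byte values the worst case is receiving the previous value once more.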
So yes, if we randomly transmit either of
Then:
|
Could we be overriding the chip's advertising buffer while it is reading it? How might we prevent that? |
We fixed something similar already in pybricks/pybricks-micropython@462c1a6. |
Did you use a sniffer to first verify that the data going over the air is not already corrupt? |
Yes, I'm thinking the problem is in the sender, because SPIKE and Technic both have this problem when observing, but only if Technic is the sender. But as logged above, the stuff we write to the HCI still looks fine. Haven't used a sniffer because I don't have one yet. This does seem like a good starter exercise 😄 |
Laurens,

Then ran your second example with DATA1 and DATA2 with the same length.

issue_1454_sender_timer.py:

from pybricks.hubs import ThisHub
from pybricks.parameters import Port, Color
from pybricks.pupdevices import Motor
from pybricks.tools import wait
from urandom import randint

# Set up all devices.
hub = ThisHub(broadcast_channel=78)
# motor = Motor(Port.B)

DATA1 = bytes([1, 2, 3, 4])
DATA2 = bytes([5, 6, 7, 8])

# The main program starts here.
while True:
    # a = bytes([i for i in range(randint(4,5))])
    # hub.ble.broadcast(["ABC", a])
    hub.ble.broadcast(DATA1)
    wait(1)
    hub.ble.broadcast(DATA2)
    wait(1)

issue_1454_observer_timer.py:

from pybricks.hubs import ThisHub
from pybricks.parameters import Color
from pybricks.tools import wait, StopWatch

# Set up all devices.
hub = ThisHub(observe_channels=[78])
timer = StopWatch()

DATA1 = bytes([1, 2, 3, 4])
DATA2 = bytes([5, 6, 7, 8])

# The main program starts here.
while True:
    data = hub.ble.observe(78)
    if data in (DATA1, DATA2, None):
        print(end=".")
    else:
        print("\nWrong data:", data)
        print("run time:", timer.time(), "mSec\n")
        raise SystemExit()
    wait(10)

This test has now been running for over an hour and is still going.

Bert |
As promised.... (more or less) My PC rebooted involuntarily, so no final measurement, the hubs are still sending and observing. |
Thanks Bert, good to know you're seeing the same issue 🙂 |
I think we'll have to defer this to the next release, and possibly add a note to the docs to say that it is more stable if your values are meaningful in one byte increments. |
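One way to follow that advice is to scale each value into the one-byte signed range before broadcasting, so the message size never changes. The following is a minimal sketch in plain Python, assuming that one-byte integers cover -128 to 127. `to_one_byte` and `from_one_byte` are hypothetical helpers, and the default range of ±1000 is taken from the test program earlier in this thread.

```python
def to_one_byte(value, max_abs=1000):
    """Scale a value in [-max_abs, max_abs] to an integer in [-127, 127]."""
    # Clamp first so out-of-range inputs cannot overflow one byte.
    value = max(-max_abs, min(max_abs, value))
    return round(value * 127 / max_abs)


def from_one_byte(scaled, max_abs=1000):
    """Approximate inverse of to_one_byte, for use on the observing side."""
    return round(scaled * max_abs / 127)


# Sender side: every value now fits in one byte.
raw = [-1000, -250, 0, 250, 1000]
scaled = [to_one_byte(v) for v in raw]
print(scaled)

# Observer side: values are recovered with reduced resolution.
restored = [from_one_byte(s) for s in scaled]
print(restored)
```

The trade-off is resolution: with a range of ±1000, each step of the scaled value corresponds to roughly 8 units of the original, so this only suits values where that precision is enough.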
Any news on this one? I would love to see broadcasting working correctly for the upcoming LEGO events this fall :) |
It seems that the Technic Hub can occasionally send a mix of a new and previous message. So to work around it you can:
|
I've actually tried just that (but using Blocks). I have a section where there are three steps to the process, and another hub sends signals as to which step should be done next. I've then added blocks on the receiving end to check which integer was sent (0, 1 or 2), then wait for a bit and check again. If it's still the same number, then it should continue. This should work, right? It doesn't. I've had it act up and do the steps in the wrong order and at the wrong time, sometimes waiting almost 10 seconds before responding at all.

There's another part of the program where a sliding carriage in my machine should go all the way back before the next action can occur. I let the hub controlling the carriage send a signal once it detects that the motor has finished its rotations, and also double-checked that it has returned to position 0. Still, it sometimes sends a broadcast that everything is done even though it has clearly frozen and no movement has even begun. This results in the machine physically crashing into itself.

Do you think there will ever be a fix, or should I just look for other solutions? Mindstorms hubs are so expensive, but maybe they're my only way forward. |
That isn't really what I meant. Try to avoid bidirectional communication if you can. One way broadcasting and observing a series of integers works really well, for hours on end and up to large distances.
If you can describe your setup, maybe we can think of a way to do your setup where each hub either only sends or receives. |
Sorry if I sounded upset, I appreciate all the help I can get. :) Alright, I tried to make a detailed but "simple" explanation of my setup. It's quite complicated, but I tried to boil it down to the basics. The last page describes the interactions between the hubs. Sadly, as far as I can see, it doesn't feel possible to do this without communication between the hubs. The ideal would probably be to run just one program and send all commands over Bluetooth to the hubs like the official Powered Up app, but that app has so many other problems that it's way out of the question lol. Also, that would probably introduce delay issues anyway. |
That is a nice challenge :-) Perhaps it is possible to start with making only some hubs unidirectional? For example by moving both sensors to one hub? The issue we're in here has the above workarounds. For the issue about getting stuck when combining observing and advertising, maybe we can introduce a way to stop observing (like you can stop advertising) so that you can consciously choose which to use, and when. (And then you can keep it simple by having as few hubs as possible do this.) |
I managed to come up with a solution for one hub to only broadcast signals. The difference is HUGE. I only had one small hiccup during my 10 minute testing (before, I would probably have had 5). I don't think I can optimize it more than this, but having the possibility to stop observing upon request would probably fix it entirely if I manage to integrate it into my program. Let me know if such a feature would be possible :) |
Describe the bug
It appears that the Technic Hub can crash quickly or after some time when observing data.
This may or may not be related to the most recent fix in pybricks/pybricks-micropython@d13ca6f.
To reproduce
The following program appears to crash. Sometimes quickly, sometimes after some time.
I've mostly been running this without Pybricks Code connected (I was mostly programming the sending hub). When I tried it while connected, it also crashed, but I don't think there was an exception. It disconnected, so presumably it rebooted.
To be complete, here is the sending program: