[darwin_framework_tool] Add a shortcut (CTL('^')) to restart the stac… #22268
I suspect that #22245 should have fixed the second crash (the one with the stack trace), but I have not tried it yet. It seems there is still a crash, but it happens a little later now:
PR #22268: Size comparison from 9bde9e3 to 9a804b4
Increases (3 builds for cc13x2_26x2, telink)
Decreases (4 builds for bl602, cc13x2_26x2, nrfconnect, psoc6)
Full report (43 builds for bl602, cc13x2_26x2, cyw30739, efr32, esp32, k32w, linux, mbed, nrfconnect, psoc6, telink)
Accepted for 1.0: changes in
examples/darwin-framework-tool/commands/common/CHIPCommandBridge.mm
Fwiw, just tested using the above excellent steps to reproduce, and I can reproduce the crash and can also confirm that #22282 does not fix it.
Before project-chip#21256 AutoCommissioner used the operational proxy if it existed at all. This could happen even if it was disconnected, as long as it had been connected at some point in the past.
This was accidentally changed to "use the operational proxy only if it's connected" in project-chip#21256.
This can lead to a crash, as described in project-chip#22268 (comment), if shutdown happens after the operational proxy is connected but before we get a response to CommissioningComplete. In that case, we will evict our CASE session, which will error out the CommissioningComplete command we sent and try to clean up, but it will select the (now dangling!) mCommissioneeDeviceProxy instead of correctly selecting mOperationalDeviceProxy, because the mOperationalDeviceProxy no longer has a session at that point.
The fix is to check for an "initialized" (in the sense that it has a valid peer node id) mOperationalDeviceProxy instead of checking for a connected one. This matches the semantics of the check we used to have before project-chip#21256.
Fixes project-chip#22293
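The difference between the two checks can be illustrated with a minimal sketch. The types and function names below (ProxySketch, SelectByConnected, SelectByInitialized, kUndefinedNodeIdSketch) are hypothetical stand-ins invented for illustration; they are not the SDK's classes or the actual AutoCommissioner code, and the node id value 0 as "undefined" is an assumption of the sketch.

```cpp
#include <cstdint>

// Hypothetical sentinel for "no peer node id assigned"; an assumption of this sketch.
constexpr uint64_t kUndefinedNodeIdSketch = 0;

// Hypothetical stand-in for a device proxy; not the SDK's real type.
struct ProxySketch
{
    uint64_t peerNodeId = kUndefinedNodeIdSketch; // valid once the proxy has been initialized
    bool hasSession     = false;                  // becomes false again when the session is evicted
};

// Check as described for the regression: the operational proxy is used only
// while it is connected, so after session eviction the commissionee proxy
// (possibly dangling in the real scenario) is selected instead.
inline const ProxySketch * SelectByConnected(const ProxySketch & operational, const ProxySketch * commissionee)
{
    return operational.hasSession ? &operational : commissionee;
}

// Check as described for the fix: the operational proxy is used whenever it
// has a valid peer node id, whether or not it still has a session.
inline const ProxySketch * SelectByInitialized(const ProxySketch & operational, const ProxySketch * commissionee)
{
    return operational.peerNodeId != kUndefinedNodeIdSketch ? &operational : commissionee;
}
```

With an operational proxy that has a peer node id but has lost its session, SelectByConnected falls back to the commissionee proxy while SelectByInitialized keeps returning the operational one, which is the behaviour the commit message describes restoring.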
#22294 will fix that.
…k while in interactive mode
9a804b4 to ac532d1
PR #22268: Size comparison from 0a36d9f to ac532d1
Increases (1 build for cc13x2_26x2)
Decreases (9 builds for bl602, cc13x2_26x2, nrfconnect, psoc6, qpg, telink)
Full report (34 builds for bl602, cc13x2_26x2, cyw30739, efr32, esp32, k32w, linux, mbed, nrfconnect, psoc6, qpg, telink)
…k while in interactive mode (project-chip#22268)
Problem
I'm trying to reproduce some crashes that happen on darwin. Some of the crashes seem related to the stack shutting down.
With the current patch and the following steps I can reproduce 2 crashes:
The "right time" is hard to get since operations are going very fast locally.
To make it easier to get some crashes, one can drop some sleep at the right place on the server side.
For example, sleep(5) at the beginning of emberAfGeneralCommissioningClusterArmFailSafeCallback should allow reproducing #21811.
But adding sleep(5) at the beginning of emberAfGeneralCommissioningClusterCommissioningCompleteCallback should get you a different stack. It is pretty similar to #21130 but does not come from a timer firing.
For reference the stack is:
Change overview
Add a shortcut (CTL('^')) to darwin-framework-tool in order to restart the stack. Using a shortcut to make it fast.
Testing
As explained in the description, I have used this setup to reproduce some crashes.
I am pretty sure that those are not related to darwin; one may implement a similar mechanism for chip-tool.
I would like to land that in master as I believe it does not affect the core SDK but only provides another way of reproducing some SDK behaviour via a tool.
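For readers unfamiliar with the CTL('^') notation in the title, here is a minimal sketch of how a control-key shortcut can be recognized and routed to a restart action. This is not the code from the PR: ControlKey, InteractiveSketch, HandleKey and the restartStack callback are hypothetical names, and the assumption is that the control code is derived by masking the character with 0x1F, as terminal line-editing libraries conventionally do.

```cpp
#include <functional>
#include <utility>

// Assumed convention: a control key is the printable character masked to its low 5 bits.
constexpr char ControlKey(char c)
{
    return static_cast<char>(c & 0x1F);
}

// '^' is 0x5E, so Ctrl-^ is 0x1E (ASCII record separator).
static_assert(ControlKey('^') == 0x1E, "Ctrl-^ should map to 0x1E");

// Hypothetical interactive loop helper; not the tool's real class.
class InteractiveSketch
{
public:
    explicit InteractiveSketch(std::function<void()> restartStack) : mRestartStack(std::move(restartStack)) {}

    // Returns true when the key was consumed as the restart shortcut,
    // false when it should be handled as ordinary input.
    bool HandleKey(char key)
    {
        if (key != ControlKey('^'))
        {
            return false;
        }
        if (mRestartStack)
        {
            mRestartStack(); // hypothetical hook: shut the stack down and start it again
        }
        return true;
    }

private:
    std::function<void()> mRestartStack;
};
```

Routing the shortcut through a callback keeps the key handling independent of how the stack is actually restarted, so the same idea could be reused for a chip-tool equivalent.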