Binding manager thinks it can connect to multiple things in parallel, but it can't #21606
Why is this labeled 'spec'? |
Because:
which would be a spec violation, no? |
Had a quick test on the Linux platform. If we initiate two concurrent CASE sessions, the second connection will get stuck at I'd suggest adding a queue for the connections in the CASESessionManager. |
@gjc13 can you please upload a sample of your test? I might be able to take a look sometime this week, and it would be helpful to have a test in hand to reproduce the issue. |
This does not line up with my understanding of this issue at all. CASESessionManager is capable of handling multiple connections at once. You can totally call From my understanding the issue here is that when |
@tehampson : |
Chatted with @tehampson about this. He's going to run some experiments to see if this issue does indeed exist. |
Just want to quickly touch on Based on discussion with Jerry, arguably when someone calls Based on a code audit there is no issue with calling
To see if there really is an issue in practice I added two hacks:
This sequence I did (I am documenting this for my own future reference):
What I discovered is that we only ever call While auditing code for this issue I have spotted some use after free and a buffer overrun. So I am going to add these fixes as well to the issue mentioned above. PR should be up soon |
@tehampson I'd recommend trying with at least two instances of the 'bulb' that the switch is controlling, and then having two entries in the bindings cluster pointing to two different nodes. Also, I'd recommend trying both models. In the first, the switch boots up first and fails to connect to the two bulbs, and THEN the two bulbs are launched (this prevents the initial CASE session establishment from succeeding, so that we can observe the establishment process at the time the switch is flipped). In the second, the bulbs boot up first, then the switch, to observe warmed CASE session usage. |
Okay so when I try talking to two different nodes that are not primed I am getting the issue. The issue has to do with |
Right, the key is that a |
Problem
BindingManager::EstablishConnection does:

This will move the mOnConnectedCallback and mOnConnectionFailureCallback callbacks to the callback list of the new session establishment process, if there is no session already.

We call BindingManager::EstablishConnection in a loop over bindings in BindingManager::NotifyBoundClusterChanged. This means we will only get notified for the last connection we started, not any of the other ones.

In many cases this might be getting covered up by connection attempts that complete synchronously, because we have pre-warmed CASE sessions... But it can lead to messages not being sent to some bindings if we have no pre-warmed sessions around.
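The failure mode described above can be illustrated with a minimal, self-contained sketch. It is not the SDK code: CallbackNode, PendingSession and PeersThatWillBeNotified are hypothetical names, and the sketch only assumes what the issue states, namely that a single callback object is moved onto the callback list of each new session establishment attempt and so can sit on only one list at a time.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Hypothetical stand-in for a callback object that can be linked into at
// most one pending list at a time.
struct CallbackNode {
    std::list<CallbackNode*>* owner = nullptr;

    // Unlink this node from whichever list currently holds it.
    void Cancel() {
        if (owner != nullptr) {
            owner->remove(this);
            owner = nullptr;
        }
    }
};

// Hypothetical stand-in for one in-flight session establishment attempt.
struct PendingSession {
    std::string peer;
    std::list<CallbackNode*> waiters;

    void Enqueue(CallbackNode& cb) {
        cb.Cancel();  // moving the node silently drops it from any earlier attempt
        assert(cb.owner == nullptr);
        waiters.push_back(&cb);
        cb.owner = &waiters;
    }
};

// Start one attempt per peer while sharing a single callback node, then
// report which peers still have a waiter that would be notified.
std::vector<std::string> PeersThatWillBeNotified(const std::vector<std::string>& peers) {
    CallbackNode sharedOnConnected;      // one shared member callback, as in the issue
    std::list<PendingSession> sessions;  // std::list keeps element addresses stable
    for (const auto& peer : peers) {
        sessions.push_back(PendingSession{peer, {}});
        sessions.back().Enqueue(sharedOnConnected);
    }
    std::vector<std::string> notified;
    for (const auto& session : sessions) {
        if (!session.waiters.empty()) {
            notified.push_back(session.peer);
        }
    }
    return notified;
}
```

Under these assumptions, calling PeersThatWillBeNotified with two peers returns only the second one, which matches the observation that only the last connection attempt is ever reported back.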
Proposed Solution
Move the connection callbacks to somewhere that is per-connection-attempt. Perhaps the pending notification? Or some other data structure?
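One possible shape for "per-connection-attempt" callbacks is sketched below. This is only an illustration under assumptions, not a proposed patch: ConnectionAttempt, PerAttemptConnector, CompleteAttempt and ConnectToAll are hypothetical names, and std::function stands in for whatever callback type the real code would use.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record: each connection attempt owns its own callbacks,
// so starting a second attempt cannot steal them from the first.
struct ConnectionAttempt {
    std::function<void(const std::string&)> onConnected;
    std::function<void(const std::string&)> onFailure;
};

class PerAttemptConnector {
public:
    // Register a new attempt for this peer; nothing is shared between attempts.
    void EstablishConnection(const std::string& peer, ConnectionAttempt attempt) {
        assert(attempt.onConnected && attempt.onFailure);  // both callbacks are required
        mPending[peer].push_back(std::move(attempt));
    }

    // Simulate the asynchronous completion of session establishment for a peer.
    void CompleteAttempt(const std::string& peer, bool success) {
        auto it = mPending.find(peer);
        if (it == mPending.end()) {
            return;
        }
        // Take ownership first so callbacks may safely start new attempts.
        std::vector<ConnectionAttempt> attempts = std::move(it->second);
        mPending.erase(it);
        for (auto& attempt : attempts) {
            (success ? attempt.onConnected : attempt.onFailure)(peer);
        }
    }

private:
    std::map<std::string, std::vector<ConnectionAttempt>> mPending;
};

// Start an attempt for every peer, complete them all, and return the peers
// whose onConnected callback actually fired.
std::vector<std::string> ConnectToAll(const std::vector<std::string>& peers) {
    PerAttemptConnector connector;
    std::vector<std::string> connected;
    for (const auto& peer : peers) {
        connector.EstablishConnection(peer, ConnectionAttempt{
            [&connected](const std::string& p) { connected.push_back(p); },
            [](const std::string&) {}});
    }
    for (const auto& peer : peers) {
        connector.CompleteAttempt(peer, true);
    }
    return connected;
}
```

With this layout every peer in the loop gets its own notification. Whether the per-attempt record should live in the pending notification or in a separate structure is left open, as in the proposal.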