ConnectionRecovery recovers models before consumers #1076

bollhals · 2021-08-26T12:09:00Z

I have reproduced a state in our application where connection recovery leads to an unrecoverable error.

Situation:

One AutorecoveringConnection with one channel
The connection dropped (the cable was unplugged)
The recovery procedure starts its retry loop
The cable is plugged in again
The connection can be established again, and recovery starts

Now the fun begins...
While recovery is executing these lines, the application tries to publish an RPC message.

Since the models have already been recovered (Line 420), the publish actually succeeds, but the server will respond with PRECONDITION_FAILED - fast reply consumer does not exist because the consumer for the replies has not yet been recovered. (Line 424)

The outcome is that the channel is closed and rendered unusable, even though the connection recovery succeeds.
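To make the ordering problem concrete, here is a minimal toy model in Python. It is only a sketch: `ToyChannel`, `recover`, and `simulate_race` are hypothetical names, not the client's API, and the broker's behavior is reduced to a single check. It assumes, as described above, that the channel is reopened first and the reply consumer is re-registered afterwards, with a window in between where application code can publish.

```python
from dataclasses import dataclass


class PreconditionFailed(Exception):
    """Stands in for the broker's channel-level PRECONDITION_FAILED error."""


@dataclass
class ToyChannel:
    # Hypothetical stand-in for a channel; not the real client type.
    is_open: bool = False
    reply_consumer_registered: bool = False

    def publish_rpc(self) -> str:
        if not self.is_open:
            return "rejected: channel not open"
        if not self.reply_consumer_registered:
            # A channel-level error closes the channel.
            self.is_open = False
            raise PreconditionFailed("fast reply consumer does not exist")
        return "published"


def recover(channel: ToyChannel, between_steps=None) -> None:
    channel.is_open = True                  # step 1: model is recovered
    if between_steps is not None:
        between_steps()                     # window where the application can publish
    if channel.is_open:
        channel.reply_consumer_registered = True  # step 2: consumer is recovered


def simulate_race():
    channel = ToyChannel()
    outcome = []

    def racing_publish():
        try:
            outcome.append(channel.publish_rpc())
        except PreconditionFailed as exc:
            outcome.append(f"failed: {exc}")

    recover(channel, between_steps=racing_publish)
    return outcome[0], channel.is_open
```

In this model, `simulate_race()` reports the failed publish and a closed channel, while calling `recover(channel)` with no publish in the window leaves the channel usable for `publish_rpc()`.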

So the question is, should the AutorecoveringModel already be usable again prior to the recovery being finished?

This is somewhat related to #1061, as it touches a similar area.
If a temporary model restored the topology, then at least the time between the recovery of the models and the recovery of the consumers would be shortened. (The race would still be possible, though.)


michaelklishin · 2021-08-26T13:30:13Z

I would say that considering a recovering channel usable would generally be wrong, but there is no realistic way for us to block all other operations (that would be highly surprising after all these years).
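Since the library cannot block operations itself, an application could hold back its own publishes while recovery is in progress. The following Python sketch shows that idea only. `RecoveryGate` and its method names are hypothetical, and it assumes the application has some way to learn when recovery starts and finishes.

```python
import threading


class RecoveryGate:
    """Hypothetical helper: holds back publishers while recovery is in progress."""

    def __init__(self):
        self._ready = threading.Event()
        self._ready.set()  # no recovery in progress initially

    def recovery_started(self):
        # Call this when the application learns the connection was lost.
        self._ready.clear()

    def recovery_finished(self):
        # Call this only after channels AND consumers have been recovered.
        self._ready.set()

    def publish(self, send, timeout=5.0):
        # Wait until recovery has finished; give up after `timeout` seconds.
        if not self._ready.wait(timeout):
            raise TimeoutError("recovery did not finish in time")
        return send()
```

A gate like this only narrows the window: the connection can still drop between the check and the actual send, so the publish path needs its own error handling as well.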

michaelklishin · 2021-08-26T13:31:39Z

The "fast reply consumer" is a message from the Direct Reply-to mechanism. It's not a real consumer and it does not use a real queue. It relies on a convention in the channel state, so as channels are restarted, that state is completely lost. All this happens on the RabbitMQ node end.

bollhals · 2021-08-26T20:23:04Z

> The "fast reply consumer" is a message from the Direct Reply-to mechanism. It's not a real consumer and it does not use a real queue. It relies on a convention in the channel state, so as channels are restarted, that state is completely lost. All this happens on the RabbitMQ node end.

Yes, but from the client's point of view it's an active consumer that needs to be recovered before an RPC message can be published successfully. I want to have a look and check whether something could be done to improve it.

bollhals · 2021-09-06T06:05:57Z

This is now fixed on the 6.x branch and on main.

Would it be possible to do a bugfix release for 6.x and then close this issue?

michaelklishin · 2021-09-06T06:51:27Z

I'll try to do it today. There were some pipeline changes forced upon us (e.g. expiring signing keys) so I cannot be certain releasing would just work.

michaelklishin · 2021-09-14T16:22:50Z

We had a go at doing a release; it required rebuilding an internally used Docker image. Hopefully we can finalise it later this week. Sorry about the wait.

bollhals · 2021-09-20T14:00:49Z

Would it be possible to give a rough timeframe? (We're planning a release of our product, in which we hit this issue.)

This was referenced Aug 28, 2021

Recover models together with consumers #1077

Merged

fix topology recovery #1081

Merged

lukebakken self-assigned this Nov 18, 2023

lukebakken closed this as completed Nov 18, 2023
