rabbit_quorum_queue: Wait for member add in add_member/4
[Why]
The `ra:add_member/3` call returns before the change is committed. This
is fine for that addition, but any follow-up change to the cluster might
be rejected with the `cluster_change_not_permitted` error.

[How]
Instead of changing other places to wait or retry their cluster
membership change, this patch waits for the current add to be applied
before proceeding and returning.

This fixes some transient failures in CI where such follow-up changes
are rejected and not retried, leaving the cluster in an unexpected state
for the testcase.

One affected testcase is
`quorum_queue_SUITE:force_shrink_member_to_current_member/1`.
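
For contrast, the alternative that [How] decides against, retrying the membership change at each call site, might look roughly like the sketch below. The module and function names are hypothetical and exist in neither RabbitMQ nor Ra, and the `{error, cluster_change_not_permitted}` tuple shape is an assumption based on the error name quoted above.

```erlang
-module(cluster_change_retry).
-export([retry_cluster_change/3]).

%% Hypothetical helper (not part of RabbitMQ or Ra).
%%
%% Calls Fun() and, as long as it returns
%% {error, cluster_change_not_permitted} (assumed error shape), sleeps
%% DelayMs milliseconds and tries again, up to Retries extra attempts.
%% Any other result, success or failure, is returned unchanged.
-spec retry_cluster_change(fun(() -> term()),
                           non_neg_integer(),
                           non_neg_integer()) -> term().
retry_cluster_change(Fun, Retries, DelayMs)
  when is_function(Fun, 0), is_integer(Retries), Retries >= 0 ->
    case Fun() of
        {error, cluster_change_not_permitted} when Retries > 0 ->
            timer:sleep(DelayMs),
            retry_cluster_change(Fun, Retries - 1, DelayMs);
        Other ->
            %% Either the change was accepted, it failed for another
            %% reason, or the retry budget is exhausted.
            Other
    end.
```

Every caller that changes cluster membership would have to wrap its call in such a helper, which is the per-call-site burden this patch avoids by waiting once inside `add_member/4`.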
dumbbell committed Nov 28, 2024
1 parent c4d9a84 commit 99d8e90
Showing 1 changed file with 14 additions and 1 deletion.
15 changes: 14 additions & 1 deletion deps/rabbit/src/rabbit_quorum_queue.erl
@@ -1346,14 +1346,27 @@ add_member(Q, Node, Membership, Timeout) when ?amqqueue_is_quorum(Q) ->
                     maps:get(id, Conf)
             end,
             case ra:add_member(Members, ServerIdSpec, Timeout) of
-                {ok, _, Leader} ->
+                {ok, {RaIndex, RaTerm}, Leader} ->
                     Fun = fun(Q1) ->
                                   Q2 = update_type_state(
                                          Q1, fun(#{nodes := Nodes} = Ts) ->
                                                      Ts#{nodes => [Node | Nodes]}
                                              end),
                                   amqqueue:set_pid(Q2, Leader)
                           end,
+                    %% The `ra:add_member/3` call above returns before the
+                    %% change is committed. This is ok for that addition but
+                    %% any follow-up changes to the cluster might be rejected
+                    %% with the `cluster_change_not_permitted` error.
+                    %%
+                    %% Instead of changing other places to wait or retry their
+                    %% cluster membership change, we wait for the current add
+                    %% to be applied using a conditional leader query before
+                    %% proceeding and returning.
+                    {ok, _, _} = ra:leader_query(
+                                   Leader,
+                                   {erlang, is_list, []},
+                                   #{condition => {applied, {RaIndex, RaTerm}}}),
                     _ = rabbit_amqqueue:update(QName, Fun),
                     rabbit_log:info("Added a replica of quorum ~ts on node ~ts", [rabbit_misc:rs(QName), Node]),
                     ok;