Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Carrier crashes if a relay has been created but was previously rejected #1034

Closed
vanstee opened this issue Oct 11, 2016 · 2 comments
Closed
Assignees
Labels
Milestone

Comments

@vanstee
Copy link
Member

vanstee commented Oct 11, 2016

To repro this bug, start Cog without creating a Relay. Then start Relay and watch it try to connect to Cog; you should see a Cannot SUBSCRIBE ... ACL Deny in the logs. Now, without restarting Relay, create a relay for that Relay's id (cogctl relays create ...) and you should see Carrier crash with the following output.

2016-10-11T11:38:59.0897  (:emqttd_protocol:205) [error] Client([email protected]:49755): Cannot SUBSCRIBE [{<<"bot/relays/555a39cc-8fc8-11e6-b4ff-fff5ae5198c2/announcer">>,1}] for ACL Deny
2016-10-11T11:38:59.0899  (:emqttd_protocol:205) [error] Client([email protected]:49754): Cannot SUBSCRIBE [{<<"bot/relays/555a39cc-8fc8-11e6-b4ff-fff5ae5198c2/directives">>,1}] for ACL Deny
2016-10-11T11:38:59.0901  (:emqttd_protocol:205) [error] Client([email protected]:49754): Cannot SUBSCRIBE [{<<"/bot/commands/555a39cc-8fc8-11e6-b4ff-fff5ae5198c2/#">>,1}] for ACL Deny
2016-10-11T11:38:59.0951  (Cog.Relay.Info:33) [info] Starting relay information service
2016-10-11T11:38:59.0968  () [error] GenServer Cog.Relay.Info terminating
** (FunctionClauseError) no function clause matching in Carrier.Messaging.Connection.publish/3
    (cog) lib/carrier/messaging/connection.ex:147: Carrier.Messaging.Connection.publish(%Carrier.Messaging.Connection{call_reply: "carrier/call/reply/bd9d2ad7488e40c28514a3c713a213cd", conn: #PID<0.617.0>, id: "bd9d2ad7488e40c28514
a3c713a213cd"}, %{error: "Relay with id 555a39cc-8fc8-11e6-b4ff-fff5ae5198c2 was not recognized."}, [routed_by: "bot/relays/555a39cc-8fc8-11e6-b4ff-fff5ae5198c2/directives"])
    (cog) lib/cog/relay/info.ex:48: Cog.Relay.Info.handle_info/2
    (stdlib) gen_server.erl:615: :gen_server.try_dispatch/4
    (stdlib) gen_server.erl:681: :gen_server.handle_msg/5
    (stdlib) proc_lib.erl:240: :proc_lib.init_p_do_apply/3
Last message: {:publish, "bot/relays/info", "{\"list_bundles\":{\"relay_id\":\"555a39cc-8fc8-11e6-b4ff-fff5ae5198c2\",\"reply_to\":\"bot/relays/555a39cc-8fc8-11e6-b4ff-fff5ae5198c2/directives\"}}"}
State: %Cog.Relay.Info{mq_conn: %Carrier.Messaging.Connection{call_reply: "carrier/call/reply/bd9d2ad7488e40c28514a3c713a213cd", conn: #PID<0.617.0>, id: "bd9d2ad7488e40c28514a3c713a213cd"}}
2016-10-11T11:38:59.0992  () [error] ** State machine #PID<0.617.0> terminating 
** Last message in was {:EXIT, #PID<0.616.0>,
 {:function_clause,
  [{Carrier.Messaging.Connection, :publish,
    [%Carrier.Messaging.Connection{call_reply: "carrier/call/reply/bd9d2ad7488e40c28514a3c713a213cd",
      conn: #PID<0.617.0>, id: "bd9d2ad7488e40c28514a3c713a213cd"},
     %{error: "Relay with id 555a39cc-8fc8-11e6-b4ff-fff5ae5198c2 was not recognized."},
     [routed_by: "bot/relays/555a39cc-8fc8-11e6-b4ff-fff5ae5198c2/directives"]],
    [file: 'lib/carrier/messaging/connection.ex', line: 147]},
   {Cog.Relay.Info, :handle_info, 2, [file: 'lib/cog/relay/info.ex', line: 48]},
   {:gen_server, :try_dispatch, 4, [file: 'gen_server.erl', line: 615]},
   {:gen_server, :handle_msg, 5, [file: 'gen_server.erl', line: 681]},
   {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 240]}]}}
** When State == :connected
**      Data  == {:state, #PID<0.616.0>, '<0.617.0>', {127, 0, 0, 1}, 1883, #Port<0.55821>,
 #PID<0.618.0>,
 {:proto_state, #Port<0.55821>, '127.0.0.1:64237', 4, "MQTT",
  "emqttc_Patricks-MacBook-Pro_831d87529dcbe7737690", true, 90, false,
  {:mqtt_message, 0, false, false, :undefined, :undefined, :undefined},
  :undefined, :undefined, 3,
  %{"bot/relays/info" => 1,
    "carrier/call/reply/bd9d2ad7488e40c28514a3c713a213cd" => 0}, %{}, %{}, %{},
  {:gen_logger, :lager_logger, 8}}, [{#PID<0.616.0>, #Reference<0.0.1.1774>}],
 %{"bot/relays/info" => {1, [#PID<0.616.0>]},
   "carrier/call/reply/bd9d2ad7488e40c28514a3c713a213cd" => {0,
    [#PID<0.616.0>]}}, [], [], %{}, 2, false,
 {:keepalive, #Port<0.55821>, :send_oct, 190, 90, {:keepalive, :timeout},
  #Reference<0.0.1.47798>}, 90, 60, 4, 8, :undefined, :tcp, :undefined,
 {:gen_logger, :lager_logger, 8}, [], []}
** Reason for termination = 
** {:function_clause,
 [{Carrier.Messaging.Connection, :publish,
   [%Carrier.Messaging.Connection{call_reply: "carrier/call/reply/bd9d2ad7488e40c28514a3c713a213cd",
     conn: #PID<0.617.0>, id: "bd9d2ad7488e40c28514a3c713a213cd"},
    %{error: "Relay with id 555a39cc-8fc8-11e6-b4ff-fff5ae5198c2 was not recognized."},
    [routed_by: "bot/relays/555a39cc-8fc8-11e6-b4ff-fff5ae5198c2/directives"]],
   [file: 'lib/carrier/messaging/connection.ex', line: 147]},
  {Cog.Relay.Info, :handle_info, 2, [file: 'lib/cog/relay/info.ex', line: 48]},
  {:gen_server, :try_dispatch, 4, [file: 'gen_server.erl', line: 615]},
  {:gen_server, :handle_msg, 5, [file: 'gen_server.erl', line: 681]},
  {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 240]}]}
@vanstee vanstee added the bug label Oct 11, 2016
@vanstee vanstee added this to the Cog 0.16 milestone Oct 11, 2016
@christophermaier christophermaier self-assigned this Oct 13, 2016
@christophermaier
Copy link
Collaborator

I seem to be able to replicate the crash just by starting up a relay that Cog doesn't know about!

@kevsmith kevsmith modified the milestones: Cog 0.15.2, Cog 0.16 Oct 14, 2016
@christophermaier
Copy link
Collaborator

It looks like the real issue is that Relays connecting to the bus from the same machine, such that their peer host is 127.0.0.1 end up being let through (see

{{127, 0, 0, 1}, _} ->
). This is why I get a crash when running Cog and Relay locally, prior to registering the Relay; they're on the same host, so the Relay is allowed to connect. The crash comes later from code that doesn't expect to be dealing with unregistered Relays.

The intent of the 127.0.0.1 matching condition was to let the clients Cog itself starts up to connect to the bus unencumbered. Locally I've tweaked the criteria to allow only clients through whose IDs start with "emqttc" (just for testing; that's not a fix) and the proper behavior is demonstrated. Cog's connections go through (because they're initiated from the emqttc library, and that's what its client IDs start with), but unregistered Relay connections (which are not implemented using emqttc) are properly denied.

I'm going to think a bit on how to get a better demarcation between internal Cog message bus clients and everybody else.

christophermaier added a commit that referenced this issue Oct 14, 2016
Previously, only connections coming from Relays had usernames and
passwords associated with them; any connection from Cog itself had
neither. We would distinguish between them by examining which host the
connection was coming from; if it was 127.0.0.1, we treated it like it
was a Cog connection, and like a Relay connection (which needed to be
authenticated) otherwise.

This would cause Cog to crash if a Relay process was running on the same
host as Cog, and if it tried to connect prior to being registered with
Cog.

Now, _all_ connections have a username and password associated with
them. All Cog-initiated connections have a username of `COG_INTERNAL`,
and use a randomly-generated password that is generated at Cog startup,
allowing us to properly distinguish between Cog connections and Relay
connections.

Fixes #1034
christophermaier added a commit that referenced this issue Oct 17, 2016
Previously, only connections coming from Relays had usernames and
passwords associated with them; any connection from Cog itself had
neither. We would distinguish between them by examining which host the
connection was coming from; if it was 127.0.0.1, we treated it like it
was a Cog connection, and like a Relay connection (which needed to be
authenticated) otherwise.

This would cause Cog to crash if a Relay process was running on the same
host as Cog, and if it tried to connect prior to being registered with
Cog.

Now, _all_ connections have a username and password associated with
them. All Cog-initiated connections have a username of `COG_INTERNAL`,
and use a randomly-generated password that is generated at Cog startup,
allowing us to properly distinguish between Cog connections and Relay
connections.

Fixes #1034
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants