Project boot time increases by ~10 seconds with v2.3.3 vs v2.2.7 #376

tmepple · 2018-09-07T14:36:28Z

Since upgrading to v2.3.3 I've noticed my project boot time increased substantially.

Now, after running mix phx.server, it takes ~2 seconds to see [info] [swarm on nonode@nohost] [tracker:init] started, and then about 10 seconds later the next messages appear:

[info] [swarm on nonode@nohost] [tracker:cluster_wait] joining cluster..
[info] [swarm on nonode@nohost] [tracker:cluster_wait] no connected nodes, proceeding without sync

After that the Phoenix endpoint is up almost immediately.

With the same project running v2.2.7 it takes ~2 seconds to see [debug] [:nonode@nohost][Elixir.Quantum.ExecutionBroadcaster] Adding job #Reference<0.3563856806.2205417474.238830> then my endpoint is up almost immediately thereafter.

The issue seems to be related to the addition of swarm from v2.3.0+ but I couldn't find a similar issue raised in that issue tracker either. I'm seeing debug log messages from swarm and quantum but none imply a problem. Maybe it's related to a network misconfiguration and swarm spends time looking for other nodes which do not exist then times out?

maennchen · 2018-09-07T17:30:13Z

@tmepple This is caused by swarm. It is the tradeoff for real global clustering that can heal from any change in the cluster topology.

The previous approach could not handle situations where multiple nodes started a scheduler before joining the cluster, nor did an active node take over scheduling from a dead cluster node (for example after a netsplit or power outage, when the node did not shut down properly).

tmepple · 2018-09-07T17:46:32Z

Definitely agree adding swarm was a good idea for the reasons given. I'm just wondering why it takes (on my fast dev machine) ~10 seconds to boot swarm without debug messages when running on a single node. Boot time is not a big deal for production but for development when there is no cluster it would be nice to speed it up. Are you seeing a similar delay on your projects? I will research swarm a bit more and see if there's an issue with that repo or if my machine is misconfigured somehow.

maennchen · 2018-09-07T18:04:26Z

@tmepple You should be able to save that time in dev if you disable global mode. Unfortunately, I can't tell you why it takes as long as you describe on a single node.

tmepple · 2018-09-08T21:07:34Z

@maennchen Hmmm... it still happens when I add global: true or global: false (which I think is the default) to the config. Do I need to use a certain run strategy or how should I configure it?
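
For reference, the setting being discussed would look roughly like the sketch below. The app name :my_app and the scheduler module MyApp.Scheduler are placeholders, not names from this thread. Note that, per the report above, on v2.3.3 this option alone did not remove the delay.

```elixir
# config/dev.exs
# Sketch only: :my_app and MyApp.Scheduler are hypothetical names;
# substitute your own OTP app and Quantum scheduler module.
use Mix.Config  # on Elixir 1.9 and later, `import Config` is used instead

config :my_app, MyApp.Scheduler,
  # Run the scheduler on the local node only, without cluster-wide scheduling.
  global: false
```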

maennchen · 2018-09-08T23:02:32Z

@tmepple I think it is caused by the ClusterTaskSupervisorRegistry. I’ll have a look if I can make it faster if global is false.

madshargreave · 2019-01-02T11:39:13Z

@c-rack Any ETA on the above commit being released? Awesome work

c-rack · 2019-01-02T21:30:25Z

@madshargreave I could make a release on the weekend.

@maennchen do you agree, or do you want to do the release yourself?

maennchen · 2019-01-03T06:16:11Z

@c-rack I wanted to do the next release together with the clustering fixes. Since that seems to be taking ages, it is a good idea to release anyway.

You're welcome to do a release.

maennchen self-assigned this Sep 7, 2018

maennchen added the question label Sep 7, 2018

tmepple closed this as completed Sep 7, 2018

maennchen reopened this Sep 8, 2018

maennchen mentioned this issue Nov 21, 2018

Solution: Faster startup for non-global #383

Merged

c-rack pushed a commit that referenced this issue Nov 22, 2018

Solution: Faster startup for non-global (#376) (#383)

7c2fa0c

c-rack added a commit that referenced this issue Jan 6, 2019

Release 2.3.4

2a84a01

Fixes #376

c-rack mentioned this issue Jan 6, 2019

Release 2.3.4 #391

Merged

c-rack closed this as completed in #391 Jan 7, 2019
