Project boot time increases by ~10 seconds with v2.3.3 vs v2.2.7 #376

Closed

tmepple opened this issue Sep 7, 2018 · 8 comments


tmepple commented Sep 7, 2018

Since upgrading to v2.3.3 I've noticed my project boot time increased substantially.

Now, after running mix phx.server, it takes ~2 seconds to see [info] [swarm on nonode@nohost] [tracker:init] started, and then about 10 seconds later the next messages appear:

[info] [swarm on nonode@nohost] [tracker:cluster_wait] joining cluster..
[info] [swarm on nonode@nohost] [tracker:cluster_wait] no connected nodes, proceeding without sync

After that the Phoenix endpoint is up almost immediately.

With the same project running v2.2.7, it takes ~2 seconds to see [debug] [:nonode@nohost][Elixir.Quantum.ExecutionBroadcaster] Adding job #Reference<0.3563856806.2205417474.238830>, and my endpoint is up almost immediately thereafter.

The issue seems to be related to the addition of swarm in v2.3.0+, but I couldn't find a similar issue in swarm's issue tracker either. I'm seeing debug log messages from swarm and quantum, but none of them imply a problem. Maybe it's related to a network misconfiguration, and swarm spends time looking for other nodes that don't exist before timing out?
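
For reference, a quick sanity check from IEx (just a diagnostic sketch, assuming the project is started with iex -S mix phx.server) to confirm whether the node is distributed at all and whether any peers are visible:

```elixir
# Diagnostic sketch: confirm whether the node is distributed and whether
# any peers are visible that swarm could be waiting on.
IO.inspect(Node.self(), label: "self")      # :nonode@nohost means distribution is off
IO.inspect(Node.alive?(), label: "alive?")  # false unless started with --sname/--name
IO.inspect(Node.list(), label: "connected") # [] means there is nothing to sync with
```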

@maennchen (Member)

@tmepple This is caused by swarm. It is the tradeoff for real global clustering that can heal from any change in the cluster topology.

The previous approach could not handle situations where multiple nodes started a scheduler before joining the cluster, nor did an active node take over scheduling from a dead cluster node (e.g. after a netsplit or power outage, when the node did not shut down properly).

@maennchen maennchen self-assigned this Sep 7, 2018

tmepple commented Sep 7, 2018

I definitely agree that adding swarm was a good idea for the reasons given. I'm just wondering why booting swarm takes ~10 seconds (on my fast dev machine) on a single node, with no debug messages explaining the delay. Boot time is not a big deal for production, but for development, when there is no cluster, it would be nice to speed it up. Are you seeing a similar delay on your projects? I will research swarm a bit more and see if there's an issue in that repo or if my machine is misconfigured somehow.
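
One way to confirm that the time really goes into swarm's startup (a diagnostic sketch, not something measured in this thread) is to boot the VM without the application and time :swarm on its own:

```elixir
# Diagnostic sketch: start the VM without the application
# (iex -S mix run --no-start), then time swarm's startup by itself.
{micros, result} = :timer.tc(fn -> Application.ensure_all_started(:swarm) end)
IO.puts("swarm started in #{div(micros, 1_000)} ms: #{inspect(result)}")
```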

@tmepple tmepple closed this as completed Sep 7, 2018
@maennchen (Member)

@tmepple You should be able to save that time in dev if you disable global mode. Why it takes as long as you describe on a single node, I unfortunately can't tell you.
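
Disabling global mode per environment would look roughly like this (a sketch only; MyApp.Scheduler and :my_app are placeholders for your own scheduler module and OTP app):

```elixir
# config/dev.exs — sketch only; MyApp.Scheduler and :my_app are placeholders.
use Mix.Config

# global: false keeps the scheduler local to this node instead of
# coordinating job execution across the cluster.
config :my_app, MyApp.Scheduler,
  global: false
```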


tmepple commented Sep 8, 2018

@maennchen Hmmm... it still happens whether I set global: true or global: false (which I think is the default) in the config. Do I need to use a certain run strategy, or how should I configure it?

@maennchen (Member)

@tmepple I think it is caused by the ClusterTaskSupervisorRegistry. I'll have a look at whether I can make it faster when global is false.
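
As a side note, what ends up registered via swarm can be inspected at runtime (a sketch; run in the project's IEx session once the app is up):

```elixir
# Diagnostic sketch: list everything currently registered via swarm.
# On a single node one would expect very little, if anything, here.
Swarm.registered()
|> Enum.each(fn {name, pid} -> IO.puts("#{inspect(name)} -> #{inspect(pid)}") end)
```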

@madshargreave

@c-rack Any ETA on the above commit being released? Awesome work


c-rack commented Jan 2, 2019

@madshargreave I could make a release over the weekend.

@maennchen, do you agree, or do you want to do the release yourself?

@maennchen (Member)

@c-rack I wanted to do the next release together with the clustering fixes. Since those seem to be taking ages, it's probably a good idea to release anyway.

You're welcome to do a release.

c-rack added a commit that referenced this issue Jan 6, 2019
c-rack mentioned this issue Jan 6, 2019