Project boot time increases by ~10 seconds with v2.3.3 vs v2.2.7 #376
Comments
@tmepple This is caused by swarm. It is the tradeoff for real global clustering that can heal from any change in the cluster topology. The previous approach could not handle situations where multiple nodes started a scheduler before joining the cluster, nor did an active node take over scheduling from a dead cluster node (for example after a netsplit or power outage, when the node did not shut down properly).
Definitely agree adding swarm was a good idea for the reasons given. I'm just wondering why it takes (on my fast dev machine) ~10 seconds to boot swarm without debug messages when running on a single node. Boot time is not a big deal for production but for development when there is no cluster it would be nice to speed it up. Are you seeing a similar delay on your projects? I will research swarm a bit more and see if there's an issue with that repo or if my machine is misconfigured somehow.
@tmepple You should be able to save that time in dev if you disable global mode. Unfortunately, I cannot tell you why it takes as long as you describe on a single node.
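For readers following along, disabling global mode in development might look like the sketch below. The application name `:my_app` and the scheduler module `MyApp.Scheduler` are hypothetical placeholders, and the snippet assumes the scheduler is configured through the `global` option as described in the Quantum 2.x documentation.

```elixir
# config/dev.exs
# Older projects start this file with `use Mix.Config` instead of `import Config`.
import Config

# Assumption: :my_app and MyApp.Scheduler stand in for your own
# OTP application name and Quantum scheduler module.
config :my_app, MyApp.Scheduler,
  # Run the scheduler locally instead of clustering it across nodes.
  global: false
```

Keeping this in `config/dev.exs` rather than `config/config.exs` leaves the production setting untouched.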
@maennchen Hmmm... it still happens when I add
@tmepple I think it is caused by the ClusterTaskSupervisorRegistry. I'll see whether I can make it faster when global is false.
@c-rack Any ETA on the above commit being released? Awesome work
@madshargreave I could make a release on the weekend. @maennchen do you agree, or do you want to do the release yourself?
@c-rack I wanted to do the next release together with the clustering fixes. Since that seems to be taking ages, it is a good idea to release anyway. You're welcome to do a release.
Since upgrading to v2.3.3 I've noticed my project boot time increased substantially.
Now after running
mix phx.server
it takes ~2 seconds to see [info] [swarm on nonode@nohost] [tracker:init] started
then about 10 seconds later the next messages appear. After that, the Phoenix endpoint is up almost immediately.
With the same project running v2.2.7 it takes ~2 seconds to see
[debug] [:nonode@nohost][Elixir.Quantum.ExecutionBroadcaster] Adding job #Reference<0.3563856806.2205417474.238830>
then my endpoint is up almost immediately thereafter.
The issue seems to be related to the addition of swarm in v2.3.0+, but I couldn't find a similar issue raised in the swarm issue tracker either. I'm seeing debug log messages from swarm and quantum, but none imply a problem. Maybe it's related to a network misconfiguration, where swarm spends time looking for other nodes that do not exist and then times out?