chain crash after update to version 0.63.3 #10644

Closed
AceSlash opened this issue May 2, 2018 · 2 comments

AceSlash commented May 2, 2018

Description:

Server Setup Information:

  • Version of Rocket.Chat Server: 0.63.3
  • Operating System: Ubuntu Xenial x64
  • Deployment Method(snap/docker/tar/etc): tar
  • Number of Running Instances: 1
  • DB Replicaset Oplog: no
  • Node Version: 8.11.1
  • mongoDB Version: 3.2.19

Rocket.Chat is installed in an LXD container.

Description of the issue.

Upgrade Rocket.Chat from 0.58.3 to 0.63.3, then wait.

Rocket.Chat keeps crashing with a segmentation fault, and systemd restarts it every few seconds. The logs inside the container don't help much:

May 02 20:31:25 rocketchat systemd[1]: rocketchat.service: Main process exited, code=dumped, status=11/SEGV
May 02 20:31:25 rocketchat systemd[1]: rocketchat.service: Unit entered failed state.
May 02 20:31:25 rocketchat systemd[1]: rocketchat.service: Failed with result 'core-dump'.
May 02 20:31:25 rocketchat systemd[1]: rocketchat.service: Service hold-off time over, scheduling restart.
May 02 20:31:25 rocketchat systemd[1]: Stopped The Rocket.Chat server.
May 02 20:31:25 rocketchat systemd[1]: Started The Rocket.Chat server.
May 02 20:31:27 rocketchat rocketchat[10788]: Will load cache for users
May 02 20:31:28 rocketchat rocketchat[10788]: 93 records load from users
May 02 20:31:28 rocketchat rocketchat[10788]: Will load cache for rocketchat_room
May 02 20:31:28 rocketchat rocketchat[10788]: 2046 records load from rocketchat_room
May 02 20:31:28 rocketchat rocketchat[10788]: Will load cache for rocketchat_subscription
May 02 20:31:28 rocketchat rocketchat[10788]: 4562 records load from rocketchat_subscription
May 02 20:31:28 rocketchat rocketchat[10788]: Will load cache for rocketchat_settings
May 02 20:31:28 rocketchat rocketchat[10788]: 638 records load from rocketchat_settings
May 02 20:31:29 rocketchat rocketchat[10788]: Updating process.env.MAIL_URL
May 02 20:31:30 rocketchat systemd[1]: rocketchat.service: Main process exited, code=dumped, status=11/SEGV
May 02 20:31:30 rocketchat systemd[1]: rocketchat.service: Unit entered failed state.
May 02 20:31:30 rocketchat systemd[1]: rocketchat.service: Failed with result 'core-dump'.
May 02 20:31:30 rocketchat systemd[1]: rocketchat.service: Service hold-off time over, scheduling restart.
May 02 20:31:30 rocketchat systemd[1]: Stopped The Rocket.Chat server.
May 02 20:31:30 rocketchat systemd[1]: Started The Rocket.Chat server.
May 02 20:31:32 rocketchat rocketchat[10801]: Will load cache for users
May 02 20:31:33 rocketchat rocketchat[10801]: 93 records load from users
May 02 20:31:33 rocketchat rocketchat[10801]: Will load cache for rocketchat_room
May 02 20:31:33 rocketchat rocketchat[10801]: 2046 records load from rocketchat_room
May 02 20:31:33 rocketchat rocketchat[10801]: Will load cache for rocketchat_subscription
May 02 20:31:33 rocketchat rocketchat[10801]: 4562 records load from rocketchat_subscription
May 02 20:31:33 rocketchat rocketchat[10801]: Will load cache for rocketchat_settings
May 02 20:31:33 rocketchat rocketchat[10801]: 638 records load from rocketchat_settings
May 02 20:31:35 rocketchat rocketchat[10801]: Updating process.env.MAIL_URL
May 02 20:31:35 rocketchat systemd[1]: rocketchat.service: Main process exited, code=dumped, status=11/SEGV
May 02 20:31:35 rocketchat systemd[1]: rocketchat.service: Unit entered failed state.
May 02 20:31:35 rocketchat systemd[1]: rocketchat.service: Failed with result 'core-dump'.
May 02 20:31:36 rocketchat systemd[1]: rocketchat.service: Service hold-off time over, scheduling restart.
May 02 20:31:36 rocketchat systemd[1]: Stopped The Rocket.Chat server.
May 02 20:31:36 rocketchat systemd[1]: Started The Rocket.Chat server.

The dmesg output on the host is more informative:

[Wed May  2 20:31:25 2018] traps: node[69004] general protection ip:3c33a1ed5408 sp:7fb3e7275198 error:0
[Wed May  2 20:31:30 2018] traps: node[84108] general protection ip:2c6a8fd5408 sp:7f9851079da8 error:0
[Wed May  2 20:31:36 2018] traps: node[84519] general protection ip:37f490355408 sp:7f86df323da8 error:0
[Wed May  2 20:31:41 2018] traps: node[85135] general protection ip:2acaa5755408 sp:7f2036372df8 error:0
[Wed May  2 20:31:47 2018] traps: node[85715] general protection ip:382650d55408 sp:7f262e0e8da8 error:0
[Wed May  2 20:31:52 2018] traps: node[86121] general protection ip:fd4a7c55408 sp:7f5d37a9bda8 error:0
[Wed May  2 20:31:57 2018] traps: node[86480] general protection ip:39be72bd5408 sp:7fe4a96adda8 error:0
[Wed May  2 20:32:03 2018] traps: node[87036] general protection ip:2d5a14e55408 sp:7fc23b7eada8 error:0
[Wed May  2 20:32:08 2018] traps: node[87569] general protection ip:f17c29d5408 sp:7f700d9a1da8 error:0
[Wed May  2 20:32:13 2018] traps: node[87951] general protection ip:53e683d5408 sp:7f868063bda8 error:0
[Wed May  2 20:32:19 2018] traps: node[88517] general protection ip:1714fe2d5408 sp:7fcefe62eda8 error:0
[Wed May  2 20:32:24 2018] traps: node[89180] general protection ip:46903755408 sp:7fa25b102da8 error:0
[Wed May  2 20:32:29 2018] traps: node[89726] general protection ip:1331677d5408 sp:7fc3121b3da8 error:0
[Wed May  2 20:32:35 2018] traps: node[90203] general protection ip:2bcffe855408 sp:7f964fdcbda8 error:0
[Wed May  2 20:32:40 2018] traps: node[90779] general protection ip:3f1fa94d5408 sp:7ff5e41a4da8 error:0
[Wed May  2 20:32:45 2018] traps: node[91327] general protection ip:32392555408 sp:7f6800ce2da8 error:0
[Wed May  2 20:32:51 2018] traps: node[91938] general protection ip:206c94755408 sp:7f29abc5dda8 error:0
[Wed May  2 20:32:55 2018] traps: node[92245] general protection ip:25c03afd5408 sp:7f8d8b8f6ed0 error:0
[Wed May  2 20:33:00 2018] traps: node[92676] general protection ip:fabd0755408 sp:7fdc09e7eda8 error:0
[Wed May  2 20:33:06 2018] traps: node[93169] general protection ip:3f8bb4d5408 sp:7fd159572da8 error:0
[Wed May  2 20:33:11 2018] traps: node[94196] general protection ip:11fa0da55408 sp:7f92afff9da8 error:0
[Wed May  2 20:33:17 2018] traps: node[95440] general protection ip:17cd90f55408 sp:7fc31a3b7da8 error:0
[Wed May  2 20:33:22 2018] traps: node[96313] general protection ip:3f9ac98d5408 sp:7f500707eda8 error:0
[Wed May  2 20:33:27 2018] traps: node[97504] general protection ip:d34554d5408 sp:7f719a816eb8 error:0
[Wed May  2 20:33:32 2018] traps: node[98639] general protection ip:5ab21d5408 sp:7fa35b72bdf8 error:0
[Wed May  2 20:33:38 2018] traps: node[99582] general protection ip:2d37e51d5408 sp:7fc17689bd60 error:0
[Wed May  2 20:33:42 2018] traps: node[100464] general protection ip:32698bdd5408 sp:7fbf930e8cf8 error:0
[Wed May  2 20:33:48 2018] traps: node[101311] general protection ip:13ea50ad5408 sp:7f6273bc0da8 error:0
[Wed May  2 20:33:53 2018] traps: node[101954] general protection ip:227c6b7d5408 sp:7ffa1283fb88 error:0
[Wed May  2 20:33:58 2018] traps: node[102918] general protection ip:19380e155408 sp:7f6b98d2fda8 error:0
[Wed May  2 20:34:04 2018] traps: node[103450] general protection ip:1a9c77bd5408 sp:7fb24dcb4da8 error:0
[Wed May  2 20:34:09 2018] traps: node[104435] general protection ip:bdaa59d5408 sp:7fe676ecac18 error:0
[Wed May  2 20:34:15 2018] traps: node[104930] general protection ip:3f4be6bd5408 sp:7f124a481d60 error:0
[Wed May  2 20:34:20 2018] traps: node[105535] general protection ip:37abd5f55408 sp:7f6a44e19c18 error:0
[Wed May  2 20:34:25 2018] traps: node[106382] general protection ip:3c8145dd5408 sp:7f8ab69c8b88 error:0
[Wed May  2 20:34:31 2018] traps: node[107188] general protection ip:211a2a155408 sp:7fe605b7ada8 error:0
[Wed May  2 20:34:36 2018] traps: node[108192] general protection ip:1ef550e55408 sp:7f910cca3da8 error:0
[Wed May  2 20:50:35 2018] traps: node[108812] trap invalid opcode ip:1444d19 sp:7f164d164388 error:0 in node[400000+1b56000]
[Wed May  2 20:50:40 2018] traps: node[3612] general protection ip:163d53d55408 sp:7f0430880da8 error:0
[Wed May  2 21:02:36 2018] traps: node[4236] trap invalid opcode ip:1444d19 sp:7f237576a388 error:0 in node[400000+1b56000]
[Wed May  2 21:02:41 2018] traps: node[70952] general protection ip:cd9d0055408 sp:7fc2d1760da8 error:0
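
To make the pattern in the dmesg output easier to see, here is a minimal Python sketch that parses `traps:` lines of the form shown above and summarizes the trap kinds, the gaps between crashes, and the trailing hex digits of the faulting instruction pointer. The function names (`parse_traps`, `summarize`) and the `SAMPLE` constant are illustrative only; the regular expression assumes exactly the line layout seen in this report and an English locale for the day and month names.

```python
import re
from collections import Counter
from datetime import datetime

# Three lines copied from the dmesg excerpt in this report.
SAMPLE = """\
[Wed May  2 20:31:25 2018] traps: node[69004] general protection ip:3c33a1ed5408 sp:7fb3e7275198 error:0
[Wed May  2 20:31:30 2018] traps: node[84108] general protection ip:2c6a8fd5408 sp:7f9851079da8 error:0
[Wed May  2 20:31:36 2018] traps: node[84519] general protection ip:37f490355408 sp:7f86df323da8 error:0
"""

# Assumption: every relevant line follows the layout seen in this report.
TRAP_RE = re.compile(
    r"\[(?P<ts>[^\]]+)\] traps: node\[(?P<pid>\d+)\] "
    r"(?P<kind>general protection|trap invalid opcode) ip:(?P<ip>[0-9a-f]+)"
)


def parse_traps(text):
    """Return (timestamp, pid, kind, ip) tuples for every node trap line."""
    events = []
    for line in text.splitlines():
        match = TRAP_RE.search(line)
        if match is None:
            continue
        # Collapse the padding dmesg uses for single-digit days.
        stamp = " ".join(match["ts"].split())
        when = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
        events.append((when, int(match["pid"]), match["kind"], match["ip"]))
    return events


def summarize(events):
    """Count trap kinds, seconds between consecutive traps, and ip suffixes."""
    kinds = Counter(kind for _, _, kind, _ in events)
    gaps = [(b[0] - a[0]).total_seconds() for a, b in zip(events, events[1:])]
    # Last four hex digits of the faulting address, general protection only.
    suffixes = Counter(
        ip[-4:] for _, _, kind, ip in events if kind == "general protection"
    )
    return kinds, gaps, suffixes


if __name__ == "__main__":
    kinds, gaps, suffixes = summarize(parse_traps(SAMPLE))
    print(kinds, gaps, suffixes)
```

Feeding it the full `dmesg` output instead of `SAMPLE` gives the same summary for the whole incident. In the excerpt above, every general protection fault has an instruction pointer ending in `5408`, and the crashes are roughly five seconds apart, which matches the systemd restart loop in the container logs.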

The update was tested on a clone of the production server and no issue was found, but the load there was very low, and we may simply not have noticed the crashes because systemd restarts the daemon whenever it crashes.

I have now rolled the server back to the old version, but I'll try to reproduce the problem on a fresh copy of production tomorrow. Node.js seems to be the culprit, but I have no idea how to debug it at this point.


ghost commented May 2, 2018

Maybe #10060

@TwizzyDizzy

@rocket-cat close

Closing this. Please refer to #10331. Downgrading to Node.js 8.9.4 should work as a workaround until the next 8.x version of Node.js is released.

Cheers
Thomas
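
Since the workaround depends on which Node.js version is actually running, a small check before starting the service can help. The sketch below is only an illustration: the helper names (`classify_node_version`, `installed_node_version`) are made up for this example, and the two version constants come from this thread alone, so they are not a complete compatibility list.

```python
import shutil
import subprocess

# Versions mentioned in this thread only; anything else is reported as unknown.
REPORTED_CRASHING = "8.11.1"    # Node version from the original report
SUGGESTED_DOWNGRADE = "8.9.4"   # version suggested as a workaround


def classify_node_version(raw):
    """Classify a version string such as 'v8.11.1' against this thread."""
    version = raw.strip().lstrip("v")
    if version == REPORTED_CRASHING:
        return "reported-crashing"
    if version == SUGGESTED_DOWNGRADE:
        return "suggested-downgrade"
    return "unknown"


def installed_node_version(binary="node"):
    """Return the output of `node --version`, or None if node is not on PATH."""
    path = shutil.which(binary)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


if __name__ == "__main__":
    raw = installed_node_version()
    if raw is None:
        print("node not found on PATH")
    else:
        print(raw, "->", classify_node_version(raw))
```

An "unknown" result says nothing either way; it only means the version was not discussed here.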

@rocket-cat rocket-cat bot closed this as completed May 3, 2018