Segfault when launching master #485

Closed
elouanKeryell-Even opened this issue Jul 6, 2015 · 1 comment

Environment

OS: CentOS 7
chronos-2.3.4-1.0.81.el7.x86_64.rpm
mesos-0.22.1-1.0.centos701406.x86_64.rpm
mesosphere-zookeeper-3.4.6-0.1.20141204175332.centos7.x86_64.rpm

Bug

When I start Chronos:

$ systemctl start chronos

the JVM crashes with a segfault. Here are the logs:

Jul  6 19:33:17 master-1 systemd: Stopping Chronos...
Jul  6 19:33:17 master-1 systemd: Starting Chronos...
Jul  6 19:33:17 master-1 systemd: Started Chronos.
Jul  6 19:33:17 master-1 chronos: + cmd=(run_jar)
Jul  6 19:33:17 master-1 chronos: + local cmd
Jul  6 19:33:17 master-1 chronos: + [[ -s /etc/mesos/zk ]]
Jul  6 19:33:17 master-1 chronos: + cmd+=(--zk_hosts "$(cut -d / -f 3 /etc/mesos/zk)" --master "$(cat /etc/mesos/zk)")
Jul  6 19:33:17 master-1 chronos: ++ cut -d / -f 3 /etc/mesos/zk
Jul  6 19:33:17 master-1 chronos: ++ cat /etc/mesos/zk
Jul  6 19:33:17 master-1 chronos: + [[ -d /etc/chronos/conf ]]
Jul  6 19:33:17 master-1 chronos: + read -u 9 -r -d '' path
Jul  6 19:33:17 master-1 chronos: ++ cd /etc/chronos/conf
Jul  6 19:33:17 master-1 chronos: ++ find . -type f -not -name '.*' -print0
Jul  6 19:33:17 master-1 chronos: + local name=zk_path
Jul  6 19:33:17 master-1 chronos: + element_in --zk_path
Jul  6 19:33:17 master-1 chronos: + local e
Jul  6 19:33:17 master-1 chronos: + return 1
Jul  6 19:33:17 master-1 chronos: + case "$name" in
Jul  6 19:33:17 master-1 chronos: + cmd+=("--$name" "$(< "$conf_dir/$name")")
Jul  6 19:33:17 master-1 chronos: + read -u 9 -r -d '' path
Jul  6 19:33:17 master-1 chronos: + local name=hostname
Jul  6 19:33:17 master-1 chronos: + element_in --hostname
Jul  6 19:33:17 master-1 chronos: + local e
Jul  6 19:33:17 master-1 chronos: + return 1
Jul  6 19:33:17 master-1 chronos: + case "$name" in
Jul  6 19:33:17 master-1 chronos: + cmd+=("--$name" "$(< "$conf_dir/$name")")
Jul  6 19:33:17 master-1 chronos: + read -u 9 -r -d '' path
Jul  6 19:33:17 master-1 chronos: + local name=http_port
Jul  6 19:33:17 master-1 chronos: + element_in --http_port
Jul  6 19:33:17 master-1 chronos: + local e
Jul  6 19:33:17 master-1 chronos: + return 1
Jul  6 19:33:17 master-1 chronos: + case "$name" in
Jul  6 19:33:17 master-1 chronos: + cmd+=("--$name" "$(< "$conf_dir/$name")")
Jul  6 19:33:17 master-1 chronos: + read -u 9 -r -d '' path
Jul  6 19:33:17 master-1 chronos: + logged chronos run_jar --zk_hosts 10.10.3.65:2181 --master zk://10.10.3.65:2181/mesos --zk_path /chronos --hostname master-1 --http_port 8081
Jul  6 19:33:17 master-1 chronos: + local 'token=chronos[6064]'
Jul  6 19:33:17 master-1 chronos: + shift
Jul  6 19:33:17 master-1 chronos: + exec
Jul  6 19:33:17 master-1 chronos: + exec
Jul  6 19:33:18 master-1 chronos: ++ exec logger -p user.info -t 'chronos[6064]'
Jul  6 19:33:18 master-1 chronos: ++ exec logger -p user.notice -t 'chronos[6064]'
Jul  6 19:33:18 master-1 chronos[6064]: + run_jar --zk_hosts 10.10.3.65:2181 --master zk://10.10.3.65:2181/mesos --zk_path /chronos --hostname master-1 --http_port 8081
Jul  6 19:33:18 master-1 chronos[6064]: + local 'log_format=%2$s %5$s%6$s%n'
Jul  6 19:33:18 master-1 chronos[6064]: ++ ulimit -n
Jul  6 19:33:18 master-1 chronos[6064]: + '[' 0 -eq 0 -a 1024 -lt 8192 ']'
Jul  6 19:33:18 master-1 chronos[6064]: + ulimit -n 8192
Jul  6 19:33:18 master-1 chronos[6064]: + export PATH=/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
Jul  6 19:33:18 master-1 chronos[6064]: + PATH=/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
Jul  6 19:33:18 master-1 chronos[6064]: + vm_opts=(-Djava.library.path=/usr/local/lib:/usr/lib64:/usr/lib -Djava.util.logging.SimpleFormatter.format="$log_format")
Jul  6 19:33:18 master-1 chronos[6064]: + local vm_opts
Jul  6 19:33:18 master-1 chronos[6064]: + for j_opt in '${JAVA_OPTS:-"-Xmx512m"}'
Jul  6 19:33:18 master-1 chronos[6064]: + vm_opts+=(${j_opt})
Jul  6 19:33:18 master-1 chronos[6064]: + exec java -Djava.library.path=/usr/local/lib:/usr/lib64:/usr/lib '-Djava.util.logging.SimpleFormatter.format=%2$s %5$s%6$s%n' -Xmx512m -cp /usr/bin/chronos org.apache.mesos.chronos.scheduler.Main --zk_hosts 10.10.3.65:2181 --master zk://10.10.3.65:2181/mesos --zk_path /chronos --hostname master-1 --http_port 8081
Jul  6 19:33:18 master-1 chronos[6064]: [2015-07-06 19:33:18,314] INFO --------------------- (org.apache.mesos.chronos.scheduler.Main$:26)
Jul  6 19:33:18 master-1 chronos[6064]: [2015-07-06 19:33:18,316] INFO Initializing chronos. (org.apache.mesos.chronos.scheduler.Main$:27)
Jul  6 19:33:18 master-1 chronos[6064]: [2015-07-06 19:33:18,318] INFO --------------------- (org.apache.mesos.chronos.scheduler.Main$:28)
Jul  6 19:33:20 master-1 chronos[6064]: [2015-07-06 19:33:20,512] INFO Wiring up the application (org.apache.mesos.chronos.scheduler.config.MainModule:38)
Jul  6 19:33:20 master-1 chronos[6064]: #
Jul  6 19:33:20 master-1 chronos[6064]: # A fatal error has been detected by the Java Runtime Environment:
Jul  6 19:33:20 master-1 chronos[6064]: #
Jul  6 19:33:20 master-1 chronos[6064]: #  SIGSEGV (0xb) at pc=0x00007f7c54ddf56c, pid=6064, tid=140171988526848
Jul  6 19:33:20 master-1 chronos[6064]: #
Jul  6 19:33:20 master-1 chronos[6064]: # JRE version: OpenJDK Runtime Environment (7.0_75-b13) (build 1.7.0_75-mockbuild_2015_01_21_05_53-b00)
Jul  6 19:33:20 master-1 chronos[6064]: # Java VM: OpenJDK 64-Bit Server VM (24.75-b04 mixed mode linux-amd64 compressed oops)
Jul  6 19:33:20 master-1 chronos[6064]: # Derivative: IcedTea 2.5.4
Jul  6 19:33:20 master-1 chronos[6064]: # Distribution: Built on CentOS Linux release 7.0.1406 (Core)  (Wed Jan 21 05:53:48 UTC 2015)
Jul  6 19:33:20 master-1 chronos[6064]: # Problematic frame:
Jul  6 19:33:20 master-1 chronos[6064]: # C  [libc.so.6+0x8056c]  cfree+0x1c
Jul  6 19:33:20 master-1 chronos[6064]: #
Jul  6 19:33:20 master-1 chronos[6064]: # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
Jul  6 19:33:20 master-1 chronos[6064]: #
Jul  6 19:33:20 master-1 chronos[6064]: # An error report file with more information is saved as:
Jul  6 19:33:20 master-1 chronos[6064]: # /tmp/jvm-6064/hs_error.log
Jul  6 19:33:20 master-1 chronos[6064]: #
Jul  6 19:33:20 master-1 chronos[6064]: # If you would like to submit a bug report, please include
Jul  6 19:33:20 master-1 chronos[6064]: # instructions on how to reproduce the bug and visit:
Jul  6 19:33:20 master-1 chronos[6064]: #   http://icedtea.classpath.org/bugzilla
Jul  6 19:33:20 master-1 chronos[6064]: #
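
The JVM message above notes that core dumps are disabled. Since Chronos is started by systemd here, running "ulimit -c unlimited" in an interactive shell would not affect the service. A minimal sketch of lifting the limit with a systemd drop-in follows. The unit name chronos.service is an assumption based on "systemctl start chronos", and the helper name and the drop-in file name limit-core.conf are hypothetical.

```shell
# Write a systemd drop-in that lifts the core-dump size limit for a unit.
# $1: drop-in directory, e.g. /etc/systemd/system/chronos.service.d
#     (the unit name is assumed, it is not confirmed by the report)
write_core_dropin() {
    mkdir -p "$1" || return 1
    # LimitCORE is the systemd counterpart of "ulimit -c".
    printf '[Service]\nLimitCORE=infinity\n' > "$1/limit-core.conf"
}

# Typical use as root (not executed by this sketch):
#   write_core_dropin /etc/systemd/system/chronos.service.d
#   systemctl daemon-reload && systemctl restart chronos
```

Where the core file ends up still depends on the kernel's core_pattern setting, so this only removes the size limit.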

Here is the generated error report file: https://gist.github.com/WinstonSureChill/a17a344b091ea5ee7ede

This is the top of the stack trace from that file:

Stack: [0x00007f7c55856000,0x00007f7c55957000], sp=0x00007f7c55952808, free space=1010k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C [libc.so.6+0x8056c] cfree+0x1c

[error occurred during error reporting (printing native stack), id 0xb]

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j org.apache.mesos.state.ZooKeeperState.initialize(Ljava/lang/String;JLjava/util/concurrent/TimeUnit;Ljava/lang/String;)V+0
j org.apache.mesos.state.ZooKeeperState.<init>(Ljava/lang/String;JLjava/util/concurrent/TimeUnit;Ljava/lang/String;)V+11
j org.apache.mesos.chronos.scheduler.config.ZookeeperModule.provideState()Lorg/apache/mesos/state/State;+40
v ~StubRoutines::call_stub
[...]
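
For anyone reading a similar report, the crashing native frame can be pulled out of the error file directly. This is a small sketch that assumes the usual HotSpot header layout, where the frame is printed on the line right after "# Problematic frame:". The function name is hypothetical.

```shell
# Print the "Problematic frame" entry from a HotSpot error report.
# $1: path to the error file (/tmp/jvm-6064/hs_error.log in this report)
problematic_frame() {
    # Take the line following the marker and strip the leading "# ".
    grep -A1 '^# Problematic frame:' "$1" | tail -n 1 | sed 's/^# *//'
}

# Usage:
#   problematic_frame /tmp/jvm-6064/hs_error.log
```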

The last Java calls appear to be ZooKeeper-related (see https://github.com/apache/mesos/blob/master/src/java/src/org/apache/mesos/state/ZooKeeperState.java), so I suspect a problem with my ZooKeeper configuration. Or does someone see an obvious error in the parameters passed to Chronos:

Jul  6 19:33:18 master-1 chronos[6064]: + run_jar --zk_hosts 10.10.3.65:2181 --master zk://10.10.3.65:2181/mesos --zk_path /chronos --hostname master-1 --http_port 8081
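
One way to sanity-check these parameters is to repeat by hand what the wrapper's trace shows: --zk_hosts comes from "cut -d / -f 3 /etc/mesos/zk" and --master from "cat /etc/mesos/zk". The sketch below uses a temporary file as a stand-in for /etc/mesos/zk so it can be run anywhere. The variable names are illustrative only.

```shell
# Stand-in for /etc/mesos/zk, holding the value from this report.
zk_file=$(mktemp)
echo 'zk://10.10.3.65:2181/mesos' > "$zk_file"

# Same derivation the chronos wrapper shows in its trace:
# splitting on "/" gives "zk:", "", "10.10.3.65:2181", "mesos".
zk_hosts=$(cut -d / -f 3 "$zk_file")
master=$(cat "$zk_file")

echo "--zk_hosts $zk_hosts --master $master"
rm -f "$zk_file"
```

The result matches the arguments in the log line above, so the wrapper seems to be reading the file as intended.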

Here is my ZooKeeper config: https://gist.github.com/WinstonSureChill/b402d07f0bbffe9b035e

And here is the Mesos configuration I set up through the config files:

mesos/

zk: zk://10.10.3.65:2181/mesos
master: 10.10.3.65

mesos-master/

hostname: f1.linuxrt
ip: 10.10.3.65
quorum: 1
work_dir: /var/lib/mesos

Also, Mesos works fine on its own (without Chronos).

Related issues

My problem looks similar to this one (Marathon + Mesos): d2iq-archive/marathon#1352

@elouanKeryell-Even

I was using OpenJDK 1.7. I upgraded to 1.8, then reinstalled and reconfigured ZooKeeper and Chronos, and now everything works fine.
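
To check which Java release a host would run before and after such an upgrade, the first line of "java -version" can be parsed. This is a minimal sketch with a hypothetical helper name. It handles the legacy 1.x numbering used by Java 7 and 8. It does not establish which of the changes above (the JDK upgrade or the reinstall) resolved the crash.

```shell
# Extract the major release number from the first line of `java -version`.
# Legacy 1.x version strings map to x (1.7.0_75 -> 7, 1.8.0_45 -> 8).
java_major() {
    ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    case $ver in
        1.*) ver=${ver#1.} ;;   # drop the legacy "1." prefix
    esac
    echo "${ver%%.*}"           # keep everything before the first dot
}

# Usage (java -version writes to stderr, hence the redirect):
#   java_major "$(java -version 2>&1 | head -n 1)"
```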
