Skip to content
This repository has been archived by the owner on Jan 30, 2020. It is now read-only.

units with underscores in name break systemd-dbus #579

Closed
jonboulle opened this issue Jun 25, 2014 · 3 comments · Fixed by #595
Closed

units with underscores in name break systemd-dbus #579

jonboulle opened this issue Jun 25, 2014 · 3 comments · Fixed by #595
Milestone

Comments

@jonboulle
Copy link
Contributor

First reported as one of the issues in #477, I think I've finally found a case which consistently causes this issue:

core@core-01 ~ $ cat /etc/os-release 
NAME=CoreOS
ID=coreos
VERSION=343.0.0
VERSION_ID=343.0.0
BUILD_ID=
PRETTY_NAME="CoreOS 343.0.0"
ANSI_COLOR="1;32"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://github.com/coreos/bugs/issues"
core@core-01 ~ $ cat /var/lib/coreos-vagrant/vagrantfile-user-data 
#cloud-config

coreos:
  etcd:
      addr: $public_ipv4:4001
      peer-addr: $public_ipv4:7001
  units:
    - name: etcd.service
      command: start
    - name: fleet.service
      command: start
core@core-01 ~ $ cat > account_db.service << EOF
> [Unit]
> Description=Account DB
> After=docker.service
> Requires=docker.service
>  
> [Service]
> ExecStart=/usr/bin/docker run --name account_db postgres
> ExecStop=/usr/bin/docker rm -f account_db
> EOF
core@core-01 ~ $ fleetctl start account_db.service 
<hangs>
^Ccore@core-01 ~ $
core@core-01 ~ $ journalctl -u fleet -f
-- Logs begin at Wed 2014-06-25 22:33:23 UTC. --
Jun 25 22:35:35 core-01 fleet[3345]: I0625 22:35:35.889892 03345 event.go:82] EventJobBidSubmitted(account_db.service): successfully scheduled Job to Machine(97983f73776c437cbe3ce0955b3d61a7)
Jun 25 22:35:35 core-01 fleet[3345]: I0625 22:35:35.891778 03345 agent.go:480] Job(account_db.service) has requirements: map[]
Jun 25 22:35:35 core-01 fleet[3345]: I0625 22:35:35.891779 03345 agent.go:264] Loading Job(account_db.service)
Jun 25 22:35:35 core-01 fleet[3345]: I0625 22:35:35.891997 03345 manager.go:161] Writing systemd unit account_db.service
Jun 25 22:35:35 core-01 fleet[3345]: I0625 22:35:35.892912 03345 manager.go:137] Instructing systemd to reload units
Jun 25 22:35:35 core-01 fleet[3345]: E0625 22:35:35.947027 03345 agent.go:277] Failed fetching state of Unit(account_db.service): Message did not receive a reply (timeout by message bus)
Jun 25 22:35:35 core-01 fleet[3345]: I0625 22:35:35.947279 03345 agent.go:598] Bidding for all possible peers of Job(account_db.service)
Jun 25 22:35:35 core-01 fleet[3345]: I0625 22:35:35.948514 03345 agent.go:606] Job(account_db.service) loaded, now starting it
Jun 25 22:36:00 core-01 fleet[3345]: E0625 22:36:00.952858 03345 manager.go:104] Failed to start systemd unit account_db.service: Activation of org.freedesktop.systemd1 timed out
Jun 25 22:36:00 core-01 fleet[3345]: E0625 22:36:00.952912 03345 event.go:31] Received error from dbus: err=Activation of org.freedesktop.systemd1 timed out
Jun 25 22:36:25 core-01 fleet[3345]: E0625 22:36:25.974251 03345 event.go:31] Received error from dbus: err=Activation of org.freedesktop.systemd1 timed out
Jun 25 22:36:52 core-01 fleet[3345]: E0625 22:36:52.900691 03345 event.go:31] Received error from dbus: err=Activation of org.freedesktop.systemd1 timed out
<... repeat ad infinitum...>

now systemd-dbus is stuck!

core@core-01 ~ $ systemctl status
Failed to read server status: Connection timed out
core@core-01 ~ $ systemctl list-jobs
Failed to list jobs: Activation of org.freedesktop.systemd1 timed out
core@core-01 ~ $ journalctl -u dbus -n 10
-- Logs begin at Wed 2014-06-25 22:33:23 UTC, end at Wed 2014-06-25 22:45:05 UTC. --
Jun 25 22:43:08 core-01 dbus[2924]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out
Jun 25 22:43:08 core-01 dbus-daemon[2924]: dbus[2924]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out
Jun 25 22:43:33 core-01 dbus[2924]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out
Jun 25 22:43:33 core-01 dbus-daemon[2924]: dbus[2924]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out
Jun 25 22:43:58 core-01 dbus[2924]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out
Jun 25 22:43:58 core-01 dbus-daemon[2924]: dbus[2924]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out
Jun 25 22:44:23 core-01 dbus[2924]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out
Jun 25 22:44:23 core-01 dbus-daemon[2924]: dbus[2924]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out
Jun 25 22:44:48 core-01 dbus[2924]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out
Jun 25 22:44:48 core-01 dbus-daemon[2924]: dbus[2924]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out

sudo systemctl status works because then it talks directly to /run/systemd/private. So, systemd isn't completely hosed, just its dbus listener, it seems.

@litch
Copy link

litch commented Jun 25, 2014

👍

@jonboulle
Copy link
Contributor Author

Tracking root cause in coreos/go-systemd#49

@tslater
Copy link

tslater commented Jun 2, 2015

Running Kubernetes I got this:

  Tue, 02 Jun 2015 08:32:17 -0600   Tue, 02 Jun 2015 08:32:17 -0600 1   {kubelet 172.17.8.102}      failedSync  Error syncing pod, skipping: API error (500): Cannot start container 88aaf60385184354eb8b6371a4018dcc1db2ed491bf420ed72613b96fb16ac09: [8] System error: Activation of org.freedesktop.systemd1 timed out

Is it related?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants