Add ConditionPathExists=/etc/initrd-release to targets #140

cgwalters · 2019-12-05T14:19:05Z

I noticed that several of our units actually run twice, including
ignition-files.service (!) and ignition-ostree-mount-firstboot-sysroot.service.

Looking deeper into this, I discovered a bit of the initrd that
I didn't even know existed, which is initrd-parse-etc.service
that actually runs ExecStart=-/usr/bin/systemctl --no-block start initrd-fs.target
post switch root, after services that are part of that target
have already run (and then been shut down).

This adds a big wrinkle to my understanding of system bootup.

Then I noticed that initrd-fs.target has:
ConditionPathExists=/etc/initrd-release
which will be false post-switchroot.

The semantics here are...strange. It seems like the goal is to ensure
that the rootfs is definitely still mounted, perhaps in
very complex rootfs scenarios? In our case it will still be
mounted, we definitely don't want to rerun any of our Ignition
units.

Adding the same condition to our targets ensures that they
aren't re-run post switchroot.
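
As a rough illustration of the check this change relies on, here is a minimal Python sketch that reports which unit files lack the condition in their `[Unit]` section. The helper name `has_initrd_condition`, the sample file names, and the sample unit contents are all hypothetical and not taken from this repository.

```python
INITRD_KEY = "ConditionPathExists"
INITRD_VALUE = "/etc/initrd-release"


def has_initrd_condition(unit_text: str) -> bool:
    """Return True if the [Unit] section sets ConditionPathExists=/etc/initrd-release."""
    section = None
    for raw in unit_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith(("#", ";")):
            continue
        # Track which section we are in.
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        key, sep, value = line.partition("=")
        if (
            section == "Unit"
            and sep
            and key.strip() == INITRD_KEY
            and value.strip() == INITRD_VALUE
        ):
            return True
    return False


# Hypothetical unit contents, keyed by an illustrative file name.
SAMPLE_UNITS = {
    "example-guarded.target": (
        "[Unit]\n"
        "Description=Guarded example\n"
        "ConditionPathExists=/etc/initrd-release\n"
    ),
    "example-unguarded.target": "[Unit]\nDescription=Unguarded example\n",
}

missing = [
    name for name, text in SAMPLE_UNITS.items() if not has_initrd_condition(text)
]
print(missing)
```

This only looks for the exact path used here; it does not try to model negated or multiple conditions.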

cgwalters · 2019-12-05T15:44:22Z

xref https://lists.freedesktop.org/archives/systemd-devel/2019-December/043790.html

jlebon · 2019-12-05T16:19:03Z

> that actually runs ExecStart=-/usr/bin/systemctl --no-block start initrd-fs.target
> post switch root, after services that are part of that target
> have already run (and then been shut down).

Oh wow. That's... scary. It's weird though, that unit should only be running in the initrd itself (from bootup(7)). CL actually used to have ordering wrt this in the past. E.g. ignition-files.service had:

Before=initrd-parse-etc.service
After=initrd-root-fs.target

Seems like a systemd regression put it in the real root by mistake?

jlebon · 2019-12-05T16:26:45Z

I think there's some kind of racy thing going on. If I boot the FCOS I just built locally normally:

[root@coreos ~]# journalctl -b 0 | grep initrd-parse-etc.service
Dec 05 16:24:29 coreos systemd[1]: initrd-parse-etc.service: Succeeded.
Dec 05 16:24:31 coreos systemd[1]: initrd-parse-etc.service: Succeeded.
[root@coreos ~]# journalctl -b 0 | grep 'files passed'
Dec 05 16:24:29 coreos ignition[780]: INFO     : files: files passed
Dec 05 16:24:31 coreos ignition[780]: INFO     : files: files passed

If I stop the boot using rd.break and then resume:

[root@coreos ~]# journalctl -b 0 | grep initrd-parse-etc.service
Dec 05 16:25:17 coreos systemd[1]: initrd-parse-etc.service: Succeeded.
[root@coreos ~]# journalctl -b 0 | grep 'files passed'
Dec 05 16:25:17 coreos ignition[780]: INFO     : files: files passed

So I think this might be systemd getting confused somehow.

jlebon · 2019-12-05T16:30:46Z

Ohhh, I think this might just be the journal getting confused. Notice the pid is the same here:

Dec 05 16:24:29 coreos ignition[780]: INFO     : files: files passed
Dec 05 16:24:31 coreos ignition[780]: INFO     : files: files passed

And it's the same in other tests that I've run. So I don't think ignition-files.service is actually running twice. Otherwise, I would've expected to see errors like e.g. "file already exists and overwrite is off" by now.

jlebon · 2019-12-05T16:36:55Z

And -o json-pretty reveals why:

# journalctl -b 0 --grep="files passed" -o json-pretty
{
        "SYSLOG_FACILITY" : "3",
        "_EXE" : "/usr/bin/ignition",
        "_MACHINE_ID" : "f3a1c3b65c0e4e10bf5c511499720c51",
        "_PID" : "781",
        "_GID" : "0",
        "_CAP_EFFECTIVE" : "3fffffffff",
        "_SYSTEMD_SLICE" : "system.slice",
        "__REALTIME_TIMESTAMP" : "1575563615190378",
        "SYSLOG_IDENTIFIER" : "ignition",
        "__MONOTONIC_TIMESTAMP" : "6220168",
        "_COMM" : "ignition",
        "_STREAM_ID" : "fdf84b64a8f04861b4c996a44f047be8",
        "__CURSOR" : "s=e030078e2be04702951333e34e93cfa1;i=2f2;b=96c10fee3e024fe88621d9e9b77c2d61;m=5ee988;t=598f77d9abd6a;x=c30effcfe9d>
        "_BOOT_ID" : "96c10fee3e024fe88621d9e9b77c2d61",
        "MESSAGE" : "INFO     : files: files passed",
        "_SELINUX_CONTEXT" : "kernel",
        "_SYSTEMD_CGROUP" : "/system.slice/ignition-files.service",
        "PRIORITY" : "6",
        "_SYSTEMD_INVOCATION_ID" : "1951a787387740a5b7aa64cd67ff5559",
        "_TRANSPORT" : "stdout",
        "_CMDLINE" : "/usr/bin/ignition --root=/sysroot --platform=qemu --stage=files --log-to-stdout",
        "_HOSTNAME" : "coreos",
        "_SYSTEMD_UNIT" : "ignition-files.service",
        "_UID" : "0"
}
{
        "_TRANSPORT" : "kernel",
        "_MACHINE_ID" : "d8e40229bf814913af3886c2aaa33f95",
        "SYSLOG_IDENTIFIER" : "ignition",
        "_BOOT_ID" : "96c10fee3e024fe88621d9e9b77c2d61",
        "SYSLOG_PID" : "781",
        "PRIORITY" : "6",
        "MESSAGE" : "INFO     : files: files passed",
        "SYSLOG_FACILITY" : "3",
        "__REALTIME_TIMESTAMP" : "1575563617516057",
        "__CURSOR" : "s=e030078e2be04702951333e34e93cfa1;i=38e;b=96c10fee3e024fe88621d9e9b77c2d61;m=826636;t=598f77dbe3a19;x=a712cd15e29>
        "__MONOTONIC_TIMESTAMP" : "8545846",
        "_HOSTNAME" : "coreos",
        "_SOURCE_MONOTONIC_TIMESTAMP" : "6402014"
}

It's getting logged twice, once via journald, and once via the kernel buffer I think because of:

ignition-dracut/dracut/99journald-conf/00-journal-log-forwarding.conf

Line 11 in f6b3c1b

ForwardToKMsg=yes
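
A quick way to tell a forwarded copy from a genuine second run is to group entries by PID and message and look at the transports. The sketch below does that in Python on two entries trimmed from the output above, written one JSON object per line as `journalctl -o json` emits them. The function name `forwarded_duplicates` is made up for illustration, and the sketch assumes `MESSAGE` is always a plain string.

```python
import json
from collections import defaultdict

# Two entries trimmed from the json-pretty output above, one object per line.
SAMPLE = """\
{"_TRANSPORT": "stdout", "_PID": "781", "MESSAGE": "INFO     : files: files passed", "_SYSTEMD_UNIT": "ignition-files.service"}
{"_TRANSPORT": "kernel", "SYSLOG_PID": "781", "MESSAGE": "INFO     : files: files passed"}
"""


def forwarded_duplicates(lines):
    """Return {(pid, message): transports} for messages seen via kernel and another transport."""
    seen = defaultdict(set)
    for line in lines:
        if not line.strip():
            continue
        entry = json.loads(line)
        # In the output above, the kernel-transport copy has SYSLOG_PID but no _PID.
        pid = entry.get("_PID") or entry.get("SYSLOG_PID")
        seen[(pid, entry.get("MESSAGE"))].add(entry.get("_TRANSPORT"))
    return {
        key: transports
        for key, transports in seen.items()
        if "kernel" in transports and len(transports) > 1
    }


dups = forwarded_duplicates(SAMPLE.splitlines())
for (pid, message), transports in dups.items():
    print(pid, sorted(transports), message)
```

A message that shows up under both `stdout` and `kernel` with the same PID points at forwarding rather than the service running twice.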

cgwalters · 2019-12-05T16:43:11Z

Hmm...this might explain why we're seeing "Missed [x] kernel messages" in the journal too - if we're in some cases logging recursively, maybe journald tries to suppress this type of cycle but can't always do it?

jlebon · 2019-12-05T16:43:42Z

So I think what's happening there is that it depends on when switch root happens relative to when the journal reads input from the kernel buffer. If we switch root fast enough, the new journald instance, which doesn't have that knob, will be reading the data that the pre-switch-root journald left there while forwarding.
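
The timestamps in the two JSON entries above seem to fit this. Here is the arithmetic as a small Python sketch, assuming all three values are in microseconds and that `_SOURCE_MONOTONIC_TIMESTAMP` is the time the kernel buffer originally recorded the forwarded line (the variable names are mine):

```python
# Values copied from the two journal entries above (assumed to be microseconds).
stdout_received_us = 6220168  # __MONOTONIC_TIMESTAMP of the stdout entry
kmsg_source_us = 6402014      # _SOURCE_MONOTONIC_TIMESTAMP of the kernel entry
kmsg_received_us = 8545846    # __MONOTONIC_TIMESTAMP of the kernel entry

# Time between journald logging the line and the copy landing in the kernel buffer.
forward_delay_us = kmsg_source_us - stdout_received_us
# Time the copy sat in the kernel buffer before a journald read it back.
buffer_wait_us = kmsg_received_us - kmsg_source_us

print(f"forwarded to kmsg after {forward_delay_us / 1e6:.2f}s")
print(f"waited in the kernel buffer for {buffer_wait_us / 1e6:.2f}s")
```

Under those assumptions the copy sat in the kernel buffer for roughly two seconds before being read back, which is consistent with the two-second gap in the grep output earlier.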

I noticed that several of our units appeared to run twice, including `ignition-files.service` (!) and `ignition-ostree-mount-firstboot-sysroot.service`. [Later investigation revealed this was actually a double-logging issue.]

Looking deeper into this, I discovered a bit of the initrd that I didn't even know existed, which is `initrd-parse-etc.service` that actually runs `ExecStart=-/usr/bin/systemctl --no-block start initrd-fs.target` *post* switch root, after services that are part of that target have already run (and then been shut down).

This adds a big wrinkle to my understanding of system bootup.

Then I noticed that `initrd-fs.target` has: `ConditionPathExists=/etc/initrd-release` which will be false post-switchroot.

The semantics here are...strange. It seems like the goal is to ensure that the rootfs is definitely still mounted, perhaps in very complex rootfs scenarios? In our case it will still be mounted, we definitely don't want to rerun any of our Ignition units. [We discovered the units aren't actually being rerun, but this still seems like a best practice.]

cgwalters · 2019-12-05T17:15:39Z

Wow, thanks for that investigation. Trying this some more, it does seem like you're right, and I was just lucky with the race when I was testing out this patch.

But...hum. Then I guess my question is why isn't that unit re-running our services? [Going to add some sanity checking that it really isn't happening]

Is the reason that ConditionPathExists=/etc/initrd-release is used in a bunch of units to simply avoid breakage if they're somehow enabled by default in the real root?

I updated the commit message with the new findings, but basically I'd argue for merging anyways since this seems like a best practice.

jlebon · 2019-12-05T17:29:16Z

> I updated the commit message with the new findings, but basically I'd argue for merging anyways since this seems like a best practice.

I think I agree, though... let's just plaster it in all our units?

> Is the reason that ConditionPathExists=/etc/initrd-release is used in a bunch of units to simply avoid breakage if they're somehow enabled by default in the real root?

This is a pure guess, but I think it's simply because the build system puts all the service files indiscriminately in the real root and the 01systemd-initrd module just fetches them from there.

One nice bonus of doing this is I'm pretty sure it's what allows e.g. systemctl status initrd-parse-etc.service to work correctly post-switchroot. If we do this to our units too, we should be able to get that working as well.

jlebon · 2019-12-05T17:31:20Z

(I can investigate this and do that switch as a follow-up if folks agree. I think it'd be a nice tiny QOL improvement.)

cgwalters · 2019-12-05T18:00:07Z

> I think I agree, though... let's just plaster it in all our units?

Yeah.

jlebon · 2019-12-05T22:22:51Z

#142

This is standard practice for units that are only meant to run in the initrd. It matches what e.g. `systemd` does for its initrd units. See also: #140. I also snuck in `Documentation=` lines in there.

cgwalters force-pushed the no-rerun-switchroot branch from c457e2a to 8800e42 on December 5, 2019 17:14

jlebon approved these changes Dec 5, 2019

cgwalters merged commit 8168e4c into coreos:master Dec 5, 2019

jlebon mentioned this pull request Dec 5, 2019

units: add ConditionPathExists=/etc/initrd-release everywhere #142

Merged
