Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

platform/{metal,qemu}: use bootindex=N instead of -boot once=d #1312

Merged
merged 8 commits into from
Apr 1, 2020

Conversation

jlebon
Copy link
Member

@jlebon jlebon commented Apr 1, 2020

The -boot once=d technique only works with some bootloaders. OVMF is
not one of those. Instead, do what libvirt also does, which is to use
bootindex=N flags, which both SeaBIOS and OVMF understand. Though the
libvirt VM installation workflow goes one step further here and sets the
bootindex=1 for the CD-ROM only on first boot, then turns off the
machine, changes the boot order (and ejects the CD media entirely), then
boots it back up. Here instead, we're just relying on the primary disk
being unbootable on first boot to avoid having to stop QEMU and starting
it again with slightly different args (which is totally possible for us
to do, just more complex).

This fixes kola testiso -P --qemu-firmware=uefi (and by extension,
--qemu-native-4k too). But note UEFI PXE booting still doesn't work,
will try to dig further into that.

jlebon added 6 commits April 1, 2020 18:29
If we've injected the config, don't also add `-fw_cfg` even if the
platform supports it. This otherwise muddles what we're testing and
makes it less obvious that the config must be coming from the injected
one.
Makes it easy to double-check that we're compressing the right image
(e.g. metal4k and not metal).
When provided, use the metal4k version of the images instead of the
regular 512b one.
Let's get some basic coverage of this in cosa. Will add to the FCOS
pipeline as well.

While we're here, drop the unnecessary `cosa`; kola is cosa workdir
aware now.
Symmetrical to `--no-pxe`; skip running the live ISO test if all we want
is the PXE test. (The PXE test uses `console=ttyS0` automatically, which
makes `--console` more convenient for local testing.)
@jlebon jlebon changed the title Pr/4k uefi platform/{metal,qemu}: use bootindex=N instead of -boot once=d Apr 1, 2020
@jlebon
Copy link
Member Author

jlebon commented Apr 1, 2020

Re.

But note UEFI PXE booting still doesn't work, will try to dig further into that.

I'm playing with this patch:

diff --git a/mantle/platform/metal.go b/mantle/platform/metal.go
index 30f68399..5bd0acd9 100644
--- a/mantle/platform/metal.go
+++ b/mantle/platform/metal.go
@@ -309,7 +309,6 @@ func (inst *Install) setup(kern *kernelSetup) (*installerRun, error) {
                pxe.boottype = "pxe"
                pxe.networkdevice = "virtio-net-ccw"
                pxe.tftpipaddr = "10.0.2.2"
-               pxe.bootindex = "1"
        default:
                return nil, fmt.Errorf("Unsupported arch %s" + system.RpmArch())
        }
@@ -440,11 +439,10 @@ func (t *installerRun) completePxeSetup(kargs []string) error {

 func (t *installerRun) run() (*QemuInstance, error) {
        builder := t.builder
-       netdev := fmt.Sprintf("%s,netdev=mynet0,mac=52:54:00:12:34:56", t.pxe.networkdevice)
-       if t.pxe.bootindex == "" {
-               builder.Append("-boot", "once=n", "-option-rom", "/usr/share/qemu/pxe-rtl8139.rom")
-       } else {
-               netdev += fmt.Sprintf(",bootindex=%s", t.pxe.bootindex)
+       // re. bootindex=2 arg, see AddInstallIso comment
+       netdev := fmt.Sprintf("%s,netdev=mynet0,mac=52:54:00:12:34:56,bootindex=2", t.pxe.networkdevice)
+       if system.RpmArch() != "s390x" {
+               builder.Append("-option-rom", "/usr/share/qemu/pxe-rtl8139.rom")
        }
        builder.Append("-device", netdev)
        usernetdev := fmt.Sprintf("user,id=mynet0,tftp=%s,bootfile=%s", t.tftpdir, t.pxe.bootfile)

Though that doesn't quite work.

jlebon added 2 commits April 1, 2020 18:39
The `-boot once=d` technique only works with some bootloaders. OVMF is
not one of those. Instead, do what libvirt also does, which is to use
`bootindex=N` flags, which both SeaBIOS and OVMF understand. Though the
libvirt VM installation workflow goes one step further here and sets the
`bootindex=1` for the CD-ROM only on first boot, then turns off the
machine, changes the boot order (and ejects the CD media entirely), then
boots it back up. Here instead, we're just relying on the primary disk
being unbootable on first boot to avoid having to stop QEMU and starting
it again with slightly different args (which is totally possible for us
to do, just more complex).

This fixes `kola testiso -P --qemu-firmware=uefi` (and by extension,
`--qemu-native-4k` too). But note UEFI PXE booting still doesn't work,
will try to dig further into that.
Such as whether it was BIOS or UEFI, and whether it was the metal or the
metal4k image.
@cgwalters
Copy link
Member

Nice!

Using the bootindex does make sense to me, particularly if it's what virt-install does. It does more closely match I guess at least some physical scenarios too, where the firmware is just going to probe the primary disk and only fall back to secondary devices if that's not bootable.

That said, it is more elegant and probably more reliable to eject the CDROM as you mentioned; I think I've seen some installers have a "please eject, then press return" phase which theoretically we could support (and script?). But eh, bootindex works.

Hmm...how is the change to use this in CI working without coreos-installer support landed yet? Is it that we're just not doing autodetection and forcing the 4k image to be installed?

/lgtm

@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cgwalters, jlebon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit 1b53134 into coreos:master Apr 1, 2020
@jlebon
Copy link
Member Author

jlebon commented Apr 2, 2020

Hmm...how is the change to use this in CI working without coreos-installer support landed yet? Is it that we're just not doing autodetection and forcing the 4k image to be installed?

Yup, exactly! One can install the 4k images already today. The patch in coreos/coreos-installer#203 only applies to installing from stream metadata. (Which is the default today, but soon won't be once coreos/coreos-installer#197 is merged. But we'll still need the equivalent auto-select logic in osmet too.)

jlebon added a commit to jlebon/fedora-coreos-pipeline that referenced this pull request Apr 2, 2020
`kola testiso` knows how to do this now!

coreos/coreos-assembler#1312
jlebon added a commit to jlebon/fedora-coreos-pipeline that referenced this pull request Apr 16, 2020
`kola testiso` knows how to do this now!

coreos/coreos-assembler#1312

While we're here, make it run in parallel with the 512b one, and use
`kola` directly instead of `cosa kola`.
jlebon added a commit to jlebon/fedora-coreos-pipeline that referenced this pull request May 20, 2020
`kola testiso` knows how to do this now!

coreos/coreos-assembler#1312

While we're here, make it run in parallel with the 512b one, and use
`kola` directly instead of `cosa kola`.
jlebon added a commit to coreos/fedora-coreos-pipeline that referenced this pull request May 25, 2020
`kola testiso` knows how to do this now!

coreos/coreos-assembler#1312

While we're here, make it run in parallel with the 512b one, and use
`kola` directly instead of `cosa kola`.
@jlebon jlebon deleted the pr/4k-uefi branch July 6, 2020 20:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants