Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

create_disk: Make rootfs size small even for qemu/IaaS #924

Merged
merged 1 commit into from
Nov 18, 2019

Conversation

cgwalters
Copy link
Member

For RHCOS with encryption, we encrypt in early boot, and it's beneficial
if the root filesystem size is small initially - distinct from the
"default disk" size that applies in clouds (and libvirt).

This helps RHCOS maintain its default 16G size for libvirt without
requiring the installer and other tools to start resizing it.

This also increases consistency across metal and virt - we now
always run coreos-growfs.

@cgwalters
Copy link
Member Author

cgwalters commented Nov 15, 2019

Alternative to #917 - the contrast with that one is that we don't specify the rootfs size in the config manually, we always estimate.

Copy link
Member

@jlebon jlebon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks sane to me. Will leave it open for a bit in case others prefer the other option.

@ajeddeloh
Copy link
Contributor

Why not make the disk smaller?

@cgwalters
Copy link
Member Author

Why not make the disk smaller?

Because the OpenShift installer doesn't resize libvirt currently: openshift/installer#2652

@cgwalters
Copy link
Member Author

Another reason to do this is to aid reprovisioning - it came up in our meetup that we could just copy the raw /dev/disk/by-label/root into a ramdisk. This would avoid having to manage the "serialization" half at the filesystem level; the reprovisioning code then just looks more like the ostree admin deploy phase of create_disk.sh.

@@ -90,7 +93,7 @@ case "$arch" in
-n ${BOOTPN}:0:+384M -c ${BOOTPN}:boot \
-n 2:0:+127M -c 2:EFI-SYSTEM -t 2:C12A7328-F81F-11D2-BA4B-00A0C93EC93B \
-n 3:0:+1M -c 3:BIOS-BOOT -t 3:21686148-6449-6E6F-744E-656564454649 \
-n ${ROOTPN}:0:0 -c ${ROOTPN}:root -t ${ROOTPN}:0FC63DAF-8483-4772-8E79-3D69D8477DE4
-n ${ROOTPN}:0:${rootfs_size} -c ${ROOTPN}:root -t ${ROOTPN}:0FC63DAF-8483-4772-8E79-3D69D8477DE4
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, should these be prepended by a +? Otherwise from a scan of sgdisk(8), it seems like it interprets it as an offset from the start of the disk (so the partition size would actually be rootfs_size - (384 + 127 + 1)).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, very nice catch! I had that in the other PR but lost it when reworking it here.
Fixed ⬇️

@jlebon
Copy link
Member

jlebon commented Nov 18, 2019

Hmm,

Could not create partition 4 from 1050624 to 5550079
Setting name!
partNum is 3
Unable to set partition 4's name to 'root'!
Could not change partition 4's type code to 0FC63DAF-8483-4772-8E79-3D69D8477DE4!
Error encountered; not saving changes.

Could the numbers be adding to something greater than $image_size somehow? Probably need to add another meg or something to 513 for partitioning overhead.

For RHCOS with encryption, we encrypt in early boot, and it's beneficial
if the root filesystem size is small initially - distinct from the
"default disk" size that applies in clouds (and libvirt).

This helps RHCOS maintain its default 16G size for libvirt without
requiring the installer and other tools to start resizing it.

This also increases consistency across metal and virt - we now
*always* run `coreos-growfs`.
@cgwalters
Copy link
Member Author

cgwalters commented Nov 18, 2019

Could the numbers be adding to something greater than $image_size somehow?

I had this also fixed in the other PR - basically what we want is for qemu/IaaS to specify a larger disk and smaller root, and for metal, we use a small image and let the root become "naturally" sized (rather than trying to constrain both the image and root size, leading to potentially wrong calculations). Fixed.

@cgwalters
Copy link
Member Author

Passing CI now!

@jlebon
Copy link
Member

jlebon commented Nov 27, 2019

OK, so one problem with this approach is that it means adding any partition after the rootfs via Ignition essentially "traps" the root partition preventing it to grow any more. So in fact the default rootfs size here is crucial (see coreos/fedora-coreos-tracker#18 (comment)).

I think what we also need though is some way to tell Ignition, e.g. "create these partitions after the last one, but align with the end of the disk". Then make fcct use that semantic by default for adding partitions. That way the remaining free space is left to the rootfs. (But still in cosa make sure that we set a good minimum value.)

@jlebon
Copy link
Member

jlebon commented Nov 27, 2019

Hmm, what would be probably better actually is being able to move the root partition to always be the last one. Which I think the rootfs redeploy work will allow, right? That way, e.g. moving to a larger disk (or resizing the virtual disk) makes it easy to grow the rootfs.

@cgwalters
Copy link
Member Author

OK, so one problem with this approach is that it means adding any partition after the rootfs via Ignition essentially "traps" the root partition preventing it to grow any more.

Right...but it's hard to avoid getting to LVM/btrfs in the general case here.

Hmm, what would be probably better actually is being able to move the root partition to always be the last one. Which I think the rootfs redeploy work will allow, right?

Yeah, but in the "add /var" path then we give all space to the root and not /var which one would want.

I guess in the /var case the admin probably wants enough space for OS updates plus some small amount of config and that'd mostly be it. In the general case we'd probably need to have the sysroot have at least double the OS size.

@jlebon
Copy link
Member

jlebon commented Dec 10, 2019

OK, so to recap:

  • no /var partition: we want / to take up all the space
  • /var partition: we want / to take up some reasonable amount of space, and otherwise give the rest to /var

Either of those cases are pretty straight-forward; we just leave the partition that we want growable to be at the end. The tricky bit is if one wants to create additional partitions. Which yeah, gets into LVM territory. Almost feels like we should issue a warning in fcct or something if additional partitions are specified and the last one isn't /var.

Anyway, I do think now we should support a rootfs-size in image.yaml so that we can hardcode the bare minimum the rootfs needs to be.

@ajeddeloh Thoughts on the above convo?

@vtolstov
Copy link

why you drop lvm for fcos? its really useful have growing one partition that can be splitted by lvm to needed pieces?

@cgwalters
Copy link
Member Author

Anyway, I do think now we should support a rootfs-size in image.yaml so that we can hardcode the bare minimum the rootfs needs to be.

Just as a sanity check? Or are you thinking this is somehow used by tools other than cosa like fcct?

@cgwalters
Copy link
Member Author

why you drop lvm for fcos? its really useful have growing one partition that can be splitted by lvm to needed pieces?

Eventually we want to support LVM; see also coreos/fedora-coreos-tracker#94

@ajeddeloh
Copy link
Contributor

ajeddeloh commented Dec 10, 2019

You can have a small rootfs that gets expanded and put a partition after it for /var with Ignition spec 2.3.0+ and 3.0.0+. That's what the partition rework allows. You can do something like:

partitions: [
{
  "number": 4,
  "startMiB": 0,
  "sizeMiB": 0,
  "label": "root",
},
{
  "number": 5,
  "startMiB": 0,
  "sizeMiB": 0,
  "label": "var"
}]

^ didn't test that, might need tweaking.

0 for start can be used multiple times. Each time it indicates the start of the largest block of free space left of the drive.

@jlebon
Copy link
Member

jlebon commented Feb 20, 2020

Anyway, I do think now we should support a rootfs-size in image.yaml so that we can hardcode the bare minimum the rootfs needs to be.

Just as a sanity check? Or are you thinking this is somehow used by tools other than cosa like fcct?

I guess what I'm not sure here is whether there's a mismatch of intent re. dynamically calculating the rootfs size. ISTM we've been mostly focused on making the "packaging" ergonomic, but in reality the number we land on here should reflect a carefully chosen value that takes into account what we think is the minimum partition size viable on the long-ish term. (Not that we can never surpass that, but having a fixed rootfs size makes it easier to have a comparison point from which we can call out such changes).

Given that, I'm thinking it'd make more sense if we had that number living in image.yaml directly? This would then determine the rootfs partition size we set here. (And maybe e.g. error out if the size estimation we get runs close to that?) An alternative approach is having a default Ignition base config that resizes the root partition to e.g. 3G that users can override as necessary.

You can have a small rootfs that gets expanded and put a partition after it for /var with Ignition spec 2.3.0+ and 3.0.0+. That's what the partition rework allows. You can do something like:

This doesn't work for me: coreos/ignition#924

@cgwalters
Copy link
Member Author

The thing that terrifies me about trying to commit to such a size is the likely never-to-end stream of things like "hey add usbguard please" combined with increasingly statically linked binaries, plus more and more kernel drivers and translations...

If we force users to pick the size hey at least we can say it was their fault 😉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants