dev-synthesize-osupdate: New command #1635

cgwalters · 2020-08-04T02:10:31Z

We have "real" OS update tests which look at a previous build;
this is generally good, but I want to be able to reliably
test e.g. a "large" upgrade in some CI scenarios, and it's OK
if the upgrade isn't "real".

This command takes an ostree commit and adds a note to a percentage
of ELF binaries. This way one can generate a "large" update by
specifying e.g. --percentage=80 or so.

I plan to use this for testing etcd performance during
large updates; see
openshift/machine-config-operator#1897

cgwalters · 2020-08-05T22:02:16Z

OK I reworked this to use more of a "cosa model" and less "raw ostree" and added an invocation in CI to validate that it works (or at least, doesn't crash).

lucab · 2020-08-06T10:57:08Z

I think @jlebon had similar logic to synthesize a payload for FCOS in kola, which also injects relevant commit metadata for Zincati. Is there more that can be shared/re-use between the two?

jlebon

Looks like CI failed on this.

LGTM, though IMO I think we should move away from offline-update in general now that we have proper upgrade tests. I think it'd be great to have this integrated there instead.

jlebon · 2020-08-07T21:43:45Z

Just saw your cross-link to this from openshift/machine-config-operator#1957. You have the same idea happening in ostreedev/ostree#2127 of twiddling with a percentage of ELF files. Let's just make it a hidden ostree command instead? (Or rpm-ostree; there's already testutils exactly for that). That way it's much easier for the MCO or kola tests to make use of it.

cgwalters · 2020-08-16T14:59:29Z

I think @jlebon had similar logic to synthesize a payload for FCOS in kola, which also injects relevant commit metadata for Zincati. Is there more that can be shared/re-use between the two?

Yeah I was referring to that in the first para of the commit message. They're different things - the current kola upgrade tests are going from "last fcos stable to current", but the problem with that is we're mostly testing the old code as far as upgrading. Here I really want to test the current code, and also with a "large" upgrade.

You have the same idea happening in ostreedev/ostree#2127 of twiddling with a percentage of ELF files. Let's just make it a hidden ostree command instead? (Or rpm-ostree; there's already testutils exactly for that).

It's an interesting topic because the ostree test is an exttest and so using this cosa command would currently force us to pull it as a container until #1645

I'd forgotten about rpm-ostree testutils - that is tempting (could migrate the ostree Rust code rather than this Python code) but...that has the notable downside of shipping test code to everyone's systems. It's not going to scale very far.

jlebon · 2020-08-17T16:49:18Z

Yeah I was referring to that in the first para of the commit message. They're different things - the current kola upgrade tests are going from "last fcos stable to current", but the problem with that is we're mostly testing the old code as far as upgrading. Here I really want to test the current code, and also with a "large" upgrade.

The upgrade test does both: it goes from previous to current and current to "synthesized next". There's also rpm-ostree.upgrade-rollback which only does the latter bit and is part of the regular kola testsuite. So we could use this in both tests (or we could split it into its own e.g. {fcos,rhcos}.upgrade.large upgrade test too).

I'd forgotten about rpm-ostree testutils - that is tempting (could migrate the ostree Rust code rather than this Python code) but...that has the notable downside of shipping test code to everyone's systems. It's not going to scale very far.

Yeah, I'd like to eventually have an rpm-ostree-tests subpackage or something for this. I don't think resorting to these low-level helpers will happen often though. But I'm good with having it in cosa too if you prefer! My main concern is the focus on offline-update instead of beefing up our upgrade testsuite.

Anyway, we can look into this stuff again in follow-ups. We've already got the code written here and it's useful in testing an important bug, so shipping this SGTM! Want to fix CI? I think we can just swap the order in which the commands are run.

cgwalters · 2020-08-17T16:57:27Z

My main concern is the focus on offline-update instead of beefing up our upgrade testsuite.

I don't understand, how does this relate to offline-update? The plan is to use this in OpenShift CI to just synthesize a new oscontainer, offline-update wouldn't be involved, and isn't in this PR either.

cgwalters · 2020-08-17T17:25:57Z

OK I'll look at adding this to rpm-ostree.

jlebon · 2020-08-17T18:32:14Z

I don't understand, how does this relate to offline-update? The plan is to use this in OpenShift CI to just synthesize a new oscontainer, offline-update wouldn't be involved, and isn't in this PR either.

Ahh OK, maybe I misunderstood. I thought in addition to using this in OpenShift CI, you were targeting offline-update to test updates to these new commits. I think we're on the same page then. :)

We want to test upgrades that actually change files as a general rule; in some cases we want to test "large" upgrades to validate performance. This code generates a "synthetic" upgrade that adds an ELF note to a percentage of ELF files (randomly selected). By doing it this way we are only actually testing one version of the code. Migrated from coreos/coreos-assembler#1635 using the Rust code from ostreedev/ostree#2127

cgwalters · 2020-08-17T19:41:35Z

Migrated to coreos/rpm-ostree#2199

We want to test upgrades that actually change files as a general rule; in some cases we want to test "large" upgrades to validate performance. This code generates a "synthetic" upgrade that adds an ELF note to a percentage of ELF files (randomly selected). By doing it this way we are only actually testing one version of the code. Migrated from coreos/coreos-assembler#1635 using the Rust code from ostreedev/ostree#2127

cgwalters · 2020-08-19T15:58:37Z

OK trying to use this for the OpenShift testing I belatedly realized I do really want this code, not the code that landed in rpm-ostree.

The reason is this code uses the libostree APIs and hence works in an ergonomic way on an archive mode repo without needing to check out the whole thing.

Whereas instead for the "exttest" model where the test runs in the target VM it's way simpler to just use the files that exist in /.

What we really want I think is to switch the oscontainer to be "exploded" rather than archive-repo-in-container...but I don't want to block on that.

cgwalters · 2020-08-19T16:07:16Z

I guess we could also stick this in some other container (i.e. not coreos-assembler) for now...though that has some overhead.

Or I could try to adjust the rpm-ostree code to work against an archive repo too...it's just messy.

jlebon · 2020-08-19T16:32:55Z

Keeping it in cosa for now is totally fine IMO. We can enhance the rpm-ostree code and dedupe from here and ostree later on.

/lgtm
(Though last time this had some CI issues; let's see if that still happens.)

cgwalters · 2020-08-19T21:15:49Z

Re-adding WIP I'm going to add some more code here wrapping this to deal with oscontainers too.

cgwalters · 2020-08-20T18:03:39Z

Ahh OK, maybe I misunderstood. I thought in addition to using this in OpenShift CI, you were targeting offline-update to test updates to these new commits. I think we're on the same page then. :)

Oh right now I realize why you thought that - because I have a commit to make it easier to use offline-update to test this code. I will split that to a separate PR. But I wasn't planning to use offline-update for the actual tests.

cgwalters · 2020-08-20T18:05:09Z

OK a whole lot of hacks here but I got it to work:

$ env REGISTRY_AUTH_FILE=$HOME/.docker/config.json cosa dev-synthesize-osupdatecontainer --percentage=70 --insecure localhost:5000/test-machine-os-content@sha256:343ba8c735527c539c50b8aa22cd5fc5600a875da8bb13bc7d047211ecafc664 registry.svc.ci.openshift.org/rhcos-devel/walters-test-dev-content:latest
...
/usr/bin: Modified 499/713 files, 524.8 MB
/usr/sbin: Modified 208/298 files, 47.6 MB
/usr/lib: Modified 0/0 files, 0 bytes
/usr/lib/systemd: Modified 31/45 files, 6.1 MB
/usr/lib64: Modified 234/335 files, 105.8 MB
Commit: faaecb4c9e8e195f0fc4ce5e4d53caa60fcf17e7239375c00a07af05242112b7
Metadata Total: 15
...
Pushing container
+ podman --root=/tmp/cosa-dev-synth-update0j0i6cnz/work/containers-storage --storage-driver vfs push --tls-verify --authfile=/var/home/walters/.docker/config.json registry.svc.ci.openshift.org/rhcos-devel/walters-test-dev-content:latest
...

cgwalters · 2020-08-20T18:17:37Z

Split out prep PRs:

cgwalters · 2020-08-20T21:31:18Z

Usage in yet another higher level script here:
https://github.com/cgwalters/workboard/blob/master/openshift/bz1850057-etcd-osupdate/generate-nontrivial-osupdate

We have "real" OS update tests which look at a previous build; this is generally good, but I want to be able to reliably test e.g. a "large" upgrade in some CI scenarios, and it's OK if the upgrade isn't "real". This command takes an ostree commit and adds a note to a percentage of ELF binaries. This way one can generate a "large" update by specifying e.g. `--percentage=80` or so. Building on that, also add a wrapper which generates an update from an oscontainer. I plan to use this for testing etcd performance during large updates; see openshift/machine-config-operator#1897

cgwalters · 2020-08-31T17:03:21Z

Lifting WIP on this, I dropped the attempt to test it in CI for now because there's some funky conflicts the pruning test (which seems to be a trap we need to fix).

jlebon · 2020-09-08T21:40:40Z

/lgtm

openshift-ci-robot · 2020-09-08T21:40:42Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cgwalters, jlebon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [cgwalters,jlebon]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci-robot requested review from lucab and zonggen August 4, 2020 02:10

openshift-ci-robot added the approved label Aug 4, 2020

cgwalters mentioned this pull request Aug 4, 2020

[WIP] Bug 1850057: update etcd followers first, use bfq on control plane openshift/machine-config-operator#1946

Closed

cgwalters force-pushed the synth-update branch 2 times, most recently from b4ead55 to c0fdd24 Compare August 5, 2020 21:49

jlebon approved these changes Aug 7, 2020

View reviewed changes

cgwalters mentioned this pull request Aug 7, 2020

Bug 1850057: Use bfq scheduler on control plane, idle I/O for rpm-ostreed openshift/machine-config-operator#1957

Merged

cgwalters closed this Aug 17, 2020

cgwalters mentioned this pull request Aug 17, 2020

Add testutils generate-synthetic-upgrade coreos/rpm-ostree#2199

Merged

cgwalters reopened this Aug 19, 2020

openshift-ci-robot assigned jlebon Aug 19, 2020

openshift-ci-robot added the lgtm label Aug 19, 2020

cgwalters changed the title ~~dev-synthesize-osupdate: New command~~ WIP: dev-synthesize-osupdate: New command Aug 19, 2020

openshift-ci-robot added the do-not-merge/work-in-progress label Aug 19, 2020

cgwalters force-pushed the synth-update branch from c0fdd24 to c784044 Compare August 20, 2020 18:04

openshift-ci-robot removed the lgtm label Aug 20, 2020

cgwalters force-pushed the synth-update branch from c784044 to d5e9133 Compare August 30, 2020 18:04

cgwalters force-pushed the synth-update branch from d5e9133 to be955a0 Compare August 30, 2020 21:30

cgwalters changed the title ~~WIP: dev-synthesize-osupdate: New command~~ dev-synthesize-osupdate: New command Aug 31, 2020

openshift-ci-robot removed the do-not-merge/work-in-progress label Aug 31, 2020

openshift-ci-robot added the lgtm label Sep 8, 2020

openshift-merge-robot merged commit 0718122 into coreos:master Sep 8, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

dev-synthesize-osupdate: New command #1635

dev-synthesize-osupdate: New command #1635

cgwalters commented Aug 4, 2020

cgwalters commented Aug 5, 2020

lucab commented Aug 6, 2020

jlebon left a comment

jlebon commented Aug 7, 2020

cgwalters commented Aug 16, 2020

jlebon commented Aug 17, 2020

cgwalters commented Aug 17, 2020

cgwalters commented Aug 17, 2020

jlebon commented Aug 17, 2020

cgwalters commented Aug 17, 2020

cgwalters commented Aug 19, 2020 •

edited

Loading

cgwalters commented Aug 19, 2020

jlebon commented Aug 19, 2020

cgwalters commented Aug 19, 2020

cgwalters commented Aug 20, 2020

cgwalters commented Aug 20, 2020

cgwalters commented Aug 20, 2020

cgwalters commented Aug 20, 2020

cgwalters commented Aug 31, 2020

jlebon commented Sep 8, 2020

openshift-ci-robot commented Sep 8, 2020

dev-synthesize-osupdate: New command #1635

dev-synthesize-osupdate: New command #1635

Conversation

cgwalters commented Aug 4, 2020

cgwalters commented Aug 5, 2020

lucab commented Aug 6, 2020

jlebon left a comment

Choose a reason for hiding this comment

jlebon commented Aug 7, 2020

cgwalters commented Aug 16, 2020

jlebon commented Aug 17, 2020

cgwalters commented Aug 17, 2020

cgwalters commented Aug 17, 2020

jlebon commented Aug 17, 2020

cgwalters commented Aug 17, 2020

cgwalters commented Aug 19, 2020 • edited Loading

cgwalters commented Aug 19, 2020

jlebon commented Aug 19, 2020

cgwalters commented Aug 19, 2020

cgwalters commented Aug 20, 2020

cgwalters commented Aug 20, 2020

cgwalters commented Aug 20, 2020

cgwalters commented Aug 20, 2020

cgwalters commented Aug 31, 2020

jlebon commented Sep 8, 2020

openshift-ci-robot commented Sep 8, 2020

cgwalters commented Aug 19, 2020 •

edited

Loading