
build storage/management #159

Closed
cgwalters opened this issue Oct 10, 2018 · 39 comments

@cgwalters (Member)

Today c-a defaults to handling "local development". I am working on orchestrating it via Jenkins and storing builds in S3. This issue is about the design of the latter.

Currently I have a script which pulls the builds.json and then fetches the meta.json from the previous build - basically setting up enough of a "skeleton" that c-a can start a new build.
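
A minimal sketch of that skeleton-pull (the bucket name, key layout, and the assumption that builds.json is `{"builds": [...]}` with the newest build first — per the ordering discussed later in this thread — are placeholders, not the actual pipeline's):

```sh
# Seed a fresh workdir with just enough state for the next build.
BUCKET=s3://example-rhcos-builds        # placeholder
aws s3 cp "${BUCKET}/builds/builds.json" builds/builds.json
# Previous build ID, assuming a newest-first "builds" array.
prev=$(jq -r '.builds[0]' builds/builds.json)
mkdir -p "builds/${prev}"
aws s3 cp "${BUCKET}/builds/${prev}/meta.json" "builds/${prev}/meta.json"
```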

One approach we could take is to offer this as a subcommand; something like storage-s3-pull or so?
Basically, we don't try to directly integrate this into a high level flow, just make it something that can be invoked by an orchestrating pipeline.

On a related topic, the Red Hat CoreOS builds use the oscontainer subcommand in this way right now.

@jlebon (Member) commented Oct 10, 2018

Could we generalize this enough so that we support this type of "seeding" just over HTTP? E.g. something like coreos-assembler pull-build https://example.com/fcos and then we expect https://example.com/fcos/builds/latest/meta.json to exist. That would hopefully work with the myriad ways artifacts can be stored. (Though we could add magic for URLs that start with s3://..., etc., for things that require specialized access.)

I wouldn't mind making this directly part of build. E.g. coreos-assembler build --from https://example.com/fcos?
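
A rough sketch of what that HTTP seeding could look like (the base URL, the latest/ convention, and the `buildid` key in meta.json are assumptions for illustration only):

```sh
# Hypothetical: seed from a plain HTTP mirror laid out as <base>/builds/latest/meta.json.
BASE=https://example.com/fcos           # placeholder
curl -fsSL "${BASE}/builds/builds.json" -o builds/builds.json
curl -fsSL "${BASE}/builds/latest/meta.json" -o /tmp/meta.json
# File it under its real build ID so it looks like a normal local build dir.
prev=$(jq -r '.buildid' /tmp/meta.json)
mkdir -p "builds/${prev}"
mv /tmp/meta.json "builds/${prev}/meta.json"
```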

@cgwalters (Member, Author)

Or part of init e.g. coreos-assembler init --from s3://fedora-coreos?

But, this is only half the problem; the other half is syncing the resulting builds back out - and we need to think about how build pruning is handled.

@dustymabe (Member)

Currently I have a script which pulls the builds.json and then fetches the meta.json from the previous build - basically setting up enough of a "skeleton" that c-a can start a new build.

Can't we just use some persistent storage? I've been setting up a PV for /srv/. My thought is that we'd prune after archiving the artifacts from the build to a public location, i.e. the prune will keep the PV at approximately the same storage usage, but will leave the last build around for the next run.

@cgwalters (Member, Author)

I've been setting up a PV for /srv/.

I'm not sure I'd want to take that approach for the RHCOS builds - S3 handles backups/redundancy, also provides a way to download via HTTP, can be used with CloudFront etc.

Were you thinking of running a webserver that mounts the PV to serve content?

@dustymabe (Member)

Were you thinking of running a webserver that mounts the PV to serve content?

Nope. I was thinking part of the pipeline would essentially archive (upload/publish/whatever you want to call it) the results of that build to a proper location and then prune N-1 in the local /srv/ directory.
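
In shell terms, the shape of that step might be something like this (destination bucket and the newest-first ordering of builds.json are assumptions, not the real pipeline):

```sh
# Hypothetical archive-then-prune step at the end of a pipeline run.
latest=$(jq -r '.builds[0]' builds/builds.json)
aws s3 sync "builds/${latest}/" "s3://example-bucket/builds/${latest}/"
# Keep only the latest build locally so the PV stays roughly constant in size.
for old in $(jq -r '.builds[1:][]' builds/builds.json); do
    rm -rf "builds/${old}"
done
# (builds.json itself would also need trimming to match; cosa's own prune logic
#  does that when it removes builds.)
```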

@jlebon (Member) commented Oct 11, 2018

But, this is only half the problem; the other half is syncing the resulting builds back out - and we need to think about how build pruning is handled.

Hmm, this could get complex if "prod" pruning falls within the scope of c-a. One way I was thinking about this was that in prod situations, we only care about the latest/ output from a coreos-assembler build. Those artifacts are then pushed, promoted, and pruned by completely separate logic since it's highly dependent on the needs of the larger project.

@dustymabe (Member)

Those artifacts are then pushed, promoted, and pruned by completely separate logic

So does that align with what I was proposing to do in #159 (comment) ?

@jlebon (Member) commented Oct 11, 2018

Kinda? I guess then we can think of the PV more like local cache instead. So it could work just fine without it (and refetch whatever minimum it needs to get the next build number right) but having it allows the fetch phase to be faster.

@cgwalters (Member, Author)

the prune will keep the PV at approximately the same storage usage, but will leave the last build around for the next run

One thing I'll note here - my current S3 scripts only download the previous build's meta.json. If you want to maintain ostree history then you'll also need the commit object in the repo-build (and repo). But that's the only requirement - you don't actually need the previous build's .qcow2 locally, for example.
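
For the history case, one way to satisfy that without mirroring the whole repo is to pull only the commit metadata from wherever the previous repo is published (the remote URL and ref below are placeholders; this is just a sketch):

```sh
# Fetch only the previous commit object (no file content) so the next commit
# can reference it as a parent.
ostree --repo=repo remote add --no-gpg-verify prev https://example.com/repo
ostree --repo=repo pull --commit-metadata-only prev example/x86_64/ref
```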

How are you thinking of storing the ostree repo? For RHCOS we're using oscontainers.

I guess then we can think of the PV more like local cache instead.

Right.

@cgwalters (Member, Author)

(Although if you want to generate deltas, then you do need the previous tree's content)

@dustymabe (Member)

Looks like I'm back to missing GitHub notifications again! The last two comments from Colin didn't come in. <sarcasm> great! </sarcasm>

@cgwalters
How are you thinking of storing the ostree repo? For RHCOS we're using oscontainers.

Pure OSTree repo for now.

@jlebon
Kinda? I guess then we can think of the PV more like local cache instead. So it could work just fine without it (and refetch whatever minimum it needs to get the next build number right) but having it allows the fetch phase to be faster.

Yeah, it would be nice, if your cache PV gets lost, to be able to say "start the build from the state at this HTTP location" - which I think is what you're trying to propose here @cgwalters?

@cgwalters (Member, Author)

Yeah, it would be nice, if your cache PV gets lost, to be able to say "start the build from the state at this HTTP location" - which I think is what you're trying to propose here @cgwalters?

Yes. It's really important when dealing with multiple copies of data to have a strong model for what "owns" the data and what copy is canonical.

That problem is why my current pipeline doesn't have a PV, just a S3 bucket.

That said, there are obvious advantages to keeping around cache/ in particular. The whole idea of unified-core is to take advantage of a cache like that. And like I said if you want to do deltas, that's where having the ostree repo is necessary.

Those artifacts are then pushed, promoted, and pruned by completely separate logic since it's highly dependent on the needs of the larger project.

Hmm. I know we want to separate c-a from "pipelines" using it, but there are clearly going to be common concerns here. My thought was that "sync to/from S3" is a script that we could carry here.

A tricky aspect of this though is my current scripts use oscontainers, not bare ostree repos.

@cgwalters (Member, Author)

OK I discovered the hard way that #137 (comment) added a reverse (builds.json is now ordered newest-first).

Moving on from that though...the problem with the current logic is that it assumes that I have all the build directories and meta.jsons locally. I can do that, but it feels like what we really want is for build --skip-prune to just append the build to builds.json or so?
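
For illustration, the "just append the build" operation is tiny if builds.json stays a flat list of IDs (assumed format; newest-first per the ordering change above):

```sh
# Prepend a new build ID to builds.json without needing the older build dirs locally.
new_id=31.20190101.0    # placeholder build ID
jq --arg id "$new_id" '.builds |= [$id] + .' builds/builds.json > builds/builds.json.new
mv builds/builds.json.new builds/builds.json
```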

@jlebon (Member) commented Oct 17, 2018

OK I discovered the hard way that #137 (comment) added a reverse (builds.json is now ordered newest-first).

Ahh sorry about not pointing this out more in the patch!

Moving on from that though...the problem with the current logic is that it assumes that I have all the build directories and meta.jsons locally. I can do that, but it feels like what we really want is for build --skip-prune to just append the build to builds.json or so?

(I guess that'd be "prepend" now).

Hmm, yeah that's tricky. I think tying this to --skip-prune makes sense though. I do wonder if we need a higher-level global var/stamp file instead for the "partially synced" state that we can key off of overall.

@cgwalters (Member, Author)

#173

@jlebon (Member) commented Dec 7, 2018

Related: coreos/rpm-ostree#1704

@jlebon (Member) commented Apr 29, 2019

Been thinking about this again now that I'm working on adding S3 capabilities to the FCOS pipeline.

One thing I'm thinking is whether cosa should in fact consider the repo/ a purely "workdir" concept, possibly even moving it under cache/, and instead include a tarball of the oscontainer in the build dir. Right now, there's this odd split of what constitutes a build: it's builds/$id + the full OSTree commit in the repo. Putting it all under builds/$id means it now fully describes the build.

This also resolves the major issue of pruning. With the current scheme, pruning the OSTree repo is problematic:

  • For example, see prune_builds: right now pruning essentially does "find the oldest build we need to keep, then prune everything else older than that". This doesn't play well with tagging, since we're then keeping all the builds since the oldest tag. We could enhance the OSTree pruning API for this, but it's never going to be as clean as rm -rf builds/$id.
  • In a prod context, one has to sync the whole repo to be able to prune it with coreos-assembler prune, which is not realistic if you have an ever-growing list of builds and limited cache storage (related: Don't require rsync'ing everything to build and prune fedora-coreos-pipeline#38). Again, this is something we could solve with more complex code, but it would just melt away if build dirs were self-contained: essentially, one would just need to sync builds.json & the meta.json files to know which build dirs to nuke from the remote/S3 (sketched after this list).
  • Related to the above, it'd be much easier in general for any higher-level pruning code outside cosa to interact with directories & JSON files rather than OSTree repos.
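
As a concrete example of that second point, remote pruning against self-contained build dirs reduces to roughly this (bucket name and keep-count are placeholders; tagged builds would need to be filtered out of the deletion list first):

```sh
# Prune old builds directly in S3 using only builds.json.
BUCKET=s3://example-fcos-builds         # placeholder
keep=3
aws s3 cp "${BUCKET}/builds/builds.json" builds.json
for id in $(jq -r ".builds[${keep}:][]" builds.json); do
    aws s3 rm --recursive "${BUCKET}/builds/${id}/"
done
jq ".builds |= .[:${keep}]" builds.json > builds.json.new
aws s3 cp builds.json.new "${BUCKET}/builds/builds.json"
```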

Note this is completely independent of how updates are actually pushed out to client machines. We can use oscontainers to store OSTree content but still unpack it and ostree pull-local into the prod repo (i.e. coreos-assembler oscontainer extract). Deltas could also be calculated at that time.
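
A sketch of that delivery path (repo paths, the ref, and the exact extract invocation are illustrative; only the oscontainer extract subcommand name is the existing one mentioned above):

```sh
# Unpack archived OSTree content and import it into the prod repo.
coreos-assembler oscontainer extract quay.io/example/os-content:latest tmp/extracted-repo  # argument order illustrative
ostree --repo=/srv/prod-repo pull-local tmp/extracted-repo example/x86_64/stable
# (static deltas against the previous ref head could be generated at this point)
```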

Also, local workdirs would still have an archive repo for e.g. testing upgrades and inspecting the compose.

The main downside of course is space efficiency. But in the devel case, we cap at 3 builds by default, and in the prod case, we upload to e.g. S3, where 600M more is not really an issue. (And of course, space is still saved in the prod repo itself).

@cgwalters (Member, Author)

and instead include a tarball of the oscontainer in the build dir.

Tarball of ostree repo or actually a container image tarball with ostree repo inside? I'd lean towards the former.

I think the main tricky thing here is whether we try to preserve ostree history - do the commits have parents? Maybe it's simplest to not have parents. If we go that route...perhaps e.g. rpm-ostree should learn how to parse cosa builds so rpm-ostree deploy works?

@cgwalters (Member, Author)

Note this is completely independent of how updates are actually pushed out to client machines. We can use oscontainers to store OSTree content but still unpack it and ostree pull-local into the prod repo (i.e. coreos-assembler oscontainer extract). Deltas could also be calculated at that time.

Ah...so are you thinking that devel builds are promoted to prod? And that'd include both ostree and images naturally?

@jlebon (Member) commented Apr 30, 2019

Tarball of ostree repo or actually a container image tarball with ostree repo inside? I'd lean towards the former.

Yeah, that's fine. My initial thought was oscontainer since we have code that exists today for this. But yeah, we should definitely discuss bundling formats if we go this way.

Ah...so are you thinking that devel builds are promoted to prod? And that'd include both ostree and images naturally?

OK right, let's tie this now with the current plans for FCOS stream tooling.

First, specifically to your question, promotions imply rebuilds right now (i.e. promoting from testing to stable means doing some kind of custom git merge and then triggering a new stable build).

The way FCOS is shaping up, we'll be storing build artifacts in S3, but we'll be dealing with two OSTree repos, the main prod repo (at https://ostree.fedoraproject.org/) for the prod refs and an "annex" repo for the mechanical & devel refs.

There are 9 separate streams in total (let's not bring multi-arch into this yet..). A reasonable assumption here is that we want to be able to execute builds on those streams concurrently. This would imply e.g. 9 separate cosa "build root" dirs in the bucket, each with their own builds/ dir.

From a multi-stream perspective, having a separate ostree/ repo per stream doesn't really make sense. It's much easier to manage and interact with fewer repos that hold multiple refs. This in itself is good motivation for keeping OSTree content in the build dir.

My strawman right now is:

  • cosa build tarballs OSTree content into build dirs
  • each stream has a separate build dir in the bucket (e.g. s3://fcos-builds/streams/$stream/builds/$buildid)
  • service watches for new builds across non-prod streams and pulls in new OSTree content into the annex repo
  • service watches for new builds across prod streams and pulls in new OSTree content into the prod repo

OSTree repos and build dirs can then be pruned according to different policies, which I think makes sense since one is about first installs, while the other is about upgrades. (E.g. if Cincinnati uses the OSTree repo to build its graph, then it would make sense to keep OSTree content for much longer).

We definitely lose on network efficiency here by downloading the full tarball to update the ref even if just a few files changed. I think that tradeoff is worth it though.
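
The "service watches for new builds" pieces could start out as a dumb polling loop along these lines (stream names, state handling, and the location of builds.json within each stream's builds/ dir are assumptions; only the s3://fcos-builds/streams/$stream layout comes from the strawman above):

```sh
# Poll each stream's builds.json and note builds we haven't synced yet.
for stream in testing stable next; do          # placeholder stream names
    aws s3 cp "s3://fcos-builds/streams/${stream}/builds/builds.json" "${stream}-builds.json"
    latest=$(jq -r '.builds[0]' "${stream}-builds.json")
    if ! grep -qx "${stream} ${latest}" synced-builds.txt 2>/dev/null; then
        echo "new build ${latest} on ${stream}; pull its OSTree tarball into the annex/prod repo"
        # ...download the tarball from the build dir and ostree pull-local it...
        echo "${stream} ${latest}" >> synced-builds.txt
    fi
done
```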

@jlebon (Member) commented Apr 30, 2019

service watches for new builds across prod streams and pulls in new OSTree content into the prod repo

This might be overly simplistic. We may want to gate this on the release process instead of making it automatic.

I think the main tricky thing here is whether we try to preserve ostree history - do the commits have parents? Maybe it's simplest to not have parents. If we go that route...perhaps e.g. rpm-ostree should learn how to parse cosa builds so rpm-ostree deploy works?

Hmm, that's an interesting question. Another casualty of not preserving history, apart from ostree log and rpm-ostree deploy, is that it might also make pruning more complicated.

I think I agree though that it's cleaner for OSTree commits cosa creates to not have parents. E.g. we might recompose a prod stream multiple times and not necessarily publish all the commits to prod.

One thing we could do is "graft" the commit onto the ref, preserving history, as part of the service that syncs OSTree content? We wouldn't have the same commit checksum, but still the same content checksum.
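
A hedged sketch of such a graft with stock ostree tooling (repo path, ref, and how the new commit arrives are placeholders; commit metadata like the version would also need copying, which is omitted here):

```sh
REPO=/srv/prod-repo                      # placeholder
REF=example/x86_64/stable                # placeholder
NEW_COMMIT=abc123                        # checksum of the new, parentless commit (placeholder)
parent=$(ostree --repo="${REPO}" rev-parse "${REF}")
# Re-commit the same tree with the current ref head as parent: same content
# checksum, different commit checksum.
ostree --repo="${REPO}" commit --branch="${REF}" \
       --tree=ref="${NEW_COMMIT}" --parent="${parent}"
```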

jlebon added a commit to jlebon/rpm-ostree that referenced this issue Apr 30, 2019
This is useful for tracking OSTree content across a pipeline.

See related discussions in
coreos/coreos-assembler#159.
rh-atomic-bot pushed a commit to coreos/rpm-ostree that referenced this issue Apr 30, 2019
This is useful for tracking OSTree content across a pipeline.

See related discussions in
coreos/coreos-assembler#159.

Closes: #1822
Approved by: cgwalters
@cgwalters (Member, Author)

My initial thought was oscontainer since we have code that exists today for this

The other option is rojig...one powerful advantage of that is that it can easily be used again as input to a build to regenerate it (possibly with some targeted changes).

@cgwalters (Member, Author)

Also on this topic: personally I've been playing with https://git-annex.branchable.com/ a lot lately. One thing to consider that could make a lot of sense is to commit cosa builds into it - if we included the input RPMs (and, to follow on the previous comment, the rojig RPM) we'd have everything nicely versioned. It also gives us an abstraction layer that supports syncing to S3 as well as other backends.

jlebon added a commit to jlebon/coreos-assembler that referenced this issue May 17, 2019
Rather than keeping OSTree data separately in the toplevel `repo/`, make
it part of the build directory. This solves a bunch of issues and makes
things conceptually clearer.

See discussions in:
coreos#159
dustymabe pushed a commit that referenced this issue May 24, 2019
Rather than keeping OSTree data separately in the toplevel `repo/`, make
it part of the build directory. This solves a bunch of issues and makes
things conceptually clearer.

See discussions in:
#159
@bgilbert (Contributor)

@cgwalters I've used git-annex for a few years and I like it, but it's a complex external dependency that could also be unpleasant to automate.

@jlebon (Member) commented Jul 17, 2019

I think the main tricky thing here is whether we try to preserve ostree history - do the commits have parents? Maybe it's simplest to not have parents. If we go that route...perhaps e.g. rpm-ostree should learn how to parse cosa builds so rpm-ostree deploy works?

Hmm, that's an interesting question. Another casualty of not preserving history, apart from ostree log and rpm-ostree deploy, is that it might also make pruning more complicated.

I think I agree though that it's cleaner for OSTree commits cosa creates to not have parents. E.g. we might recompose a prod stream multiple times and not necessarily publish all the commits to prod.

Some more thoughts about this.

While I definitely like the idea conceptually behind keeping cosa OSTree commits independent, I think there's a lot of friction in moving away from maintaining OSTree history. We mentioned above some casualties: ostree log, rpm-ostree deploy, and ostree prune. I'll just go into some details on those to give more context.

If we deliver independent OSTree commits, then the OSTree ref will always point at the latest commit only. This in turn means that for Zincati to be able to safely upgrade hosts, it will need to use e.g. rpm-ostree rebase fedora:<SHA256> instead of deploy <SHA256> (which by default ensures that the new commit is on the same branch). And this in turn means that rpm-ostree status no longer shows the ref the system is on, but rather just the SHA256 (and version, which to be fair is how it is in RHCOS today). But this also means that a manual rpm-ostree upgrade would no longer work (which is irrelevant in RHCOS but not FCOS).

As for pruning, any commit older than the latest one will be "orphaned", which means that the default ostree prune --refs-only will delete them. So we would have to enhance ostree prune so it can take e.g. a set of protected commits... awkward, and prone to mishaps.

One thought I had on those two issues is that we could use the ref-binding work in OSTree. This is something we can do because we always rebuild on promotions. So e.g. deploy <SHA256> could learn to accept the commit if it has a matching ref binding. Similarly, ostree prune could learn a --protect-ref-binding which just omits commits with a given ref binding. deploy <VERSION> and ostree log would still be broken though.

One thing we could do is "graft" the commit onto the ref, preserving history, as part of the service that syncs OSTree content? We wouldn't have the same commit checksum, but still the same content checksum.

The issue with "grafting" is that we're not just delivering OSTrees, we're delivering whole OS images with OSTree commits embedded in them (and then signing those, see coreos/fedora-coreos-tracker#200 (comment)). So any discussion around a grafting strategy needs to address this.

At the same time, I don't want to go down the path of FAH, where when releasing an OSTree, we also "release" (make public) all the intermediate OSTree commits since the last release. This is essentially implementation details leaking into our release process.

For FCOS, we could improve greatly on this though by explicitly passing the last released build to cosa build so that all builds have the latest release as their parent. So the builds are still independent from each other, but just not from the release process (which is already the case when you think about e.g. streams, versioning, and promotions).
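
For illustration, on the pipeline side that could look roughly like this (the switch name matches the `--parent` option referenced by the commits further down this thread; the release-index URL and JSON field names are placeholders):

```sh
# Fetch the last release and feed its OSTree commit to cosa as the parent.
curl -fsSL https://example.com/fcos/streams/testing/releases.json -o releases.json
parent_commit=$(jq -r '.releases[-1].ostree_commit' releases.json)   # placeholder field name
coreos-assembler build --parent "${parent_commit}"
```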


So my conclusion on this is that while we could fully move away from maintaining OSTree history, it will require some amount of non-trivial work. But we need a solution for right now (i.e. for the next FCOS build we want to release). My suggestion is to enhance cosa build as mentioned above, while we evaluate (1) whether this is something that we want to do, and (2) how we want to rework our tools to do it.

I think if we do it right, it could turn out really well. (E.g. a completely different way is abstracting away the OSTree repo and going along Colin's suggestion to make rpm-ostree aware of cosa (or rather FCOS) metadata. The result could be a richer, more meaningful UX).

jlebon added a commit to jlebon/coreos-assembler that referenced this issue Jul 18, 2019
We want to have full control over the parent of an OSTree commit so it
can be driven at a higher level. IOW, cosa as used in a CI/CD setup is
not in a position to know what the parent of the commit should be. So
either it should default to *no parent*, or accept an override for a
specific parent at build time.

This will be used by FCOS at least for the time being. See:
coreos#159 (comment)
@jlebon (Member) commented Jul 18, 2019

OK, I've put up #625.

jlebon added a commit to jlebon/fedora-coreos-pipeline that referenced this issue Jul 19, 2019
We want to maintain OSTree history between our releases. To do this, we
fetch the latest release from the release index, and pass it to cosa
through the `--parent` switch.

For more information, see:
coreos/coreos-assembler#159 (comment)
jlebon added a commit that referenced this issue Jul 19, 2019
We want to have full control over the parent of an OSTree commit so it
can be driven at a higher level. IOW, cosa as used in a CI/CD setup is
not in a position to know what the parent of the commit should be. So
either it should default to *no parent*, or accept an override for a
specific parent at build time.

This will be used by FCOS at least for the time being. See:
#159 (comment)
jlebon added a commit to coreos/fedora-coreos-pipeline that referenced this issue Jul 24, 2019
We want to maintain OSTree history between our releases. To do this, we
fetch the latest release from the release index, and pass it to cosa
through the `--parent` switch.

For more information, see:
coreos/coreos-assembler#159 (comment)
@dustymabe (Member)

At the same time, I don't want to go down the path of FAH, where when releasing an OSTree, we also "release" (make public) all the intermediate OSTree commits since the last release. This is essentially implementation details leaking into our release process.

+1 - we just never got around to making that cleaner

For FCOS, we could improve greatly on this though by explicitly passing the last released build to cosa build so that all builds have the latest release as their parent. So the builds are still independent from each other, but just not from the release process (which is already the case when you think about e.g. streams, versioning, and promotions).

+100 - I really like that

@jlebon (Member) commented Sep 19, 2019

Feels like we can probably close this issue at this point?

@jlebon (Member) commented Sep 21, 2020

Closing this as per last comment.

@jlebon closed this as completed Sep 21, 2020