build storage/management #159
Could we generalize this enough so that we support this type of "seeding" just over HTTP? E.g. something like [...]. I wouldn't mind making this directly part of [...].
Or part of [...]. But this is only half the problem; the other half is syncing the resulting builds back out - and we need to think about how build pruning is handled.
Can we not have some persistent storage that is used? I've been setting up a PV for [...].
I'm not sure I'd want to take that approach for the RHCOS builds - S3 handles backups/redundancy, provides a way to download via HTTP, can be used with CloudFront, etc. Were you thinking of running a webserver that mounts the PV to serve content?
Nope. I was thinking part of the pipeline would essentially archive (upload/publish/whatever you want to call it) the results of that build to a proper location and then prune N-1 in the local [...].
Hmm, this could get complex if "prod" pruning falls within the scope of c-a. One way I was thinking about this was that in prod situations, we only care about the [...].
So does that align with what I was proposing to do in #159 (comment)?
Kinda? I guess then we can think of the PV more like a local cache instead. So it could work just fine without it (and refetch whatever minimum it needs to get the next build number right), but having it allows the [...].
One thing I'll note here - my current S3 scripts only download the previous build's [...]. How are you thinking of storing the ostree repo? For RHCOS we're using oscontainers.
Right.
(Although if you want to generate deltas, then you do need the previous tree's content.)
Looks like I'm back to missing GitHub notifications again! The last two comments from Colin didn't come in.
Pure OSTree repo for now.
Yeah, it would be nice, if your cache PV gets lost, to be able to say "start the build from the state at this HTTP location" - which I think is what you're trying to propose here, @cgwalters?
Yes. It's really important when dealing with multiple copies of data to have a strong model for what "owns" the data and what copy is canonical. That problem is why my current pipeline doesn't have a PV, just an S3 bucket. That said, there are obvious advantages to keeping around [...].
Hmm. I know we want to separate c-a from the "pipelines" using it, but there are clearly going to be common concerns here. My thought was that "sync to/from S3" is a script that we could carry here. A tricky aspect of this, though, is that my current scripts use oscontainers, not bare ostree repos.
OK, I discovered the hard way that #137 (comment) added [...]. Moving on from that though... the problem with the current logic is that it assumes that I have all the build directories and [...].
Ahh sorry about not pointing this out more in the patch!
(I guess that'd be "prepend" now.) Hmm, yeah, that's tricky. I think tying this to [...].
Related: coreos/rpm-ostree#1704
Been thinking about this again now that I'm working on adding S3 capabilities to the FCOS pipeline. One thing I'm thinking is whether cosa should in fact consider the OSTree content part of the build itself, rather than a separate toplevel `repo/`. This also resolves the major issue of pruning. With the current scheme, pruning the OSTree repo is problematic: [...]
Note this is completely independent of how updates are actually pushed out to client machines. We can use oscontainers to store OSTree content but still unpack it and [...]. Also, local workdirs would still have an archive repo for e.g. testing upgrades and inspecting the compose. The main downside of course is space efficiency. But in the devel case, we cap at 3 builds by default, and in the prod case, we upload to e.g. S3, where 600M more is not really an issue. (And of course, space is still saved in the prod repo itself.)
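To make the pruning point concrete, here's a minimal sketch of what pruning becomes if each build directory fully owns its OSTree content (e.g. a repo tarball alongside the images): no `ostree prune` reasoning at all, just directory removal. The paths and the `builds.json` shape (a newest-first list of build IDs) are assumptions for illustration:

```bash
# Prune to the newest N builds; each build dir is self-contained, so
# dropping its OSTree content is just part of deleting the directory.
keep=3
mapfile -t builds < <(jq -r '.builds[]' builds/builds.json)
for old in "${builds[@]:${keep}}"; do
    rm -rf "builds/${old}"
done
# Rewrite the index to match.
jq "{builds: .builds[:${keep}]}" builds/builds.json > builds/builds.json.new
mv builds/builds.json.new builds/builds.json
```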
Tarball of the ostree repo, or actually a container image tarball with the ostree repo inside? I'd lean towards the former. I think the main tricky thing here is whether we try to preserve ostree history - do the commits have parents? Maybe it's simplest to not have parents. If we go that route... perhaps e.g. rpm-ostree should learn how to parse cosa builds so [...].
Ah... so are you thinking that devel builds are promoted to prod? And that'd naturally include both the ostree content and the images?
Yeah, that's fine. My initial thought was oscontainer since we have code that exists today for this. But yeah, we should definitely discuss bundling formats if we go this way.
OK right, let's tie this in now with the current plans for FCOS stream tooling. First, specifically to your question: promotions imply rebuilds right now (i.e. promoting from [...]). The way FCOS is shaping up, we'll be storing build artifacts in S3, but we'll be dealing with two OSTree repos: the main prod repo (at https://ostree.fedoraproject.org/) for the prod refs, and an "annex" repo for the mechanical & devel refs. There are 9 separate streams in total (let's not bring multi-arch into this yet...). A reasonable assumption here is that we want to be able to execute builds on those streams concurrently. This would imply e.g. 9 separate cosa "build root" dirs in the bucket, each with their own `builds.json`. From a multi-stream perspective, having a separate [...]. My strawman right now is: [...]
OSTree repos and build dirs can then be pruned according to different policies, which I think makes sense since one is about first installs, while the other is about upgrades. (E.g. if Cincinnati uses the OSTree repo to build its graph, then it would make sense to keep OSTree content for much longer.) We definitely lose on network efficiency here by downloading the full tarball to update the ref even if just a few files changed. I think that tradeoff is worth it though.
This might be overly simplistic. We may want to gate this on the release process instead of making it automatic.
Hmm, that's an interesting question. Another casualty of not preserving history, apart from [...]. I think I agree though that it's cleaner for the OSTree commits cosa creates to not have parents. E.g. we might recompose a prod stream multiple times and not necessarily publish all the commits to prod. One thing we could do is "graft" the commit onto the ref, preserving history, as part of the service that syncs OSTree content? We wouldn't have the same commit checksum, but still the same content checksum.
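For illustration, the grafting step could be done with the stock `ostree` CLI: re-commit the build's content tree with the current ref head as the parent. The repo path and ref here are placeholders, and a real implementation would also need to carry over commit metadata:

```bash
# Graft a parentless build commit onto the prod ref, preserving history.
# The commit checksum changes, but the content checksum stays the same.
repo=/srv/prod-repo
ref=fedora/x86_64/coreos/testing
build_commit=$1   # the parentless commit imported from the cosa build
parent=$(ostree --repo="${repo}" rev-parse "${ref}")

ostree --repo="${repo}" commit \
    --branch="${ref}" \
    --tree=ref="${build_commit}" \
    --parent="${parent}"
```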
This is useful for tracking OSTree content across a pipeline. See related discussions in coreos/coreos-assembler#159. Closes: #1822 Approved by: cgwalters
The other option is rojig... one powerful advantage of that is that it can easily be used again as input to a build to regenerate it (possibly with some targeted changes).
Also on this topic: personally, I've been playing with https://git-annex.branchable.com/ a lot lately. One thing to consider that could make a lot of sense is to commit cosa builds into it - if we included the input RPMs (and, to follow on the previous comment, the rojig RPM), we'd have everything nicely versioned. It also gives us an abstraction layer that e.g. supports syncing to S3, but also other backends.
Rather than keeping OSTree data separately in the toplevel `repo/`, make it part of the build directory. This solves a bunch of issues and makes things conceptually clearer. See discussions in: coreos#159
@cgwalters I've used git-annex for a few years and I like it, but it's a complex external dependency that could also be unpleasant to automate.
Some more thoughts about this. While I definitely like the idea conceptually behind keeping cosa OSTree commits independent, I think there's a lot of friction in moving away from maintaining OSTree history. We mentioned some casualties above. If we deliver independent OSTree commits, then the OSTree ref will always point at the latest commit only. This in turn means that for Zincati to be able to safely upgrade hosts, it will need to use e.g. [...]. As for pruning, any commit older than the latest one will be "orphaned", which means that the default [...]. One thought I had on those two issues is that we could use the ref-binding work in OSTree. This is something we can do because we always rebuild on promotions. So e.g. [...]
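For reference, ref bindings are stored in the commit metadata itself, so tooling can check which ref(s) a commit was bound to at compose time. A small sketch with placeholder repo path and ref:

```bash
# Show the ref(s) this commit was bound to when it was created.
repo=/srv/repo
ref=fedora/x86_64/coreos/stable
commit=$(ostree --repo="${repo}" rev-parse "${ref}")
ostree --repo="${repo}" show "${commit}" --print-metadata-key=ostree.ref-binding
```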
The issue with "grafting" is that we're not just delivering OSTrees; we're delivering whole OS images with OSTree commits embedded in them (and then signing those; see coreos/fedora-coreos-tracker#200 (comment)). So any discussion around a grafting strategy needs to address this. At the same time, I don't want to go down the path of FAH, where, when releasing an OSTree, we also "release" (make public) all the intermediate OSTree commits since the last release. That is essentially implementation details leaking into our release process. For FCOS, we could improve greatly on this by explicitly passing the last released build to cosa. So my conclusion on this is that while we could fully move away from maintaining OSTree history, it will require some amount of non-trivial work. But we need a solution right now (i.e. for the next FCOS build we want to release). My suggestion is to enhance cosa to accept an explicit parent commit override at build time. I think if we do it right, it could turn out really well. (E.g. a completely different way is abstracting away the OSTree repo and going along with Colin's suggestion to make rpm-ostree aware of cosa (or rather FCOS) metadata. The result could be a richer, more meaningful UX.)
We want to have full control over the parent of an OSTree commit so it can be driven at a higher level. IOW, cosa as used in a CI/CD setup is not in a position to know what the parent of the commit should be. So either it should default to *no parent*, or accept an override for a specific parent at build time. This will be used by FCOS at least for the time being. See: coreos#159 (comment)
OK, I've put up #625.
We want to maintain OSTree history between our releases. To do this, we fetch the latest release from the release index, and pass it to cosa through the `--parent` switch. For more information, see: coreos/coreos-assembler#159 (comment)
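A sketch of that pipeline step, assuming a stream-level release index shaped roughly like FCOS's `releases.json`; the URL and JSON paths here are assumptions, and only the `--parent` switch is taken from the commit message above:

```bash
# Fetch the last released build's OSTree commit and maintain history
# across releases by passing it to cosa as the parent.
index_url="https://builds.coreos.fedoraproject.org/prod/streams/stable/releases.json"
parent=$(curl -sSf "${index_url}" \
  | jq -r '.releases[-1].commits[] | select(.architecture == "x86_64").checksum')

cosa build --parent="${parent}"
```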
+1 - we just never got around to making that cleaner
+100 - I really like that
Feels like we can probably close this issue at this point?
Closing this as per last comment. |
Today c-a defaults to handling "local development". I am working on orchestrating it via Jenkins and storing builds in S3. This issue is about the design of the latter.
Currently I have a script which pulls the `builds.json` and then fetches the `meta.json` from the previous build - basically setting up enough of a "skeleton" that c-a can start a new build.

One approach we could take is to offer this as a subcommand; something like `storage-s3-pull` or so? Basically, we don't try to directly integrate this into a high-level flow, just make it something that can be invoked by an orchestrating pipeline.

On a related topic, the Red Hat CoreOS builds use the `oscontainer` subcommand in this way right now.
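As a rough illustration of that `storage-s3-pull` idea, here's a minimal sketch assuming the `aws` CLI and `jq`, and assuming `builds.json` is a newest-first list of build IDs under a `builds/` prefix in the bucket (the bucket layout and JSON shape are placeholders, not cosa's actual scheme):

```bash
#!/usr/bin/env bash
# Hypothetical storage-s3-pull: seed a workdir skeleton from S3 so that
# cosa can compute the next build ID without syncing every build down.
set -euo pipefail

bucket=${1:?usage: storage-s3-pull BUCKET}

mkdir -p builds
aws s3 cp "s3://${bucket}/builds/builds.json" builds/builds.json

# Fetch only the previous (latest) build's meta.json.
latest=$(jq -r '.builds[0]' builds/builds.json)
mkdir -p "builds/${latest}"
aws s3 cp "s3://${bucket}/builds/${latest}/meta.json" "builds/${latest}/meta.json"
```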