Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

copy/multiple: use more flexible EditInstances instead of UpdateInstances #1883

Merged
merged 1 commit into from
Jun 3, 2023

Conversation

flouthoc
Copy link
Contributor

@flouthoc flouthoc commented Mar 13, 2023

After #1896 c/image now supports new API EditInstances which is more flexible in terms of Adding new instances or Modifying existing instances so lets use that.

@flouthoc
Copy link
Contributor Author

@mtrmac I was thinking a split like this, I will add unit tests for this new functions once the idea looks good to you, although I have a doubt if this is created additional loop.

cc @giuseppe @vrothberg

Copy link
Collaborator

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

The design of this is fairly strongly related to the design of the expected future internal/manifest.List.EditInstances.

So let’s think what that would look like.

(It might even make sense to prototype that — there’s some chance that we don’t need a package-private instanceCopy struct at all, that we can just use a partially-filled input to the EditInstances call. But, anyway, this is all internal, so we can change that data structure in when working on EditInstances easily enough.)


One option is for the EditInstances operation to accept “the desired end state”, i.e. the list of images that the resulting manifest list should contain, and how to get them.

Another option is for the EditInstances operation to accept “updates”, i.e. what to replace, what to add, what to delete.

The differences are, I think:

  • Primarily, how deletes are represented. With “end state”, the caller must make a list of all remaining instances, with “updates”, it’s one entry
  • How instance ordering / preference is handled. Currently it is possible for a manifest list to contain several possible “matching” instances (compare the recent work on ChooseInstance and all that logic), and the generic copy code probably doesn’t understand everything relevant about the ordering / preference logic (in particular, I’m not sure we want to hard-code “earlier match is preferred” here in the generic code.) With “end state”, the caller might end up being responsible for the correct order when adding an instance, with “updates”, it’s the callee.

It seems to me that these suggest that the “updates” model is probably better; it potentially allows calling EditInstances in something like buildah manifest without involving c/image/copy at all, while still benefiting from the full ordering knowledge.

Does that make sense to you? (I didn’t actually read how buildah manifest / podman manifest works in detail. Maybe I should do that first?)


Now… the “updates” data is something we can build, but we don’t have a good way to consume it in the existing UpdateInstances; so, actually, maybe we should build some subset of the new List.EditInstances first?

Alternatively, we could build a minimal/restricted version of the consumption of the updates here in this PR, and move it into EditInstances later:

updates := map(instanceDigests, updatedList.Instance) // preserve by default
for _, edit := range decidedUpdateEdits {
    switch edit {
    case Replace:
        i := index(updates, update.sourceDigest); assert i != nil
        updates[i].Digest = updatedDigest
        // Update other fields…
    }
}

(The detailed review comments are probably premature.)

copy/multiple.go Outdated
imgspecv1 "github.com/opencontainers/image-spec/specs-go/v1"
"github.com/sirupsen/logrus"
"golang.org/x/exp/slices"
)

type CopyOperation int
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This enum, as well as the values, should be package-private (or in some internal package).

Also name it something more specific — everything in this subpackage is related to copies. Maybe instanceOperation?

The enum values should probably share some common prefix, not just generic ignore.

copy/multiple.go Outdated
!slices.Contains(options.Instances, instanceDigest) {
update, err := updatedList.Instance(instanceDigest)
for i, copyInstance := range copyInstances {
if copyInstance.op == Ignore {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a switch, with a default that fails with by returning an internal error.

copy/multiple.go Outdated
Size: int64(len(updatedManifest)),
MediaType: updatedManifestType,
if copyInstance.op == CopyWithoutModification {
logrus.Debugf("Copying instance %s (%d/%d)", copyInstance.digest, i+1, len(copyInstances))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The second parameter should be imagesToCopy I think

copy/multiple.go Outdated
Comment on lines 25 to 28
CopyWithoutModification
//TODO: following states must be implemented
//CopyWithModification
//CopyAndReplicate
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m not sure what is the difference between these three. (Notably the current copy is not always “without modification” either).

copy/multiple.go Outdated
imgspecv1 "github.com/opencontainers/image-spec/specs-go/v1"
"github.com/sirupsen/logrus"
"golang.org/x/exp/slices"
)

type CopyOperation int

const (
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self:

Things we might eventually need to represent:

  • Preserve an entry, don’t change it in any way, don’t copy the instance
  • Copy the instance (with some, possibly empty, modification properties), update an existing entry
  • Copy the instance (with some, possibly empty, modification properties), add a new entry
  • Completely add a new pre-existing instance??? (Does buildah manifest need that?)
  • Delete an instance? (Does buildah manifest need that?)

copy/multiple.go Outdated
Comment on lines 41 to 43
if options.ImageListSelection == CopySpecificImages {
imagesToCopy = len(options.Instances)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I wouldn’t mind counting the items one by one as the relevant operations are chosen, or possibly even doing one more iteration over the loop just to count.; I think we’ll end up doing one of these a future version anyway, and it’s more obviously correct without having to think — so we might as well start now.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this is relevent anymore EditInstances directly take list of updates which can be directly updated.

copy/multiple.go Outdated Show resolved Hide resolved
@flouthoc flouthoc force-pushed the refactor-copy-multiple-images branch 2 times, most recently from 941cd13 to e8bd674 Compare March 18, 2023 11:54
@flouthoc
Copy link
Contributor Author

flouthoc commented Mar 20, 2023

@mtrmac Should I create PR for EditInstances first and then we should come back on this ? WDYT ? Although I made some changes in this current PR as well.

@mtrmac
Copy link
Collaborator

mtrmac commented Mar 20, 2023

Sure, the copy refactor and adding EditInstances are conceptually separate but they need to be at least designed concurrently. So working on EditInstances SGTM. Or possibly do the two in one PR, although I think smaller PRs make it easier to keep track of discussions.

(The PRs also don’t need to be perfectly clean — it’s fine to refactor copy here, and then in the EditInstances one, to adjust copy further. It just seems like an unnecessary effort to design/implement one approach first, including the tests, and then to completely abandon that and to go in a different direction in the next PR.)

Ultimately this is up to you — I think the smaller PRs are easier, but you might have a different opinion on that.

@mtrmac mtrmac added the kind/feature A request for, or a PR adding, new functionality label Apr 28, 2023
@flouthoc
Copy link
Contributor Author

flouthoc commented Jun 1, 2023

It seems #1896 is about to get merged, i am getting back to this.

@flouthoc flouthoc force-pushed the refactor-copy-multiple-images branch from e8bd674 to e7a0571 Compare June 2, 2023 06:02
@flouthoc flouthoc changed the title copy/multiple: split how images are selected for copy copy/multiple: use more flexible EditInstances instead of UpdateInstances Jun 2, 2023
@flouthoc flouthoc requested a review from mtrmac June 2, 2023 06:04
@flouthoc
Copy link
Contributor Author

flouthoc commented Jun 2, 2023

@mtrmac I think after #1896 my old approach is not even valid anymore since EditInstances directly supports ListOperation which does everything we need.

I think we will just need another case which can handle case for ListOpAdd for replicating zstd images but that needs to be done later on. Do you think we need anything else in this PR ?

Copy link
Collaborator

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original PR was intended to split the logic deciding what to copy (which will become a bit more complex with the need to see which instances need a new copy added), so that we can write and unit-test that in isolation. I.e. to split the decision what to copy and the execution of that.

I think that’s still something that would be valuable to build — the ability to cheaply unit test that logic, as well as splitting copyMultipleImages into smaller pieces, seem useful.

In the meantime, what this PR is now, just a migration from UpdateInstances to EditInstances is independently a valuable step. So let’s do it!


Code for this LGTM, just some naming / stylistic nits.

copy/multiple.go Outdated Show resolved Hide resolved
copy/multiple.go Outdated Show resolved Hide resolved
copy/multiple.go Outdated Show resolved Hide resolved
mtrmac

This comment was marked as duplicate.

After containers#1896 `c/image` now
supports new API `EditInstances` which is more flexible in terms of
`Adding` new instances or `Modifying` exisiting instances so lets use
that.

Signed-off-by: Aditya R <[email protected]>
@flouthoc flouthoc force-pushed the refactor-copy-multiple-images branch from ea844c1 to b70f961 Compare June 3, 2023 04:17
@flouthoc
Copy link
Contributor Author

flouthoc commented Jun 3, 2023

@mtrmac PTAL again.

@flouthoc flouthoc requested a review from mtrmac June 3, 2023 04:18
Copy link
Collaborator

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@mtrmac mtrmac merged commit f5f7ac8 into containers:main Jun 3, 2023
@flouthoc
Copy link
Contributor Author

flouthoc commented Jun 4, 2023

@mtrmac I created a new PR to split this #1982

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature A request for, or a PR adding, new functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants