Option to set custom layer size via environment variables #374
This raises two questions for me. How would a client know the minimum size to use per chunk? Does this affect the ability to recover from a partial chunk upload?
Can a registry implementation claim to be conformant if it sets a minimum size on chunks?
@sudo-bmitch, a client can't really know, I guess. It would have to be part of the layer/image upload protocol for clients to make such assumptions. Regardless, it shouldn't affect resuming a failed upload, because AFAIK when an upload fails, the retry re-sends individual parts rather than arbitrary slices. In other words, a minimum size would change how a layer is chunked client-side, not how a failed part is retried.
@verokarhu, this is the big question for me. I personally feel like this breaks conformance. Defining a minimum size on chunks wouldn't actually be all that simple, and it would definitely require some changes to the spec, since nothing like this (for or against it) exists in the spec today.
@jay-dee7 I'm not opposed to adding a recommended minimum chunk size to the spec, since we recently added a recommended maximum manifest size. I'm leaning towards an inverse of the 413 HTTP status code, with a minimum size passed from the server to the client in the error, perhaps in a header. How does S3 implement this error? Perhaps there's some logic that can be copied. I maintain one of the few clients that support chunked uploads with multiple chunks, so there isn't much to break in this part of the spec. The conformance issues you encountered are because I recently added those tests after finding issues with multiple registries, so these are good things to sort out now.
I'm a little glad to hear that. Looking at the AWS S3 docs (I don't have access to an AWS account to actually test it), the S3 API returns an `EntityTooSmall` error when a part is below the minimum size. If we want to follow a path similar to that of 413, maybe we do something like returning a minimum chunk size header (e.g. `OCI-Chunk-Min-Length`) on the POST response.
If the layer is smaller than that header value, the client goes on to upload it via a single POST, or POST then PUT.
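To make that concrete, here's a rough client-side sketch (the `OCI-Chunk-Min-Length` name is just the one proposed above, not anything in the spec yet, and the registry URL, repository, and sizes are placeholders):

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
)

func main() {
	// POST /v2/<name>/blobs/uploads/ opens a blob upload session.
	// Registry URL and repository name are placeholders.
	resp, err := http.Post(
		"https://registry.example.com/v2/library/hello/blobs/uploads/", "", nil)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Proposed header: the server advertises its minimum chunk size.
	minChunk, _ := strconv.ParseInt(
		resp.Header.Get("OCI-Chunk-Min-Length"), 10, 64)

	layerSize := int64(23) // e.g. the tiny blobs the conformance suite uses today

	if minChunk > 0 && layerSize < minChunk {
		// Layer is smaller than the advertised minimum: fall back to a
		// monolithic upload (single POST, or POST then PUT).
		fmt.Println("uploading monolithically")
		return
	}
	fmt.Printf("uploading in chunks of at least %d bytes\n", minChunk)
}
```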
@sudo-bmitch do you think the above approach makes sense?
This feels like the right path. We've been adding headers for other use cases, and adding it to the POST response gives clients the ability to avoid errors.
SGTM
After hearing others describe the scenario on the calls, +1 on the header. Looking forward to seeing a PR for it.
I'll work on a PR for the header and have it ready to discuss on our call next week.
@jay-dee7 I've got a draft PR #391 up for review; however, I don't have a registry available where it can be tested (I need a test for the test). I'm assuming at least some clients will do a final partial PATCH followed by a zero-length PUT. I think working around that on the client side is feasible (requiring the final partial chunk to be sent with the PUT), but I figured I'd start with the most flexible option to minimize changes to the spec.
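For reference, the client behaviour I'm expecting looks roughly like this sketch (upload URL and digest are placeholders; a real client follows the Location header returned on each response):

```go
package upload

import (
	"bytes"
	"net/http"
)

// finishUpload sends the final (possibly partial) chunk as a PATCH and
// then closes the session with a zero-length PUT carrying the digest.
func finishUpload(client *http.Client, uploadURL, digest string, lastChunk []byte) error {
	req, err := http.NewRequest(http.MethodPatch, uploadURL, bytes.NewReader(lastChunk))
	if err != nil {
		return err
	}
	req.Header.Set("Content-Type", "application/octet-stream")
	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	resp.Body.Close()

	// Zero-length body; the digest query parameter completes the upload.
	req, err = http.NewRequest(http.MethodPut, uploadURL+"?digest="+digest, nil)
	if err != nil {
		return err
	}
	resp, err = client.Do(req)
	if err != nil {
		return err
	}
	return resp.Body.Close()
}
```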
@sudo-bmitch this is great news. I can add this in OpenRegistry so that we can test it out. I'm super excited about this 🥳 🎉
A lot of storage providers put restrictions on the minimum size of a single chunk (or part) that can be uploaded via something like a multipart upload. For example, AWS S3 says that in a multipart upload, each part must be at least 5MB, except for the last part, which can be any size. This is true for most storage providers (for good reason).
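As a concrete example of that floor, here's a minimal sketch using the v1 aws-sdk-go (untested on my end, since I don't have an AWS account handy): the `s3manager` package exposes the 5MB minimum as the `MinUploadPartSize` constant, and every part except the last must be at least that large.

```go
package main

import (
	"fmt"

	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3/s3manager"
)

func main() {
	sess := session.Must(session.NewSession())

	// Parts smaller than s3manager.MinUploadPartSize (5MB) are rejected
	// by S3 for every part except the last one.
	uploader := s3manager.NewUploader(sess, func(u *s3manager.Uploader) {
		u.PartSize = s3manager.MinUploadPartSize // 5 * 1024 * 1024 bytes
	})
	fmt.Println(uploader.PartSize)
}
```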
Inside the conformance testing suite, we're currently using fixed-length strings as chunks for the blobs. I was wondering whether setting the size of the blobs via environment variables would be possible. Here's why:
Now, in our (as in `opencontainers/distribution-spec`) current implementation of the conformance suite, tests usually have blobs (which really are just layers) as small as 23 bytes, and those are then chunked into even smaller pieces for chunked layer uploading. This led to a problem for us, since one of our storage providers didn't support a chunk/part smaller than 5MB, and that caused our Push conformance tests to fail. But when I tried pushing the `hello-world` and `busybox` images (two of the smallest container images I know of), they both worked just fine and the image push was successful.

So I finally went and took a look at the conformance `setup.go` file and tweaked it a bit to set the layer/blob size larger than 5MB and distribute it across two chunks, using `math/rand` to fill the layer/blob with random data (I hope that's okay). After this change, the conformance tests started passing.
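Roughly, the tweak looked like this (a minimal sketch; `OCI_BLOB_SIZE` is a made-up variable name for illustration, not anything in the suite):

```go
package main

import (
	"fmt"
	"math/rand"
	"os"
	"strconv"
)

func main() {
	// Hypothetical env var; default to 10MB so each of the two chunks
	// clears a 5MB storage-provider minimum.
	size, err := strconv.Atoi(os.Getenv("OCI_BLOB_SIZE"))
	if err != nil || size <= 0 {
		size = 10 * 1024 * 1024
	}

	// Fill the layer/blob with math/rand data, as described above.
	blob := make([]byte, size)
	rand.Read(blob)

	// Distribute the blob across two chunks for the chunked-upload tests.
	chunkA, chunkB := blob[:size/2], blob[size/2:]
	fmt.Println(len(chunkA), len(chunkB))
}
```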
So this led me to think about two scenarios:

- The conformance suite could allow setting the blob/chunk size via environment variables, so registries backed by storage providers with a minimum part size can still run the tests.
- The spec could say something about minimum chunk sizes, since right now nothing in it allows (or forbids) a registry enforcing one.

Both of these situations seem right to me. But of course, as a registry developer and maintainer, it's easier for me to just set an environment variable and get away with it (I also feel like it should be okay to set this size via an environment variable?)
Please let me know what you guys think about this :)