
Dockerfile using multi-stage build (take #2) #209

Closed
pascalandy opened this issue Feb 1, 2020 · 6 comments
Labels
Request (Request for image modification or feature)

Comments

@pascalandy (Contributor) commented Feb 1, 2020

Following up on this closed issue. After reading this, I feel it would make sense to open a PR.

Let me know if you're interested. Thanks!

@wglambert added the Request label on Feb 3, 2020
@tianon (Member) commented Feb 4, 2020

I'm still confused about what switching to a multi-stage build here brings us. What's the benefit, which we can't get right now, that makes the switch worthwhile?

I'm also not sure which special case in https://github.com/docker-library/faq#multi-stage-builds this is falling into, but I guess that's really a specialization of my first question. 😅

@pascalandy (Contributor, Author) commented Feb 4, 2020

Defining worthwhile is the right question here. Here is one angle:

Is 24% smaller worthwhile? I don't know.

IMHO, at scale, it's a lot of bandwidth for the planet when you add up the billions of pulls across every Docker image (not only Ghost). AWS is happy that we don't multi-stage these images yet.

PS. The images above are for Ghost 2. Why? Because since Ghost v3, the Alpine variant has been breaking.

@yosifkit (Member) commented Feb 5, 2020

We try hard to avoid multi-stage builds in almost every instance. One of the major issues we have with multi-stage builds is below.

On the official images build infrastructure, "we don't have a clean way to preserve the cache for the intermediate stages", so the layers will get deleted when we clean up images that are "ripe". The practical implication of this is that since the build cache for these untagged stages could be deleted at any time, we will end up spending extra time rebuilding them and users will pull "new" images that are unchanged.

(Ref docker-library/official-images#7134 (comment) and docker-library/official-images#5929 (comment))
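To illustrate the problem: in a multi-stage build, only the final stage ends up in the tagged image, so the earlier stages' layers belong to an untagged intermediate image, and routine image cleanup can evict exactly the cache that makes rebuilds cheap. A minimal two-stage sketch (hypothetical stage name `builder`, not the actual ghost Dockerfile):

```dockerfile
# Hypothetical sketch: the "builder" stage is never tagged, so its
# layers are prime candidates for deletion when "ripe" images are pruned.
FROM node:12-alpine AS builder
RUN npm install -g ghost-cli

FROM node:12-alpine
# only this final stage's layers end up in the tagged image
COPY --from=builder /usr/local/lib/node_modules /usr/local/lib/node_modules
```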

The on-disk size is a bit interesting:

REPOSITORY         TAG                                  IMAGE ID       CREATED        SIZE
ghost              2.38-alpine                          4a8ba9a19754   2 weeks ago    322MB
devmtl/ghostfire   2.37.0_2019-11-11_04H05s12_549e5d5   a0be566da8a1   2 months ago   225MB

What I would like to do is see where our image could change without a multi-stage build to improve both on-disk and transport size. I would guess that the biggest wins are not keeping ghost-cli, npm, yarn, and similar, and compressing the node binary with upx. Unfortunately, I don't think we can remove those tools without major breakage. I would be hesitant to add multi-stage just to compress the node binary, and I am wary of its downsides (kubernetes/kubernetes#28265 (comment)).
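For reference, the kind of upx compression being discussed could look like this sketch (hypothetical, not the official image's Dockerfile; assumes the upx package is available from Alpine's community repository):

```dockerfile
# Hypothetical sketch only -- compresses the node binary in a throwaway
# stage, then copies it over the original in the final stage.
FROM node:12-alpine AS compress
RUN apk add --no-cache upx \
 && upx --lzma /usr/local/bin/node

FROM node:12-alpine
# overwrite the uncompressed binary with the compressed one
COPY --from=compress /usr/local/bin/node /usr/local/bin/node
```

This is exactly the "multi-stage just to compress the node binary" trade-off mentioned above: smaller transport size, but the decompression cost and memory behavior at runtime are the suspected downsides.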

@pascalandy (Contributor, Author) commented Feb 5, 2020

Thank you for sharing the big picture here!

This is exactly what my multi-stage build (MSB) allows: an opportunity to save 20MB in transport (or 100MB on disk).

not keeping ghost-cli, npm, yarn, and similar, and compressing the node binary with upx.

Do you mean at the micro-level (this Ghost image) or at the macro-level (official core images like Node)? At the micro-level, I don't see the risk.

Unfortunately, I don't think we can remove those tools without major breakage.

A new Tag?

1) With MSB, we can push multiple Docker images from the same Dockerfile. Idea: what about introducing a new multistage tag?

2) If we go that route, it would allow us to remove many Dockerfiles (yeah, less maintenance!) and have one bigger MS Dockerfile that pushes N Docker images.

Specifically, let's take the example of Node Alpine. All these images could be generated from the same Dockerfile (instead of 8):

  • 10.18.1-alpine3.9, 10.18-alpine3.9, 10-alpine3.9, dubnium-alpine3.9
  • 10.18.1-alpine3.10, 10.18-alpine3.10, 10-alpine3.10, dubnium-alpine3.10
  • 10.18.1-alpine3.11, 10.18-alpine3.11, 10-alpine3.11, dubnium-alpine3.11, 10.18.1-alpine, 10.18-alpine, 10-alpine, dubnium-alpine
  • 12.14.1-alpine3.9, 12.14-alpine3.9, 12-alpine3.9, erbium-alpine3.9, lts-alpine3.9, current-alpine3.9
  • 12.14.1-alpine3.10, 12.14-alpine3.10, 12-alpine3.10, erbium-alpine3.10, lts-alpine3.10, current-alpine3.10
  • 12.14.1-alpine3.11, 12.14-alpine3.11, 12-alpine3.11, erbium-alpine3.11, lts-alpine3.11, current-alpine3.11, 12.14.1-alpine, 12.14-alpine, 12-alpine, erbium-alpine, lts-alpine, current-alpine
  • 13.7.0-alpine3.10, 13.7-alpine3.10, 13-alpine3.10, alpine3.10
  • 13.7.0-alpine3.11, 13.7-alpine3.11, 13-alpine3.11, alpine3.11, 13.7.0-alpine, 13.7-alpine, 13-alpine, alpine

Using ARG and ENV, the maintenance would be fairly easy IMHO.
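A sketch of what such an ARG-driven Dockerfile could look like (hypothetical; the real node Dockerfiles are generated per variant and compile node from source):

```dockerfile
# Hypothetical parameterized Dockerfile -- one file, many tags.
# ARG before FROM is supported since Docker 17.05.
ARG ALPINE_VERSION=3.11
FROM alpine:${ALPINE_VERSION}
ARG NODE_VERSION=12.14.1
# the real build steps would download/compile and install node here
RUN echo "building node ${NODE_VERSION} on alpine ${ALPINE_VERSION}"
```

Each tag would then be produced with build arguments, e.g. `docker build --build-arg ALPINE_VERSION=3.10 --build-arg NODE_VERSION=10.18.1 -t node:10.18.1-alpine3.10 .`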

Caching

since the build cache for these untagged stages could be deleted at any time, we will end up spending extra time rebuilding them, and users will pull "new" images that are unchanged.

I thought caching worked out of the box. If it doesn't, my whole argument doesn't hold :-/

Conclusion

So in the end, I just want to share the work I did for the multi-stage build on Ghost :) I certainly don't want to challenge you on a topic like this. I know you thought carefully about the process.

I know that you @yosifkit & @tianon have been taking care of these Docker images for years. I want to thank you; you are doing such a good job here. Cheers!

@pascalandy (Contributor, Author) commented Aug 11, 2020

For those who are curious, I feel this is one way to manage an MS Dockerfile.

@pascalandy (Contributor, Author) commented
Closing this as it has no value for this repo. If you are curious, this is how we do it at FirePress.
