Merge pull request #3013 from jedevc/dev-docs

Enhanced developer documentation
moby · Sep 2, 2022 · 6d9b617
2 parents 9114527 + 07357d6
commit 6d9b617
Showing 8 changed files with 1,516 additions and 580 deletions.
diff --git a/docs/dev/README.md b/docs/dev/README.md
@@ -0,0 +1,48 @@
+# BuildKit Developer Docs
+
+These are the BuildKit developer docs, designed to be read by technical users
+interested in contributing to or integrating with BuildKit.
+
+> **Warning**
+>
+> While these docs attempt to keep up with the current state of our `master`
+> development branch, the code changes constantly as bugs are fixed and
+> features are added. Remember, the ultimate source of truth is always the
+> code base.
+
+## Jargon
+
+The following terms are often used throughout the codebase and the developer
+documentation to describe different components and processes in the image build
+process.
+
+| Name | Description |
+| :--- | :---------- |
+| **LLB** | LLB stands for low-level build definition, which is a binary intermediate format used for defining the dependency graph for processes running as part of your build. |
+| **Definition** | Definition is the LLB serialized using protocol buffers. This is the protobuf type that is transported over the gRPC interfaces. |
+| **Frontend** | Frontends are builders of LLB and may issue requests to BuildKit’s gRPC server, such as requests to solve graphs. Currently only the `dockerfile.v0` and `gateway.v0` frontends are implemented, but the gateway frontend allows running container images that function as frontends. |
+| **State** | State is a helper object to build LLBs from higher level concepts like images, shell executions, mounts, etc. Frontends use the state API in order to build LLBs and marshal them into the definition. |
+| **Solver** | Solver is an abstract interface to solve a graph of vertices and edges to find the final result. An LLB solver is a solver that understands that vertices are implemented by container-based operations, and that edges map to container-snapshot results. |
+| **Vertex** | Vertex is a node in a build graph. It defines an interface for a content addressable operation and its inputs. |
+| **Op** | Op defines how the solver can evaluate the properties of a vertex operation. An op is retrieved from a vertex and executed in the worker. For example, there are op implementations for image sources, git sources, exec processes, etc. |
+| **Edge** | Edge is a connection point between vertices. An edge references a specific output of a vertex’s operation. Edges are used as inputs to other vertices. |
+| **Result** | Result is an abstract interface for the return value of a solve. In LLB, the result is a generic interface over a container snapshot. |
+| **Worker** | Worker is a backend that can run OCI images. Currently, BuildKit can run with workers using either runc or containerd. |
+
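The relationship between vertices, edges, and content addressing in the table above can be pictured with a small standalone sketch. This is not BuildKit's implementation: the `toyVertex` and `toyEdge` types and the hashing scheme are assumptions made purely for illustration. It only shows the idea that a vertex's identity depends on its operation and on the identity of every input it consumes.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// toyVertex is a hypothetical stand-in for a build-graph vertex: an operation
// plus the edges (outputs of other vertices) that it consumes.
type toyVertex struct {
	op     string    // description of the operation, e.g. "exec: echo hello"
	inputs []toyEdge // edges used as inputs to this vertex
}

// toyEdge references one specific output of a vertex.
type toyEdge struct {
	vertex *toyVertex
	output int
}

// digest derives a content address from the operation and from the digests of
// all inputs, so changing anything upstream changes this vertex's address.
// The exact encoding is an assumption of this sketch.
func (v *toyVertex) digest() string {
	var b strings.Builder
	b.WriteString(v.op)
	for _, in := range v.inputs {
		fmt.Fprintf(&b, "|%s#%d", in.vertex.digest(), in.output)
	}
	sum := sha256.Sum256([]byte(b.String()))
	return hex.EncodeToString(sum[:])
}

func main() {
	go112 := &toyVertex{op: "source: golang:1.12"}
	go113 := &toyVertex{op: "source: golang:1.13"}

	runA := &toyVertex{op: "exec: echo hello", inputs: []toyEdge{{vertex: go112}}}
	runB := &toyVertex{op: "exec: echo hello", inputs: []toyEdge{{vertex: go112}}}
	runC := &toyVertex{op: "exec: echo hello", inputs: []toyEdge{{vertex: go113}}}

	fmt.Println(runA.digest() == runB.digest()) // true: same op, same input
	fmt.Println(runA.digest() == runC.digest()) // false: same op, different input
}
```

In this toy model, two vertices with identical operations and inputs share an address, which is the property that makes results reusable across builds.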
+## Table of Contents
+
+The developer documentation is split across various files.
+
+For an overview of the process of building images:
+
+- [Request lifecycle](./request-lifecycle.md) - observe how incoming requests
+  are solved to produce a final artifact.
+- [Dockerfile to LLB](./dockerfile-llb.md) - understand how `Dockerfile`
+  instructions are converted to the LLB format.
+- [Solver](./solver.md) - understand how LLB is evaluated by the solver to
+  produce the solve graph.
+
+We also have a number of more specific guides:
+
+- [MergeOp and DiffOp](./merge-diff.md) - learn how MergeOp and DiffOp are
+  implemented, and how to program with them in LLB.
diff --git a/docs/dev/dockerfile-llb.md b/docs/dev/dockerfile-llb.md
@@ -0,0 +1,212 @@
+# Dockerfile conversion to LLB
+
+If you want to understand how BuildKit translates Dockerfile instructions into
+LLB, or you want to write your own frontend, then seeing how a Dockerfile maps
+onto the BuildKit `llb` package will give you a jump start.
+
+The `llb` package from BuildKit provides a chainable state object to help
+construct an LLB. You can then marshal the state object into a definition using
+protocol buffers, and send it off in a solve request over gRPC.
+
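To make the "chainable state, then marshal" pattern concrete, here is a minimal standalone sketch. It does not use the real `llb` package: `toyState`, `toyImage`, `WithDir`, and the JSON encoding are all hypothetical stand-ins (BuildKit marshals to protocol buffers, not JSON). The point it illustrates is that each chained call returns a new value and leaves the original untouched, which is why the snippets in this document reassign the result, as in `st = st.AddEnv(...)`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toyState is a hypothetical, heavily simplified stand-in for a chainable
// state object. Methods use value receivers and return a modified copy.
type toyState struct {
	Image string   `json:"image"`
	Env   []string `json:"env,omitempty"`
	Dir   string   `json:"dir,omitempty"`
}

// toyImage starts a chain from a base image reference.
func toyImage(ref string) toyState { return toyState{Image: ref} }

// AddEnv returns a copy with one more variable; the receiver is untouched.
func (s toyState) AddEnv(key, value string) toyState {
	env := make([]string, len(s.Env), len(s.Env)+1)
	copy(env, s.Env)
	s.Env = append(env, key+"="+value)
	return s
}

// WithDir returns a copy with the working directory set.
func (s toyState) WithDir(dir string) toyState {
	s.Dir = dir
	return s
}

// Marshal serializes the state. JSON is used here only as a stand-in for the
// protobuf definition that a real frontend would send in a solve request.
func (s toyState) Marshal() ([]byte, error) { return json.Marshal(s) }

func main() {
	base := toyImage("golang:1.12")
	st := base.AddEnv("DEBIAN_FRONTEND", "noninteractive").WithDir("/path")

	def, err := st.Marshal()
	if err != nil {
		panic(err)
	}
	fmt.Println(string(def))
	fmt.Println(len(base.Env)) // prints 0: base was not modified by the chain
}
```

Because every step yields a fresh value, a base state such as `base` can be shared between several branches of a build without one branch affecting another.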
+In code, these transformations are performed by the [`Dockerfile2LLB()`](../../frontend/dockerfile/dockerfile2llb/convert.go)
+function, which takes a raw `Dockerfile`'s contents and converts it to an LLB
+state, and associated image config, which are then both assembled in the
+[`Build()`](../../frontend/dockerfile/builder/build.go) function.
+
+## Basic examples
+
+Here are a few Dockerfile instructions you should be familiar with:
+
+- Base image
+
+  ```dockerfile
+  FROM golang:1.12
+  ```
+
+  ```golang
+  st := llb.Image("golang:1.12")
+  ```
+
+- Scratch image
+
+  ```dockerfile
+  FROM scratch
+  ```
+
+  ```golang
+  st := llb.Scratch()
+  ```
+
+- Environment variables
+
+  ```dockerfile
+  ENV DEBIAN_FRONTEND=noninteractive
+  ```
+
+  ```golang
+  st = st.AddEnv("DEBIAN_FRONTEND", "noninteractive")
+  ```
+
+- Running programs
+
+  ```dockerfile
+  RUN echo hello
+  ```
+
+  ```golang
+  st = st.Run(
+    llb.Shlex("echo hello"),
+  ).Root()
+  ```
+
+- Working directory
+
+  ```dockerfile
+  WORKDIR /path
+  ```
+
+  ```golang
+  st = st.Dir("/path")
+  ```
+
+## File operations
+
+This is where LLB starts to deviate from Dockerfile in features. In a
+Dockerfile, a `RUN` command is completely opaque to the builder, which simply
+executes it. LLB, however, has dedicated file operations that the builder
+understands and that have better caching semantics:
+
+- Copying files
+
+  ```dockerfile
+  COPY --from=builder /files/* /files
+  ```
+
+  ```golang
+  var CopyOptions = &llb.CopyInfo{
+    FollowSymlinks:      true,
+    CopyDirContentsOnly: true,
+    AttemptUnpack:       false,
+    CreateDestPath:      true,
+    AllowWildcard:       true,
+    AllowEmptyWildcard:  true,
+  }
+  st = st.File(
+    llb.Copy(builder, "/files/*", "/files", CopyOptions),
+  )
+  ```
+
+- Adding files
+
+  ```dockerfile
+  ADD --from=builder /files.tgz /files
+  ```
+
+  ```golang
+  var AddOptions = &llb.CopyInfo{
+    FollowSymlinks:      true,
+    CopyDirContentsOnly: true,
+    AttemptUnpack:       true,
+    CreateDestPath:      true,
+    AllowWildcard:       true,
+    AllowEmptyWildcard:  true,
+  }
+  st = st.File(
+    llb.Copy(builder, "/files.tgz", "/files", AddOptions),
+  )
+  ```
+
+- Chaining file commands
+
+  ```dockerfile
+  # not possible without RUN in Dockerfile
+  RUN mkdir -p /some && echo hello > /some/file
+  ```
+
+  ```golang
+  st = st.File(
+    llb.Mkdir("/some", 0755),
+  ).File(
+    // Mkfile takes the file contents as a byte slice
+    llb.Mkfile("/some/file", 0644, []byte("hello")),
+  )
+  ```
+
+## Bind mounts
+
+Bind mounts allow unidirectional syncing of the host's local file system into
+the build environment.
+
+Bind mounts in BuildKit should not be confused with bind mounts in the Linux
+kernel: they do not sync bidirectionally. A bind mount is only a snapshot of
+your local state, which is specified through the `llb.Local` state object:
+
+- Using bind mounts
+
+  ```dockerfile
+  WORKDIR /builder
+  RUN --mount=type=bind,target=/builder \
+      PIP_INDEX_URL=https://my-proxy.com/pypi \
+      pip install .
+  ```
+
+  ```golang
+  localState := llb.Local(
+    "context",
+    llb.SessionID(client.BuildOpts().SessionID),
+    llb.WithCustomName("loading ."),
+    llb.FollowPaths([]string{"."}),
+  )
+
+  execState := st.Dir("/builder").Run(
+    llb.Shlex("pip install ."),
+    llb.AddEnv(
+      "PIP_INDEX_URL",
+      "https://my-proxy.com/pypi",
+    ),
+  )
+  // the return value of AddMount captures the resulting state of the mount
+  // after the exec operation has completed; it is discarded here
+  _ = execState.AddMount("/builder", localState)
+
+  st = execState.Root()
+  ```
+
+## Cache mounts
+
+Cache mounts provide a file cache location that is shared between build
+invocations, which allows manually caching expensive operations, such as
+package downloads. Mounts have options to persist between builds with
+different sharing modes.
+
+- Using cache mounts
+
+  ```dockerfile
+  RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
+      --mount=type=cache,target=/var/lib/apt \
+      apt-get update
+  ```
+
+  ```golang
+  var VarCacheAptMount = llb.AddMount(
+    "/var/cache/apt",
+    llb.Scratch(),
+    llb.AsPersistentCacheDir(
+      "some-cache-id",
+      llb.CacheMountLocked,
+    ),
+  )
+
+  var VarLibAptMount = llb.AddMount(
+    "/var/lib/apt",
+    llb.Scratch(),
+    llb.AsPersistentCacheDir(
+      "another-cache-id",
+      llb.CacheMountShared,
+    ),
+  )
+
+  st = st.Run(
+    llb.Shlex("apt-get update"),
+    VarCacheAptMount,
+    VarLibAptMount,
+  ).Root()
+  ```