Skip to content

Commit

Permalink
Updating the sum database documentation (gomods#1494)
Browse files Browse the repository at this point in the history
* Updating the sum database documentation
  • Loading branch information
arschles authored Feb 21, 2020
1 parent ed703ee commit 9624953
Showing 1 changed file with 41 additions and 23 deletions.
64 changes: 41 additions & 23 deletions docs/content/configuration/sumdb.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,48 +4,66 @@ description: How to configure Athens to proxy a checksum database API, and why y
weight: 4
---

## Proxying A Checksum DB
If you run `go get github.com/mycompany/[email protected]` and that module version is not yet in your `go.sum` file, Go will by default send a request to `https://sum.golang.org/lookup/github.com/mycompany/[email protected]`. That request will fail because the Go tool requires a checksum, but `sum.golang.org` doesn't have access to your private code.

The Athens Proxy has the ability to proxy a Checksum Database as defined by [this proposal](https://go.googlesource.com/proposal/+/master/design/25530-sumdb.md) by the Go team.
The result is that *(1) your build will fail*, and *(2) your private module names have been sent over the internet to an opaque public server that you don't control*.

Athens by default will accept proxying `https://sum.golang.org`. However, if you'd like to override that behavior or proxy more Checksum DBs you can do so through the `SumDBs` config or its equivalent Environment Variable: `ATHENS_SUM_DBS`.
>You can read more about this `sum.golang.org` service [here](https://go.googlesource.com/proposal/+/master/design/25530-sumdb.md)
So for example, if you run the following command:
## Proxying a checksum DB

```bash
GOPROXY=<athens-url> go build
```
Many companies use Athens to host their private code, but Athens is not only a module proxy. It's also a checksum database proxy. That means that anyone inside of your company can configure `go` to send these checksum requests to Athens instead of the public `sum.golang.org` server.

The Go command will proxy requests to `sum.golang.org` like this: `<athens-url>/sumdb/sum.golang.org`. Feel free to read the linked proposal above for the exact requests that makes Athens successfully proxy Checksum DB APIs.
If the Athens server is configured with checksum filters, then you can prevent these problems.

Note that as of this documentation (May 2019), you need to explicitly set `GOSUMDB=https://sum.golang.org`, but the Go team is planning on enabling this by default.
If you run the below command using Go 1.13 or later:

### Why Proxy a Checksum DB?
```bash
$ GOPROXY=<athens-url> go build .
```

This is quite important. Say you are a company that is running an Athens instance, and you don't want the world to know about where your
repositories live. For example, say you have a private repo under `github.com/mycompany/secret-repo`. In order to ensure that the Go client
does not send a request to `https://sum.golang.org/lookup/github.com/mycompany/[email protected]` and therefore leaking your private import path to the public, you need to ensure that you tell Go to skip particular import paths by setting the `GONOSUMDB` environment variable:
... then the Go tool will automatically send all checksum requests to `<athens-url>/sumdb/sum.golang.org` instead of `https://sum.golang.org`.

```
GONOSUMDB=github.com/mycompany/* go build
```
By default, when Athens receives a `/sumdb/...` request, it automatically proxies it to `https://sum.golang.org`, even if it's a private module that `sum.golang.org` doesn't and can't know about. So if you are working with private modules, you'll want to change the default behavior.

>If you want Athens to _not_ send some module names up to the global checksum database, set those module names in the `NoSumPatterns` value in `config.toml` or using the `ATHENS_GONOSUM_PATTERNS` environment variable.
This will make sure that Go does not send any requests to the Checksum DB for your private import paths. However, how can you ensure that all of your employees are building private code with the right configuration?
The following sections will go into more detail on how checksum databases work, how Athens fits in, and how this all impacts your workflow.

Athens, in this case can help ensure that all private code flowing through it is never checked against the checksum DB. That means that as long as your employees are using Athens, then they will get a helpful reminder to ensure Their GONOSUMDB is rightly configured.
## How to set this all up

Athens allows the operator running the server to specify a list of patterns to not lookup via the upstream sum provider:
Before you begin, you'll need to run Athens with configuration values that tell it to not proxy certain modules. If you're using `config.toml`, use this configuration:

```toml
NoSumPatterns = ["github.com/mycompany/*", "github.com/secret/*"]
```

Or you can do it with an environment variable:
And if you're using an environment variable, use this configuration:

```console
```bash
$ export ATHENS_GONOSUM_PATTERNS="github.com/mycompany/*,github.com/secret/*"
```

>In both of the above configuration examples, there are two patterns specified, and they are separated by a comma (`,`) in both cases. When you are using the environment variable (`export ATHENS_GONOSUM_PATTERNS`), make sure you don't use brackes (`[` and `]`).
>You can use any string compatible with [`path.Match`](https://pkg.go.dev/path?tab=doc#Match) in these environment variables
After you start Athens up with this configuration, all checksum requests for modules that start with `github.com/mycompany` or `github.com/secret` will not be forwarded, and Athens will return an error to the `go` CLI tool.

This behavior will ensure that none of your private module names leak to the public internet, but your builds will still fail. To fix that problem, set another environment variable on your machine (that you run your `go` commands)

```bash
$ export GONOSUMDB="github.com/mycompany/*,github.com/secret/*"
```

Now, your builds will work and you won't be sending information about your private codebase to the internet.

## I'm confused, why is this hard?

When the Go tool has to download _new_ code that isn't currently in the project's `go.sum` file, it tries its hardest to get a checksum from a server it trusts, and compare it to the checksum in the actual code it downloads. It does all of this to ensure _provenance_. That is, to ensure that the code you just downloaded wasn't tampered with.

The trusted checksums are all stored in `sum.golang.org`, and that server is centrally controlled.

>These build failures and potential privacy leaks can only happen when you try to get a module version that is _not_ already in your `go.sum` file.
Athens does its best to respect and use the trusted checksums while also ensuring that your private names don't get leaked to the public server. In some cases, it has to choose whether to fail your build or leak information, so it chooses to fail your build. That's why everybody using that Athens server needs to set up their `GONOSUMDB` environment variable.

Either way, this configuration will ensure that when Go sends a request to `<athens-url>/sumdb/sum.golang.org/github.com/mycompany/[email protected]`, Athens will return a 403 and failing the build ensuring that the client knows something is not configured correctly and also never leaking those import paths.
We believe that along with good documentation - which we hope this is! - we have struck the right balance between convenience and privacy.

0 comments on commit 9624953

Please sign in to comment.