Split specification.md into three separate files. #23

Merged
merged 3 commits into from
Mar 23, 2021
27 changes: 22 additions & 5 deletions README.md
@@ -2,20 +2,37 @@

Simple, foolproof standard for signing arbitrary data.

* Why not [JOSE/JWS/JWT](https://jwt.io)? JSON-specific, too complicated, too
easy to mess up.
* Why not [PASETO](https://paseto.io)? JSON-specific, too opinionated.
## Features

* Supports arbitrary message encodings, not just JSON.
* Authenticates the message *and* the type to avoid confusion attacks.
* Avoids canonicalization to reduce attack surface.
* Allows any desired crypto primitives or libraries.

See [Background](background.md) for more information, including design
considerations and rationale.

## What is it?

* [Signature protocol](specification.md)
* [Data structure](specification.md) for storing the message and signatures
Specifications for:

* [Protocol](protocol.md) (*required*)
* [Data structure](envelope.md), a.k.a. "Envelope" (*recommended*)

Review comment (Member):

💯 for "Envelope" instead of "Wrapper". The latter always had a different connotation for me.

* (pending #9) Suggested crypto primitives

Out of scope (for now at least):

* Key management / PKI

## Why not...?

* Why not raw signatures? Too fragile.
* Why not [JOSE/JWS/JWT](https://jwt.io)? JSON-specific, too complicated, too
easy to mess up.
* Why not [PASETO](https://paseto.io)? JSON-specific, too opinionated.

See [Background](background.md) for further motivation.

## Who uses it?

* [in-toto](https://in-toto.io) (pending [ITE-5](https://github.com/in-toto/ITE/pull/13))
167 changes: 167 additions & 0 deletions background.md
@@ -0,0 +1,167 @@
# Background

## What is the intended use case?

This can be used anywhere digital signatures are needed.

The initial application is for signing software supply chain metadata in [TUF]
and [in-toto].

## Why do we need this?

There is no other simple, foolproof signature scheme that we are aware of.

* Raw signatures are too fragile. Every public key must be used for exactly one
purpose over exactly one message type, lest the system be vulnerable to
[confusion attacks](#motivation). In many cases, this results in a difficult
key management problem.

* [TUF] and [in-toto] currently use a scheme that avoids these problems but is
JSON-specific and relies on [canonicalization](#motivation), which adds an
unnecessarily large attack surface.

* [JWS] is JSON-specific, complicated, and error-prone.

* [PASETO] is JSON-specific and too opinionated. For example, it mandates
ed25519 signatures, which may not be useful in all cases.

The intent of this project is to define a minimal signature scheme that avoids
these issues.

## Design requirements

The [protocol](protocol.md):

* MUST reduce the possibility of a client misinterpreting the payload (e.g.
interpreting a JSON message as protobuf)
* MUST support arbitrary payload types (e.g. not just JSON)
* MUST support arbitrary crypto primitives, libraries, and key management
systems (e.g. Tink vs openssl, Google KMS vs Amazon KMS)
* SHOULD avoid depending on canonicalization for security
* SHOULD NOT require unnecessary encoding (e.g. base64)
* SHOULD NOT require the verifier to parse the payload before verifying

The [data structure](envelope.md):

* MUST include both message and signature(s)
* NOTE: Detached signatures are supported by having the included message
contain a cryptographic hash of the external data.
* MUST support multiple signatures in one structure / file
* SHOULD discourage users from reading the payload without verifying the
signatures
* SHOULD be easy to parse using common libraries (e.g. JSON)
* SHOULD support a hint indicating what signing key was used
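
The note on detached signatures in the list above can be sketched as follows. This is a minimal illustration, not part of the specification: the JSON layout and the field names `name` and `sha256` are hypothetical, and SHA-256 is an assumed choice of hash.

```python
import hashlib
import json


def detached_payload(external_data: bytes, name: str) -> bytes:
    """Build a small message that stands in for external_data.

    The field names ("name", "sha256") are hypothetical, chosen only
    for illustration; the spec does not define this layout.
    """
    digest = hashlib.sha256(external_data).hexdigest()
    return json.dumps({"name": name, "sha256": digest}, sort_keys=True).encode("utf-8")


def matches(payload: bytes, external_data: bytes) -> bool:
    """Check external data against a payload whose signature was already verified."""
    expected = json.loads(payload)["sha256"]
    return hashlib.sha256(external_data).hexdigest() == expected
```

The payload returned by `detached_payload` is what would be signed and stored in the envelope; the external data travels separately and is checked with `matches` only after the signature has been verified.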

## Motivation

There are two concerns with the current [in-toto]/[TUF] signature envelope.

First, the signature scheme depends on [Canonical JSON], which has one practical
problem and two theoretical ones:

1. Practical problem: It requires the payload to be JSON or convertible to
JSON. While this happens to be true of in-toto and TUF today, a generic
signature layer should be able to handle arbitrary payloads.
1. Theoretical problem 1: Two semantically different payloads could have the
same canonical encoding. Although there are currently no known attacks on
Canonical JSON, there have been attacks in the past on other
canonicalization schemes
([example](https://latacora.micro.blog/2019/07/24/how-not-to.html#canonicalization)).
It is safer to avoid canonicalization altogether.
1. Theoretical problem 2: It requires the verifier to parse the payload before
verifying, which is both error-prone—too easy to forget to verify—and an
unnecessarily increased attack surface.

The preferred solution is to transmit the encoded byte stream exactly as it was
signed, which the verifier verifies before parsing. This is what is done in
[JWS] and [PASETO], for example.
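
A minimal sketch of the verify-before-parse order described above. It assumes a shared-secret HMAC-SHA256 as a stand-in for a real signature primitive, only to stay within the Python standard library, and it leaves out the `payloadType` handling discussed next. The function names are hypothetical.

```python
import hashlib
import hmac
import json


def sign(key: bytes, payload: bytes) -> bytes:
    # Stand-in primitive (assumption): HMAC-SHA256 instead of a real signature.
    return hmac.new(key, payload, hashlib.sha256).digest()


def verify_then_parse(key: bytes, payload: bytes, sig: bytes) -> dict:
    # Verify the exact bytes that were transmitted; no canonicalization step.
    if not hmac.compare_digest(sign(key, payload), sig):
        raise ValueError("signature verification failed")
    # Only now is the payload handed to a parser.
    return json.loads(payload)
```

Because the signature covers the transmitted bytes, a re-serialized payload with the same JSON meaning but different whitespace fails verification, which is the intended behavior when no canonical form is defined.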

Second, the scheme does not include an authenticated "context" indicator to
ensure that the signer and verifier interpret the payload in the same exact way.
For example, if in-toto were extended to support CBOR and Protobuf encoding, the
signer could get a CI/CD system to produce a CBOR message saying X and then a
verifier to interpret it as a protobuf message saying Y. While we don't know of
an exploitable attack on in-toto or TUF today, potential changes could introduce
such a vulnerability. The signature scheme should be resilient against these
classes of attacks. See [example attack](hypothetical_signature_attack.ipynb)
for more details.

## Reasoning

Our goal was to create a signature envelope that is as simple and foolproof as
possible. Alternatives such as [JWS] are extremely complex and error-prone,
while others such as [PASETO] are overly specific. (Both are also
JSON-specific.) We believe our proposal strikes the right balance of simplicity,
usefulness, and security.

Rationales for specific decisions:

- Why use base64 for payload and sig?

- Because JSON strings do not allow binary data, so we need to either
encode the data or escape it. Base64 is a standard, reasonably
space-efficient way of doing so. Protocols that have a first-class
concept of "bytes", such as protobuf or CBOR, do not need to use base64.

- Why sign raw bytes rather than base64 encoded bytes (as per JWS)?

- Because it's simpler. Base64 is only needed for putting binary data in a
text field, such as JSON. In other formats, such as protobuf or CBOR,
base64 isn't needed at all.

- Why does payloadType need to be signed?

- See [Motivation](#motivation).

- Why use PAE?

- Because we need an unambiguous way of serializing two fields,
payloadType and payload. PAE is already documented and good enough. No
need to reinvent the wheel.

- Why use a URI for payloadType rather than
[Media Type](https://www.iana.org/assignments/media-types/media-types.xhtml)
(a.k.a. MIME type)?

- Because Media Type only indicates how to parse but does not indicate
purpose, schema, or versioning. If it were just "application/json", for
example, then every application would need to impose some "type" field
within the payload, lest we have similar vulnerabilities as if
payloadType were not signed.
- Also, URIs don't need to be registered while Media Types do.

- Why not stay backwards compatible by requiring the payload to always be JSON
with a "_type" field? Then if you want a non-JSON payload, you could simply
have a field that contains the real payload, e.g. `{"_type":"my-thing",
"value":"base64…"}`.

1. It encourages users to add a "_type" field to their payload, which in
turn:
- (a) Ties the payload type to the authentication type. Ideally the
two would be independent.
- (b) May conflict with other uses of that same field.
- (c) May require the user to specify type multiple times with
different field names, e.g. with "@context" for
[JSON-LD](https://json-ld.org/).
2. It would incur double base64 encoding overhead for non-JSON payloads.
3. It is more complex than PAE.
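
To illustrate the PAE rationale above, here is a sketch of an unambiguous two-field serialization. The exact definition used by this spec lives in [Protocol](protocol.md) and is not reproduced here; the code assumes PASETO's Pre-Authentication Encoding (a piece count, then each piece prefixed by its length, all as 64-bit little-endian integers). The helper name `signed_bytes` is hypothetical.

```python
import struct
from typing import Sequence


def le64(n: int) -> bytes:
    # Unsigned 64-bit little-endian; PASETO clears the top bit for portability.
    return struct.pack("<Q", n & 0x7FFFFFFFFFFFFFFF)


def pae(pieces: Sequence[bytes]) -> bytes:
    # Piece count, then each piece prefixed by its own length.
    out = le64(len(pieces))
    for piece in pieces:
        out += le64(len(piece)) + piece
    return out


def signed_bytes(payload_type: str, payload: bytes) -> bytes:
    # Hypothetical helper: the byte string that would be passed to the signer.
    return pae([payload_type.encode("utf-8"), payload])
```

The length prefixes are what make the encoding unambiguous: `signed_bytes("a", b"bc")` and `signed_bytes("ab", b"c")` differ, whereas plain concatenation would produce `abc` for both.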

## Backwards Compatibility

Backwards compatibility with the old [in-toto]/[TUF] format will be handled by
the application and explained in the corresponding application-specific change
proposal, namely [ITE-5](https://github.com/in-toto/ITE/pull/13) for in-toto and
via the principles laid out in
[TAP-14](https://github.com/theupdateframework/taps/blob/master/tap14.md) for
TUF.

Verifiers can differentiate between the
[old](https://github.com/in-toto/docs/blob/master/in-toto-spec.md#42-file-formats-general-principles)
and new envelope format by detecting the presence of the `payload` field (new
format) vs `signed` field (old format).
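
The field-based detection above could look like the following sketch. The function name and the decision to reject documents that contain both fields or neither are assumptions made for illustration; the text only specifies which field marks which format.

```python
import json


def envelope_format(raw: bytes) -> str:
    """Return "new" or "old" depending on which top-level field is present."""
    doc = json.loads(raw)
    if not isinstance(doc, dict):
        raise ValueError("envelope must be a JSON object")
    has_new = "payload" in doc
    has_old = "signed" in doc
    if has_new and not has_old:
        return "new"
    if has_old and not has_new:
        return "old"
    # Assumption: treat ambiguous or unrecognized documents as errors.
    raise ValueError("cannot determine envelope format")
```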

[Canonical JSON]: http://wiki.laptop.org/go/Canonical_JSON
[in-toto]: https://in-toto.io
[JWS]: https://tools.ietf.org/html/rfc7515
[PASETO]: https://github.com/paragonie/paseto/blob/master/docs/01-Protocol-Versions/Version2.md#sig
[TUF]: https://theupdateframework.io
66 changes: 66 additions & 0 deletions envelope.md
@@ -0,0 +1,66 @@
# signing-spec Envelope

March 03, 2021

Version 0.1.0

This document describes the recommended data structure for storing signing-spec
signatures, which we call the "JSON Envelope". For the protocol/algorithm, see
[Protocol](protocol.md).

## Standard JSON envelope

The standard data structure for storing a signed message is a JSON message of
the following form, called the "JSON envelope":

```json
{
"payload": "<Base64(SERIALIZED_BODY)>",
"payloadType": "<PAYLOAD_TYPE>",
"signatures": [{
"keyid": "<KEYID>",
"sig": "<Base64(SIGNATURE)>"
}]
}
```

See [Protocol](protocol.md) for a definition of parameters and functions.

Empty fields may be omitted. [Multiple signatures](#multiple-signatures) are
allowed.
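
As a rough sketch, the envelope above could be assembled like this. The function name and the `(keyid, signature)` tuple input are hypothetical, and standard (not URL-safe) Base64 is an arbitrary choice. The signature bytes are assumed to have been produced already, as described in [Protocol](protocol.md).

```python
import base64
import json
from typing import Iterable, Tuple


def build_envelope(payload: bytes, payload_type: str,
                   signatures: Iterable[Tuple[str, bytes]]) -> str:
    """Serialize a payload and its precomputed signatures as a JSON envelope."""
    sigs = []
    for keyid, sig in signatures:
        entry = {"sig": base64.b64encode(sig).decode("ascii")}
        if keyid:  # empty fields may be omitted
            entry["keyid"] = keyid
        sigs.append(entry)
    envelope = {
        "payload": base64.b64encode(payload).decode("ascii"),
        "payloadType": payload_type,
        "signatures": sigs,
    }
    return json.dumps(envelope)
```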

Base64() is [Base64 encoding](https://tools.ietf.org/html/rfc4648), transforming
a byte sequence to a Unicode string. Either the standard or the URL-safe
alphabet is allowed.
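
Since either alphabet is allowed, a reader of the envelope has to accept both. One possible approach, shown as a sketch with a hypothetical function name, is to map the URL-safe characters onto the standard alphabet before decoding. Tolerating missing padding is an extra assumption, not something the text requires.

```python
import base64


def b64decode_any(text: str) -> bytes:
    """Decode standard or URL-safe Base64."""
    # Map the URL-safe alphabet ("-", "_") onto the standard one ("+", "/").
    normalized = text.replace("-", "+").replace("_", "/")
    # Assumption: tolerate omitted padding by restoring it.
    normalized += "=" * (-len(normalized) % 4)
    # validate=True rejects characters outside the Base64 alphabet.
    return base64.b64decode(normalized, validate=True)
```

Note that this sketch also accepts input that mixes the two alphabets; a stricter reader could reject that.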

### Multiple signatures

An envelope may have more than one signature, which is equivalent to separate
envelopes with individual signatures.

```json
{
"payload": "<Base64(SERIALIZED_BODY)>",
"payloadType": "<PAYLOAD_TYPE>",
"signatures": [{
"keyid": "<KEYID_1>",
"sig": "<SIG_1>"
}, {
"keyid": "<KEYID_2>",
"sig": "<SIG_2>"
}]
}
```
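
The stated equivalence can be made concrete with a small sketch that splits one multi-signature envelope into single-signature envelopes. The function name is hypothetical and the envelope is assumed to be already parsed into a Python `dict`.

```python
def split_signatures(envelope: dict) -> list:
    """Return one single-signature envelope per entry in "signatures"."""
    # Everything except the signature list is shared by all resulting envelopes.
    base = {k: v for k, v in envelope.items() if k != "signatures"}
    return [dict(base, signatures=[sig]) for sig in envelope.get("signatures", [])]
```

Each resulting envelope carries the same `payload` and `payloadType`, so each signature can be verified independently of the others.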

## Other data structures

The standard envelope is a JSON message with an explicit `payloadType`.
Optionally, applications may encode the signed message in other ways without
invalidating the signature:

- An encoding other than JSON, such as CBOR or Protobuf.
- A default `payloadType` that applies when the field is omitted, and/or a
`payloadType` encoded as a shorter string or enum.

At this point we do not standardize any other encoding. If a need arises, we may
do so in the future.