Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ByteString type to support any CBOR byte string #376

Merged
merged 5 commits into from
Dec 29, 2022

Conversation

fxamacker
Copy link
Owner

@fxamacker fxamacker commented Dec 29, 2022

Problem

By default, CBOR byte string is decoded as []byte, but Go doesn't allow []byte as map keys. Given this, decoding a CBOR map with byte string keys produced "invalid map key type" error.

Many thanks to @agaffney for his patience after raising this issue and for opening PR #319 which is included in this PR with permission.

Closes #312

Solution

This PR defines ByteString type to support CBOR byte string (major type 2) without being limited to map keys.

ByteString can be used when using a []byte is not possible or convenient. For example, Go doesn't allow []byte as map key, so ByteString can be used to support data formats having CBOR map with byte string keys. Another use for ByteString is to encode invalid UTF-8 string as CBOR byte string.

By default, this PR allows decoding CBOR maps with keys of type:

  • byte string
  • tagged byte string
  • nested tagged byte string

When decoding CBOR maps, the above CBOR key types can be decoded to:

  • empty Go interface value
  • Go map with key of empty interface type

Changes

  • Added ByteString with underlying string type.
  • Implemented cbor.Marshaller and cbor.Unmarshaller for ByteString.
  • Added decoding options: MapKeyByteStringForbidden and MapKeyByteStringAllowed.
  • Added support for decoding CBOR map keys of type: byte string, tagged byte string, and nested tagged byte string
  • Added InvalidMapKeyTypeError.
  • Added tests for ByteString.

Caveats

⚠️ By default, this PR enables decoding CBOR maps with byte string keys. To enable the old behavior of returning decoding error, use the option MapKeyByteStringForbidden.

agaffney and others added 5 commits December 27, 2022 21:02
This allows the parsing of CBOR data containing maps with byte string
keys, which is commonly seen in Cardano blockchain data. The default
behavior remains the legacy behavior (throw an error) for backward
compatibility.

Signed-off-by: Andrew Gaffney <[email protected]>
This commit makes ByteString support CBOR byte string (major type 2)
without being limited to map keys.

ByteString can be used when using a []byte is not possible or
convenient.  For example, Go doesn't allow []byte as map key, so
ByteString can be used to support data formats having CBOR map with byte
string keys. ByteString can also be used to encode invalid UTF-8 string
as CBOR byte string.

- Modified ByteString to use string type.
- Implemented cbor.Marshaller and cbor.Unmarshaller for ByteString.
- Simplified ByteString encoding and decoding.
- Renamed MapKeyByteStringFail to MapKeyByteStringForbidden.
- Renamed MapKeyByteStringWrap to MapKeyByteStringAllowed.
- Added more tests for ByteString.
Also added support for:
- CBOR tag with byte string tag content as map key.
- nested CBOR tag with byte string tag content as map key.

This new feature can be disabled by specifying decoding option
MapKeyByteStringForbidden, which will return an error
when attempting to decode CBOR byte string as map key
into an empty Go interface value.
InvalidMapKeyTypeError is returned when attempting
to decode CBOR map key to a Go type that cannot
be used as map key.

This allows callers to check for this error without
requiring string comparison of the error text.

To maintain backward compatibility, the text
of InvalidMapKeyTypeError remains the same
as the plain error that used to be returned.
Added support for decoding CBOR map with byte string keys
to Go map with empty interface key type.

This supports decoding these CBOR map key types:
- byte string
- tagged byte string
- nested tagged byte string

Previous commit already added support for decoding CBOR maps
with byte string keys to empty Go interface value.
@fxamacker fxamacker added bug Something isn't working enhancement New feature or request labels Dec 29, 2022
@fxamacker
Copy link
Owner Author

@agaffney Thanks again for raising this issue and opening PR #319. If you have time, please take a look and let me know if this PR works for you.

This PR includes PR #319 and some of the differences are:

  • by default, this PR allows decoding map keys of type byte string, tagged byte string, nested byte string
  • ByteString type is defined as string type instead of struct containing string

BTW, thanks for the link to Cardano's CDDL. That was really helpful in understanding the usefulness of adding this feature. It's one of the main reasons I enabled this feature by default.

@fxamacker fxamacker added this to the v2.5.0 milestone Dec 29, 2022
@fxamacker fxamacker removed the bug Something isn't working label Dec 29, 2022
Copy link
Contributor

@agaffney agaffney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me

Copy link
Contributor

@x448 x448 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! Fuzzing didn't find any crashers yet.

@fxamacker
Copy link
Owner Author

Passed fuzzing overnight and today. Merging.

@fxamacker fxamacker merged commit 7c3a599 into master Dec 29, 2022
@fxamacker fxamacker deleted the feature/bytestring branch January 2, 2023 00:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cannot decode CBOR data with unknown structure containing map with []uint8 keys
3 participants