Skip to content

Commit

Permalink
Limit sets to only allow fewer types
Browse files Browse the repository at this point in the history
Sets can now only contain string, blob, byte, short, integer, long, bigInteger,
and bigDecimal shapes. Sets with other types of values are either difficult
to implement in various programming languages (for example, sets of floats in
Rust), or highly problematic for client/server use cases. Clients that are out
of sync with a service model could receive structures or unions from a
service, not recognize new members and drop them, causing the hash codes of
members of the set to collide, and this would result in the client discarding
set entries. For example, a service might return a set of 3 structures, but
when clients deserialize them, they drop unknown members, and the set
contains fewer than 3 entries.

Existing models that already use a set of other types will need to migrate to
use a list rather than a set, and they will need to implement any necessary
uniqueness checks server-side as needed.

Support blob sets, do spec and validation cleanup
  • Loading branch information
mtdowling committed Feb 24, 2022
1 parent b2911b2 commit 6dabbd3
Show file tree
Hide file tree
Showing 16 changed files with 195 additions and 209 deletions.
19 changes: 19 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,24 @@
# Smithy Changelog

## Next Release (TBD)

### Breaking changes

* Sets can now only contain byte, short, integer, long, bigInteger, bigDecimal,
and string shapes. Sets with other types of values are either difficult to
implement in various programming languages (for example, sets of floats in
Rust), or highly problematic for client/server use cases. Clients that are
out of sync with a service model could receive structures or unions from a
service, not recognize new members and drop them, causing the hash codes of
members of the set to collide, and this would result in the client discarding
set entries. For example, a service might return a set of 3 structures, but
when clients deserialize them, they drop unknown members, and the set
contains fewer than 3 entries.

Existing models that already use a set of other types will need to migrate to
use a list rather than a set, and they will need to implement any necessary
uniqueness checks server-side as needed.

## 1.17.0 (2022-02-04)

### Bug Fixes
Expand Down
14 changes: 11 additions & 3 deletions docs/source/1.0/spec/core/idl.rst
Original file line number Diff line number Diff line change
Expand Up @@ -806,10 +806,12 @@ Traits can be applied to the set shape and its members:

@deprecated
set StringSet {
@sensitive
member: String
member: SensitiveString
}

@sensitive
string SensitiveString

.. code-tab:: json

{
Expand All @@ -818,11 +820,17 @@ Traits can be applied to the set shape and its members:
"smithy.example#StringSet": {
"type": "set",
"member": {
"target": "smithy.api#String"
"target": "smithy.example#SensitiveString"
},
"traits": {
"smithy.api#deprecated": {}
}
},
"smithy.example#SensitiveString": {
"type": "string",
"traits": {
"smithy.api#sensitive": {}
}
}
}
}
Expand Down
27 changes: 11 additions & 16 deletions docs/source/1.0/spec/core/model.rst
Original file line number Diff line number Diff line change
Expand Up @@ -458,7 +458,7 @@ reference other shapes using :ref:`members <member>`.
* - :ref:`list`
- Ordered collection of homogeneous values
* - :ref:`set`
- Collection of unique homogeneous values
- Ordered collection of unique homogeneous values
* - :ref:`map`
- Map data structure that maps string keys to homogeneous values
* - :ref:`structure`
Expand Down Expand Up @@ -565,10 +565,13 @@ example is ``smithy.example#MyList$member``.
Set
===

The :dfn:`set` type represents a collection of unique homogeneous
values. A set shape requires a single member named ``member``.
Sets are defined in the IDL using a :ref:`set_statement <idl-set>`.
The following example defines a set of strings:
The :dfn:`set` type represents an ordered collection of unique values. A set
shape requires a single member named ``member``, and the member MUST target
either a string, blob, byte, short, integer, long, bigInteger, or bigDecimal
shape. The targeted shape MUST NOT be marked with the :ref:`streaming-trait`.

Sets are defined in the IDL using a :ref:`set_statement <idl-set>`. The
following example defines a set of strings:

.. tabs::

Expand Down Expand Up @@ -611,20 +614,12 @@ The shape ID of the member of a set is the set shape ID followed by
``$member``. For example, the shape ID of the set member in the above
example is ``smithy.example#StringSet$member``.

.. rubric:: Language support for sets
.. rubric:: Language support for insertion ordered sets

Not all programming languages support set data structures. Such languages
SHOULD represent sets as a custom set data structure that can interpret value
hash codes and equality, or alternatively, store the values of a set data
Not all programming languages support an insertion ordered set data
structure. Such languages SHOULD store the values of a set data
structure in a list and rely on validation to ensure uniqueness.

.. rubric:: Set member ordering

Sets MUST be insertion ordered. Not all programming languages that support
sets support ordered sets, requiring them may be overly burdensome for users,
or conflict with language idioms. Such languages SHOULD store the values
of sets in a list and rely on validation to ensure uniqueness.


.. _map:

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -34,50 +34,6 @@ apply MalformedSet @httpMalformedRequestTests([
}
}
},
{
id: "RestJsonMalformedSetDuplicateDocuments",
documentation: """
When the set has duplicated documents, the response should be a 400
SerializationException.""",
protocol: restJson1,
request: {
method: "POST",
uri: "/MalformedSet",
body: """
{ "docSet" : [{"a": 1}, {"b": 2, "c": 3}, {"c": 3, "b": 2}] }""",
headers: {
"content-type": "application/json"
}
},
response: {
code: 400,
headers: {
"x-amzn-errortype": "SerializationException"
}
}
},
{
id: "RestJsonMalformedSetDuplicateTimestamps",
documentation: """
When the set has duplicated timestamps, the response should be a 400
SerializationException.""",
protocol: restJson1,
request: {
method: "POST",
uri: "/MalformedSet",
body: """
{ "tsSet" : [1515531081, 1423235322, 1515531081] }""",
headers: {
"content-type": "application/json"
}
},
response: {
code: 400,
headers: {
"x-amzn-errortype": "SerializationException"
}
}
},
{
id: "RestJsonMalformedSetDuplicateBlobs",
documentation: """
Expand All @@ -100,51 +56,6 @@ apply MalformedSet @httpMalformedRequestTests([
}
}
},
{
id: "RestJsonMalformedSetDuplicateStructures",
documentation: """
When the set has duplicated structures, the response should be a 400
SerializationException.""",
protocol: restJson1,
request: {
method: "POST",
uri: "/MalformedSet",
body: """
{ "structSet" : [{"foo": "baz", "bar": 1}, {"foo": "quux", "bar": 2}, {"bar": 1, "foo": "baz"}] }""",
headers: {
"content-type": "application/json"
}
},
response: {
code: 400,
headers: {
"x-amzn-errortype": "SerializationException"
}
}
},
{
id: "RestJsonMalformedSetDuplicateStructuresWithNullValues",
documentation: """
When the set has duplicated structures, where one structure has an
explicit null value and the other leaves the member undefined,
the response should be a 400 SerializationException.""",
protocol: restJson1,
request: {
method: "POST",
uri: "/MalformedSet",
body: """
{ "structSet" : [{"foo": null, "bar": 1}, {"foo": "quux", "bar": 2}, {"bar": 1}] }""",
headers: {
"content-type": "application/json"
}
},
response: {
code: 400,
headers: {
"x-amzn-errortype": "SerializationException"
}
}
},
{
id: "RestJsonMalformedSetNullItem",
documentation: """
Expand All @@ -171,36 +82,13 @@ apply MalformedSet @httpMalformedRequestTests([

structure MalformedSetInput {
set: SimpleSet,
docSet: DocumentSet,
tsSet: TimestampSet,
blobSet: BlobSet,
structSet: StructSet
blobSet: BlobSet
}

set SimpleSet {
member: String
}

set DocumentSet {
member: Document
}

set TimestampSet {
member: Timestamp
}

set BlobSet {
member: Blob
}

set StructSet {
member: StructForSet
}

structure StructForSet {
foo: String,
bar: Integer
}

document Document

Original file line number Diff line number Diff line change
Expand Up @@ -216,26 +216,6 @@ public void detectsAcceptableListMemberChangesInNestedTargets() {
+ "backward compatible."));
}

@Test
public void detectsAcceptableSetMemberChangesInNestedTargets() {
Model modelA = Model.assembler()
.addImport(getClass().getResource("changed-member-target-valid-nested2-a.smithy"))
.assemble()
.unwrap();
Model modelB = Model.assembler()
.addImport(getClass().getResource("changed-member-target-valid-nested2-b.smithy"))
.assemble()
.unwrap();
List<ValidationEvent> events = ModelDiff.compare(modelA, modelB);

assertThat(TestHelper.findEvents(events, "ChangedMemberTarget").size(), equalTo(1));
assertThat(TestHelper.findEvents(events, Severity.WARNING).size(), equalTo(1));
assertThat(TestHelper.findEvents(events, "ChangedMemberTarget").get(0).getMessage(),
equalTo("The shape targeted by the member `smithy.example#A$member` changed from "
+ "`smithy.example#B1` (set) to `smithy.example#B2` (set). This was determined "
+ "backward compatible."));
}

@Test
public void detectsInvalidListMemberChangesInNestedTargets() {
Model modelA = Model.assembler()
Expand Down

This file was deleted.

This file was deleted.

Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,7 @@
import software.amazon.smithy.model.shapes.Shape;
import software.amazon.smithy.model.shapes.ShapeId;
import software.amazon.smithy.model.shapes.ShapeType;
import software.amazon.smithy.model.traits.StreamingTrait;
import software.amazon.smithy.model.traits.TraitDefinition;
import software.amazon.smithy.model.validation.AbstractValidator;
import software.amazon.smithy.model.validation.ValidationEvent;
Expand All @@ -50,6 +51,9 @@ public final class TargetValidator extends AbstractValidator {
private static final int MAX_EDIT_DISTANCE_FOR_SUGGESTIONS = 2;
private static final Set<ShapeType> INVALID_MEMBER_TARGETS = SetUtils.of(
ShapeType.SERVICE, ShapeType.RESOURCE, ShapeType.OPERATION, ShapeType.MEMBER);
private static final Set<ShapeType> VALID_SET_TARGETS = SetUtils.of(
ShapeType.STRING, ShapeType.BYTE, ShapeType.SHORT, ShapeType.INTEGER, ShapeType.LONG,
ShapeType.BIG_INTEGER, ShapeType.BIG_DECIMAL, ShapeType.BLOB);

@Override
public List<ValidationEvent> validate(Model model) {
Expand Down Expand Up @@ -89,6 +93,8 @@ private Optional<ValidationEvent> validateTarget(Model model, Shape shape, Shape
}
case MAP_KEY:
return target.asMemberShape().flatMap(m -> validateMapKey(shape, m.getTarget(), model));
case SET_MEMBER:
return target.asMemberShape().flatMap(m -> validateSetMember(shape, m.getTarget(), model));
case RESOURCE:
if (target.getType() != ShapeType.RESOURCE) {
return Optional.of(badType(shape, target, relType, ShapeType.RESOURCE));
Expand Down Expand Up @@ -144,6 +150,30 @@ private Optional<ValidationEvent> validateMapKey(Shape shape, ShapeId target, Mo
"Map key member targets %s, but is expected to target a string", resolved)));
}

private Optional<ValidationEvent> validateSetMember(Shape shape, ShapeId target, Model model) {
Shape targetShape = model.getShape(target).orElse(null);

if (targetShape == null) {
return Optional.empty();
}

if (!VALID_SET_TARGETS.contains(targetShape.getType())) {
return Optional.of(error(shape, format(
"Set member targets %s, but sets can target only %s. You can model a collection of %s shapes "
+ "by changing this shape to a list. Modeling a set of values of other types is problematic "
+ "to support across a wide range of programming languages.",
targetShape,
VALID_SET_TARGETS.stream().map(ShapeType::toString).sorted().collect(Collectors.joining(", ")),
targetShape.getType())));
} else if (targetShape.hasTrait(StreamingTrait.class)) {
return Optional.of(error(shape, format("Set member targets %s, a shape marked with the @streaming trait. "
+ "Sets do not support unbounded values.",
targetShape)));
}

return Optional.empty();
}

private Optional<ValidationEvent> validateIdentifier(Shape shape, Shape target) {
if (target.getType() != ShapeType.STRING) {
return Optional.of(badType(shape, target, RelationshipType.IDENTIFIER, ShapeType.STRING));
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
[ERROR] smithy.example#TimestampSet: Set member targets (timestamp: `smithy.api#Timestamp`), but sets can target only bigDecimal, bigInteger, blob, byte, integer, long, short, string. You can model a collection of timestamp shapes by changing this shape to a list. Modeling a set of values of other types is problematic to support across a wide range of programming languages. | Target
[ERROR] smithy.example#BooleanSet: Set member targets (boolean: `smithy.api#Boolean`), but sets can target only bigDecimal, bigInteger, blob, byte, integer, long, short, string. You can model a collection of boolean shapes by changing this shape to a list. Modeling a set of values of other types is problematic to support across a wide range of programming languages. | Target
[ERROR] smithy.example#DoubleSet: Set member targets (double: `smithy.api#Double`), but sets can target only bigDecimal, bigInteger, blob, byte, integer, long, short, string. You can model a collection of double shapes by changing this shape to a list. Modeling a set of values of other types is problematic to support across a wide range of programming languages. | Target
[ERROR] smithy.example#FloatSet: Set member targets (float: `smithy.api#Float`), but sets can target only bigDecimal, bigInteger, blob, byte, integer, long, short, string. You can model a collection of float shapes by changing this shape to a list. Modeling a set of values of other types is problematic to support across a wide range of programming languages. | Target
[ERROR] smithy.example#DocumentSet: Set member targets (document: `smithy.api#Document`), but sets can target only bigDecimal, bigInteger, blob, byte, integer, long, short, string. You can model a collection of document shapes by changing this shape to a list. Modeling a set of values of other types is problematic to support across a wide range of programming languages. | Target
[ERROR] smithy.example#ListSet: Set member targets (list: `smithy.example#StringList`), but sets can target only bigDecimal, bigInteger, blob, byte, integer, long, short, string. You can model a collection of list shapes by changing this shape to a list. Modeling a set of values of other types is problematic to support across a wide range of programming languages. | Target
[ERROR] smithy.example#SetSet: Set member targets (set: `smithy.example#StringSet`), but sets can target only bigDecimal, bigInteger, blob, byte, integer, long, short, string. You can model a collection of set shapes by changing this shape to a list. Modeling a set of values of other types is problematic to support across a wide range of programming languages. | Target
[ERROR] smithy.example#MapSet: Set member targets (map: `smithy.example#StringMap`), but sets can target only bigDecimal, bigInteger, blob, byte, integer, long, short, string. You can model a collection of map shapes by changing this shape to a list. Modeling a set of values of other types is problematic to support across a wide range of programming languages. | Target
[ERROR] smithy.example#StructSet: Set member targets (structure: `smithy.example#StructureExample`), but sets can target only bigDecimal, bigInteger, blob, byte, integer, long, short, string. You can model a collection of structure shapes by changing this shape to a list. Modeling a set of values of other types is problematic to support across a wide range of programming languages. | Target
[ERROR] smithy.example#UnionSet: Set member targets (union: `smithy.example#UnionExample`), but sets can target only bigDecimal, bigInteger, blob, byte, integer, long, short, string. You can model a collection of union shapes by changing this shape to a list. Modeling a set of values of other types is problematic to support across a wide range of programming languages. | Target
[ERROR] smithy.example#StreamingBlobSet: Set member targets (blob: `smithy.example#StreamingBlob`), a shape marked with the @streaming trait. | Target
[ERROR] smithy.example#StreamingStringSet: Set member targets (string: `smithy.example#StreamingString`), a shape marked with the @streaming trait. | Target
Loading

0 comments on commit 6dabbd3

Please sign in to comment.