-
-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Deserialization for ErasedUnion
s
#27
Conversation
Necessary for InlayHint: Input (-> deserialization) of `inlayHint/resolve` Note: There are several limitations when deserializing: * Types in Cases must be distinguishable. Otherwise first matched case gets used. Example: `(U2.Second 42): U2<int,int>` -> serialized to `42` -> deserialized to `U2.First 42` * Because of `MissingMemberHandling.Ignore`: When Object in json, first Record gets matched. Example: ```fsharp type R1 = { Name: string; Value: int } type R2 = { Id: string; Value: int } let input: U2<R1, R2> = U2.Second { Id = "foo"; Value = 42 } let output: U2<R1,R2> = input |> serialize |> deserialize assert(output = U2.First { Name = null; Value = 42 }) ``` * //TODO: how to enable required/optional handling? There's `Required` in [`JsonPropertyAttribute`](https://www.newtonsoft.com/json/help/html/JsonPropertyRequired.htm). But we use `Option` -> should be `Option`: optional; Otherwise: required. Maybe via custom `ContractResolver`?
Note: Test project is currently just a simple Expecto-Program: `dotnet run --project ./tests/` to run tests. Add some simple Benchmarks to test Converters Run: `dotnet run --project ./tests/ -c Release -- --benchmarks --filter *` (Currently same project as Expecto Tests and switch with `--benchmarks`)
Was an issue with `U2<Record1, Record2>` -> always deserialized to `Record1` because all fields were optional. Now: only fields with type `Option<_>` are optional, all others are required -> deserialization fails when required field is not in json! (previously: silently used default value (like `null`)) Note: Currently doesn't respect `JsonProperty.Required`, but instead always determines required based on its type (`Option` vs. everything else). Should be ok in here: `JsonProperty.Required` is never used.
Reason: Null was disallowed
Add tests for property with different name than field (`JsonProperty("...")`)
Note: `ValueOption` is supported. Tests are just in case we want to add support later on. Tests will be removed again in next commit! `ValueNone` isn't covered by default `null` Handling -> required custom handling. To not emit Property with `ValueNone`: In `IContractResolver.CreateObjectContract(objectType)` add for property of type `ValueOption` handler for `GetIsSpecified` or `ShouldSerialize`. Issue: predicate receives not value of property, but object it's contained in. -> Property must be extracted first before checking if `ValueNone`
As mentioned in prev commit: was just for documentation purpose
| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Allocated | |-----------------------|---------:|--------:|--------:|--------:|-------:|----------:| | All_Roundtrip_NoCache | 849.2 us | 0.94 us | 0.84 us | 31.2500 | 2.9297 | 258 KB | | All_Roundtrip_Cache | 811.9 us | 1.49 us | 1.39 us | 30.2734 | 2.9297 | 249 KB |
| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Allocated | |----------------------|---------:|--------:|--------:|--------:|-------:|----------:| | All_Roundtrip | 811.9 us | 1.49 us | 1.39 us | 30.2734 | 2.9297 | 249 KB | | All_Roundtrip_Fields | 787.9 us | 6.80 us | 6.03 us | 28.3203 | 1.9531 | 237 KB |
| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Allocated | |------------------- |---------:|--------:|--------:|--------:|-------:|----------:| | All_Roundtrip_Pre | 745.4 us | 1.26 us | 1.12 us | 28.3203 | 1.9531 | 237 KB | | All_Roundtrip_Post | 619.3 us | 2.96 us | 2.77 us | 24.4141 | 1.9531 | 207 KB |
Should be usable in other Converters too (not yet adjusted) | Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Allocated | |------------------- |---------:|--------:|--------:|--------:|-------:|----------:| | All_Roundtrip_Pre | 604.4 us | 1.06 us | 0.88 us | 24.4141 | 1.9531 | 207 KB | | All_Roundtrip_Post | 569.0 us | 4.36 us | 4.08 us | 23.4375 | 1.9531 | 196 KB |
| Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Allocated | |------------------- |---------:|--------:|--------:|--------:|-------:|----------:| | All_Roundtrip_Pre | 569.0 us | 4.36 us | 4.08 us | 23.4375 | 1.9531 | 196 KB | | All_Roundtrip_Post | 397.8 us | 0.47 us | 0.42 us | 20.5078 | 1.9531 | 171 KB |
No read difference between cache and no cache
`Newtonsoft.Json` is rather lax when parsing json: * `"42"` -> can be deserialized to int * `42` -> can be deserialized to string * `true` -> can be deserialized to string New StrictConverters prevent these cases: * String must inside quotation marks * Number (`int`, `float`) must not be inside quotation marks Note: This isn't a complete handling of special cases. Some examples: * For Numbers: currently only `int`, `float`, `byte` are supported for target type * No distinction between floating point and integer -> `3.14` can be deserialized to `int` (=`3`) -> Still put the most likely case first (is faster too) This strict handling is for correctly deserializing Erased Unions: * `U2<string, int>` and `42`: previously parsed to `U2.First "42"`, now `U2.Second 42` * `U2<int, string>` and `"42"`: previously `U2.First 42`, now `U2.Second "42"` * `U2<string, bool>` and `true`: previously `U2.First "true`, now `U2.Second true` Note: Probably more a "Came-up-in-tests" issue, than in practice: Not that many locations which accept two basic types, and then usually doesn't matter (like Id: doesn't really matter if int or string). Additional: (De)Serialization with these converters is slightly slower than without -> might be ok to disable StrictConverters again Remove `U2BoolObjectConverter` Converter expects second Type to be a record type and fails for basic types like `U2<bool, string>`. Additional: Is handled now by `ErasedUnionConverter` too -> no need for extra `U2BoolObjectConverter` And: Without `U2BoolObjectConverter` execution is actually faster | Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Allocated | |--------------------- |---------:|--------:|--------:|--------:|-------:|----------:| | U2Bool, No Strict | 394.4 us | 0.94 us | 0.84 us | 20.5078 | 1.9531 | 171 KB | | No U2Bool, No Strict | 379.5 us | 3.97 us | 3.71 us | 18.5547 | 1.4648 | 154 KB | | U2Bool, Strict | 408.7 us | 0.87 us | 0.77 us | 19.0430 | 1.4648 | 157 KB | | No U2Bool, Strict | 399.8 us | 0.96 us | 0.85 us | 18.0664 | 1.4648 | 148 KB |
| | Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Allocated | |--------|----------------------- |------------:|----------:|----------:|----------:|---------:|----------:| | typeof | All_Roundtrip | 400.0 us | 1.09 us | 1.02 us | 18.0664 | 1.4648 | 148 KB | | typeof | All_MultipleRoundtrips | 98,680.1 us | 143.35 us | 127.08 us | 4500.0000 | 333.3333 | 37,056 KB | |--------|----------------------- |------------:|----------:|----------:|----------:|---------:|----------:| | hash | All_Roundtrip | 376.8 us | 0.66 us | 0.62 us | 15.6250 | 1.4648 | 131 KB | | hash | All_MultipleRoundtrips | 94,789.2 us | 440.51 us | 412.06 us | 4000.0000 | 333.3333 | 32,773 KB |
Not really faster -- but way less allocations | Key | Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Allocated | |------|----------------------- |------------:|----------:|----------:|----------:|---------:|----------:| | type | All_Roundtrip | 370.0 us | 2.01 us | 1.88 us | 15.6250 | 1.4648 | 131 KB | | type | All_MultipleRoundtrips | 93,349.1 us | 197.76 us | 184.98 us | 4000.0000 | 333.3333 | 32,773 KB | |------|----------------------- |------------:|----------:|----------:|----------:|---------:|----------:| | hash | All_Roundtrip | 370.3 us | 1.37 us | 1.21 us | 12.2070 | 0.9766 | 102 KB | | hash | All_MultipleRoundtrips | 91,788.8 us | 385.98 us | 361.05 us | 3000.0000 | 166.6667 | 25,477 KB |
ErasedUnionConverter tries to fit json into first matching Case. If a case doesn't fit serialization to that case fails, which ErasedUnionConverter catches and then tries next case. But Exception handling is expensive -> basic types (string, bool, numbers) are now handled extra without exceptions Note: * still true: most likely case should come first (`U2<bool, Whatever>` when bool most likely) * and now additional: when unsure or no clear favorite put basic type first (`U2<string, MyRecord>` vs `U2<MyRecord, string>`) | | Method | Mean | Error | StdDev | Gen 0 | Gen 1 | Allocated | |-------|----------------------- |------------:|----------:|----------:|----------:|---------:|----------:| | ex | All_Roundtrip | 364.5 us | 1.95 us | 1.83 us | 12.2070 | 0.9766 | 102 KB | | ex | All_MultipleRoundtrips | 91,498.6 us | 432.69 us | 404.74 us | 3000.0000 | 166.6667 | 25,477 KB | |-------|----------------------- |------------:|----------:|----------:|----------:|---------:|----------:| | no ex | All_Roundtrip | 296.0 us | 1.28 us | 1.20 us | 11.7188 | 0.9766 | 98 KB | | no ex | All_MultipleRoundtrips | 73,219.7 us | 160.73 us | 142.48 us | 3000.0000 | 285.7143 | 24,567 KB |
No observable difference
Just build&pack&publish src, not tests
That's wanted for Fields (-> Uppercase in F#, lowercase property in JSON), but for Maps that shouldn't change. Issue in `WorkspaceEdit.Changes: Map<DocumentUri, TextEdit[]> option`: Key is Uri -> should be kept as is
Ok, found some additional issues with LSP Types & (de)serializing For now I'm looking at |
In specs: ```typescript /** * A data entry field that is preserved between a * `textDocument/publishDiagnostics` notification and * `textDocument/codeAction` request. * * @SInCE 3.16.0 */ data?: unknown; ``` -> best use `JToken` Note: In `InlayHint`, `data` is `LSPAny` -> `JToken` I think `Diagnostic.Data` is `unknown` just because it was introduced in an older LSP version
In specs: ```typescript /** * A data entry field that is preserved on a code action between * a `textDocument/codeAction` and a `codeAction/resolve` request. * * @SInCE 3.16.0 */ data?: LSPAny; ``` -> `JToken`
…o `bool option` specs: ```typescript /** * The client will send the `textDocument/semanticTokens/range` request * if the server provides a corresponding handler. */ range?: boolean | { }; /** * Server supports providing semantic tokens for a specific range * of a document. */ range?: boolean | { }; ``` Empty Object because: [[source](microsoft/vscode-languageserver-node#946 (comment))] > no it is not LSP any. It is basically an object literal with no properties. We did this sine we expect properties to be added which is easier if a object literal is already specified. -> `{}` not used -> omit `obj`
Otherwise equality check fails -> issue for tests
Add Benchmarks for type checks: `dotnet run -c Release -- --benchmarks --filter "Ionide.LanguageServerProtocol.Tests.Benchmarks.TypeCheckBenchmarks.*"` | Method | Categories | Mean | Error | StdDev | Ratio | Gen 0 | Allocated | |---------------------- |----------- |------------:|----------:|----------:|------:|-------:|----------:| | IsNumeric_typeof | IsNumeric | 1,471.05 ns | 10.140 ns | 8.989 ns | 1.00 | 0.0687 | 576 B | | IsNumeric_hash | IsNumeric | 149.35 ns | 0.467 ns | 0.414 ns | 0.10 | - | - | | | | | | | | | | | IsBool_typeof | IsBool | 361.30 ns | 2.577 ns | 2.411 ns | 1.00 | - | - | | IsBool_hash | IsBool | 81.31 ns | 0.229 ns | 0.203 ns | 0.22 | - | - | | | | | | | | | | | IsString_typeof | IsString | 365.60 ns | 2.242 ns | 2.097 ns | 1.00 | - | - | | IsString_hash | IsString | 76.89 ns | 0.151 ns | 0.134 ns | 0.21 | - | - | | | | | | | | | | | IsOption_check | IsOption | 934.31 ns | 18.047 ns | 19.310 ns | 1.00 | - | - | | IsOption_memoriseType | IsOption | 513.97 ns | 3.342 ns | 3.126 ns | 0.55 | 0.1831 | 1,536 B | | IsOption_memoriseHash | IsOption | 208.99 ns | 1.273 ns | 1.191 ns | 0.22 | - | - |
Was done for Option handling (`option` -> optional, otherwise required) Now: Existing `JsonProperty.Required` has precedence over `option` Fix: Mutable Fields in Record get serialized twice
Change `FormattingOptions` to Record Changes: * Record instead of normal class (Fits better with other types -- which are all records) * All properties (except `AdditionalData`) are now immutable * Use `IDictionary` for `AdditionalData` -> can use `Map` when created in F#, `Newtonsoft.Json` can deserialize to `Dictionary` * Ensure `AdditionalData` isn't null after deserializing: When no additional properties in json, `AdditionalData` `Newtonsoft.Json` doesn't touch `AdditionalData` -> is null * Solution: Set `AdditionalData` after deserializing to `Map.empty` when `null` * Note: Requires `AdditionalData` to be mutable (but vs. before: only `AdditionalData` must be mutable, not other properties too) General template for records with `JsonExtensionData`: ```fsharp type MyRecord = { // normal properties just like in every other record Value: int // Dictionary with json properties without explicit matching field here in Record [<JsonExtensionData>] mutable member AdditionalData: IDictionary<string, JToken> } [<OnDeserialized>] member o.OnDeserialized(context: StreamingContext) = // Ensure `AdditionalData` isn't null if isNull o.AdditionalData then o.AdditionalData <- Map.empty ```
specs: `(Command | CodeAction)[]`, but here was `Command[] | CodeAction[]`
Random testing with FsCheck of all LSP types. Tests: `(o |> serialize |> deserialize) = o` Note: Success just means object can do roundtrip to json and back again without exceptions. It doesn't mean it gets serialized to the correct json representation! And additional: is random testing -> there might be still a case that throws exception, but didn't get tested. And of course: not real-world-looking data Note: New LSP types get automatically picked up when added to `Ionide.LanguageServerProtocol.Types` (`./src/Types.fs`). Exception: Type Abbreviations get erased and cannot be picked up at runtime via reflection -> Must be manual added to tests (in `tests\Shotgun.fs`, start of `tests`) Note: Checking new type usually just work. But sometimes it requires some additional handling: * FsCheck cannot generate type (for example recursive types) -> requires custom Generator (see `Gen` in `tests\Shotgun.fs`) * Type doesn't use structural equality -> `Expecto.equal` fails even when same content (for example case with `Dictionary` (in field with `JsonExtensionDataAttribute`) -> Requires replacing all Dictionaries with `Map` with same content for comparison) * Best stick to "normal" F# types, lists, arrays, `Map`s
| Method | Categories | count | Mean | Error | StdDev | Gen 0 | Gen 1 | Allocated | |------------------- |----------- |------ |-------------:|----------:|----------:|----------:|----------:|----------:| | AllLsp_Roundtrips | LSP | 1 | 293.9 us | 0.56 us | 0.52 us | 11.7188 | 0.9766 | 98 KB | | AllLsp_Roundtrips | LSP | 250 | 72,627.1 us | 200.79 us | 187.82 us | 2857.1429 | 142.8571 | 24,469 KB | | | | | | | | | | | | Example_Roundtrips | Example | 1 | 6,562.9 us | 11.17 us | 10.45 us | 140.6250 | 31.2500 | 1,184 KB | | Example_Roundtrips | Example | 50 | 329,601.5 us | 861.08 us | 719.04 us | 7000.0000 | 1000.0000 | 59,178 KB | Note: Not comparable with prev benchmarks any more: * Number of Roundtrips is now specified as arg -> always a loop * Number of iterations is now 1 time less than before (`0..3` -> `0;1;2;3` -> upper bound is inclusive -> previously `251` iterations instead of `250`)
No real improvement. But other locations use same check -> align | Method | Categories | count | Mean | Error | StdDev | Gen 0 | Gen 1 | Allocated | |------------------- |----------- |------ |-------------:|------------:|----------:|----------:|----------:|----------:| | AllLsp_Roundtrips | Pre | 1 | 295.6 us | 0.46 us | 0.43 us | 11.7188 | 0.9766 | 98 KB | | AllLsp_Roundtrips | Pre | 250 | 74,286.7 us | 155.12 us | 129.53 us | 2857.1429 | 285.7143 | 24,469 KB | | | | | | | | | | | | Example_Roundtrips | Pre | 1 | 6,662.9 us | 22.01 us | 20.59 us | 140.6250 | 31.2500 | 1,184 KB | | Example_Roundtrips | Pre | 50 | 330,380.3 us | 1,009.86 us | 944.62 us | 7000.0000 | 1000.0000 | 59,177 KB | |------------------- |----------- |------ |-------------:|------------:|----------:|----------:|----------:|----------:| | AllLsp_Roundtrips | Post | 1 | 292.2 us | 0.38 us | 0.36 us | 11.7188 | 0.9766 | 98 KB | | AllLsp_Roundtrips | Post | 250 | 72,117.0 us | 340.64 us | 318.63 us | 2857.1429 | 285.7143 | 24,469 KB | | | | | | | | | | | | Example_Roundtrips | Post | 1 | 6,662.9 us | 19.21 us | 17.97 us | 140.6250 | 31.2500 | 1,184 KB | | Example_Roundtrips | Post | 50 | 330,835.3 us | 981.83 us | 870.37 us | 7000.0000 | 1000.0000 | 59,177 KB |
Only slight improvement. But `_ option` is extremely often used -> even tiny enhancement can be beneficial during a FSAC session | Method | Categories | count | Mean | Error | StdDev | Gen 0 | Allocated | |------------------ |----------- |------ |------------:|----------:|----------:|----------:|----------:| | Option_Roundtrips | Pre | 50 | 306.8 us | 1.35 us | 1.27 us | 12.6953 | 104 KB | | Option_Roundtrips | Pre | 1000 | 6,118.4 us | 14.63 us | 12.97 us | 250.0000 | 2,086 KB | | Option_Roundtrips | Pre | 10000 | 61,710.5 us | 244.39 us | 228.61 us | 2444.4444 | 20,859 KB | |------------------ |----------- |------ |------------:|----------:|----------:|----------:|----------:| | Option_Roundtrips | Post | 50 | 282.0 us | 1.80 us | 1.68 us | 11.7188 | 100 KB | | Option_Roundtrips | Post | 1000 | 5,640.2 us | 42.15 us | 39.42 us | 242.1875 | 1,992 KB | | Option_Roundtrips | Post | 10000 | 55,811.9 us | 82.34 us | 64.29 us | 2333.3333 | 19,922 KB | All with change: | Method | Categories | count | Mean | Error | StdDev | Gen 0 | Gen 1 | Allocated | |------------------- |----------- |------ |-------------:|----------:|----------:|----------:|----------:|----------:| | AllLsp_Roundtrips | LSP | 1 | 293.2 us | 1.74 us | 1.63 us | 11.7188 | 0.9766 | 97 KB | | AllLsp_Roundtrips | LSP | 250 | 71,404.6 us | 308.36 us | 288.44 us | 2875.0000 | 250.0000 | 24,320 KB | | | | | | | | | | | | Example_Roundtrips | Example | 1 | 6,533.9 us | 10.28 us | 9.61 us | 140.6250 | 31.2500 | 1,183 KB | | Example_Roundtrips | Example | 50 | 330,732.2 us | 535.16 us | 446.88 us | 7000.0000 | 1000.0000 | 59,161 KB | | | | | | | | | | | | Option_Roundtrips | Basic | 50 | 281.7 us | 1.21 us | 1.13 us | 11.7188 | - | 100 KB | | Option_Roundtrips | Basic | 1000 | 5,687.0 us | 41.07 us | 38.42 us | 242.1875 | - | 1,992 KB | | Option_Roundtrips | Basic | 10000 | 55,379.6 us | 174.32 us | 145.57 us | 2333.3333 | - | 19,922 KB |
At this point it's basically just playing around and experimenting -- and not really constructive...
Ok, done (I think...) Note: I changed some LSP types (see commits) -> Clients might need slight adjustments I added some Benchmarks to play around and experiment with. But they are basically (at least partially) just way-too-complex, not-really-used, distracting clutter. But they are in a separate file and don't bother -- so I kept them. Might be useful again when changing Converters. One test enhancement: There are now quite few tests -- and I think especially these new LSP de/serialization tests are worth running on every change. |
Yes, absolutely! |
Tests added Though just as |
Overall this looks quite good, but I'd like to change it to be |
Is it ok if I squash this? |
yes, of course |
Currently
ErasedUnionConverter
doesn't implementReadJson
-> ErasedUnions cannot be deserialized from json.But this is needed for
inlayHint/resolve
: input and output isInlayHint
->InlayHint
must be serializable as well as deserializable.This PR implements
ReadJson
forErasedUnionConverter
-> enables deserialization of Erased Unions (and as a result deserialization ofInlayHint
) :But note: This only works when types in each Union case are distinguishable from each other.
For example:
U<float, int>
-> always deserializes to first Case because a number always matches float.There are some additional deserialization enhancements to match correct cases:
Stricter basic type deserializing
By default Newtonsoft.Json is rather lax when deserializing -> types must not match exactly. For example:
"42"
deserializes toint
, and42
deserializes successful tostring
.This is an issue with Erased Union like
U2<int, string>
->"42"
gets deserialized asU2.First 42
instead ofU2.Second "42"
This PR adds extra converters for
string
,bool
and numbers so these cases get deserialized correctly.Proper handling of optional & required fields in Records
Currently all fields are optional in practice -> field missing in json still deserializes to object:
This works, but
r.Name
isnull
.That's especially an issue for matching Cases in Erased Unions: every json objects can be deserialized into any record. For example
{}
can be deserialized intoR
. And in Union:U2<R, {| MyValue: int |}>
->Second
never gets matched, because every json object fitsR
. So json{"myValue": 42}
gets deserialized toU2.First {R.Name=null; Value=0}
.I added a custom ContractResolver, which ensures something can only be deserialized when all required fields exists.
Option fields (
: _ option
) are optional and don't need to be present in json:->
U2<S, {|MyValue: int|}>
can be correctly deserializedNote: Cases in Erased Unions still must be distinguishable from each other:
U2<S, {| Name: string; Something: int |}>
still gets always deserialized toU2.First
(S
) becauseName
is required inS
whileValue
isn't -- and additional fields get ignored -> every json object with propertyname
can be deserialized toS
Note: I do this by overwriting
Required
property for contracts. That means: I currently also overwrite custom required annotations specified in Attribute for a field ([<JsonProperty(Required = ...)>]
) (Currently never used here in this Repo)-> Required and Optional is solely based on
option
.Update:
JsonProperty.Required
is now honored when specified and takes precedence overoption
Note: Doesn't work with
ValueOption
(Supportingvoption
would require custom deserialize for ValueOption -- which is surprisingly complex (compared toOption
(-> None=null -> handled by Newtonsoft))) ... and I didn't consider it necessary (currently not used in this repo))Note on speed of
ErasedUnionConverter
:The converter tries to deserializes the current json value to each case in order. When one case doesn't fit json, deserializing this case throws an exception, which the converter catches and tries the next case.
But exceptions are quite slow -> put most likely case first (
U2<LikelyUsed, UnlikelyUsed>
)And additional there's special handling for basic types (string, bool, numbers) -> get handled without exceptions -> put case with basic type first (
U2<int, MyRecord>
)Additional:
Add Tests
mostly targeting Converters and ContractResolver (-> Required/Optional). But not a complete test suite -- just the examples I used to implement this PR.
dotnet run --project tests
Add very simple benchmarks
Extremely basic benchmarks I used to measure speed (and allocations). Currently just serialize and deserialize back again of two large_ish (in terms of property count) LSP types (
InitializeParams
&InlayHint
).It's questionable how representative these two actually are (InitParams is just used once in LSP lifecycle...), but I wanted to test with something with a lot of properties that cover a couple of different deserialization cases (and
InlayHint
is the reason for this PR).--benchmarks
switch:dotnet run --project tests -c Release -- --benchmarks --filter "*"
Because of additional project:
Github Workflow files required updates to just process
./src
and ignore./tests
. I think I adjusted all case -- but be sure to check again!