-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Editorial: Use grammar for VLQ and Mappings and decode via SDO #180
base: main
Are you sure you want to change the base?
Conversation
I attempted to re-express the DecodeBase64VLQ operation without using an accumulator, since the way fields are mutated in the accumulator is not super easy to keep track of (recursion + mutable state). Do you think this version would work, or am I missing something about shifting the values? And then you use it as "let value be the VLQSignedValue of Vlq". VLQSignedValue ( ) Vlq :: VlqDigitList
VLQUnsignedValue ( ) VlqDigitList :: DigitWithoutContinuationBit
VlqDigitList :: DigitWithContinuationBit VlqDigitList
|
Very nice! That looks much simpler. Although I think there is a bug: We need to slice the continuation bit off left and not off
Alternatively we could define |
You are right, the I think I have a slight preference for the SDO approach, but only with a name that implies "this isn't actually the base64-decoded value, but the base64-decoded value after trimming the VLQ continuation bit" :) Or, we remove |
With tc39/ecmarkup#637 still in-flight, I applied Nicolos idea but via an additional nonterminal indirection. Should be good enough to discuss this in tonights' meeting. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Second review pass. The VLQs look good now, and I finally reviewed the mappings definitions.
I would prefer this slightly different approach, based on the "informal" understanding that mappings contains one or more segments, one per line, separated by semicolons. A segment contains zero or more mappings, separated by commas. This avoids having to think about the "what if there are consecutive semicolons?" case, since they are just consecutive segments.
Mappings :
Segment
Segment `;` Mappings
Segment :
MappingList?
MappingList :
Mapping
Mapping `,` MappingList
Note that the above definitions makes "mappings": ""
valid, which matches the current spec.
DecodeMappingsSdo would then become
DecodeMappingsSdo ( ... )
Mappings : Segment ;
Mappings
MappingList : Mapping ,
MappingList
- For each child node child of this Parse Node, do ... (same as now)
Segment : MappingList?
- Set state.[[GeneratedLine]] to state.[[GeneratedLine]] + 1.
- Set state.[[GeneratedColumn]] to 0.
- If MappingList is present, perform DecodeMappingsSdo of MappingList with arguments ...
Mapping : GeneratedColumn (same as now)
Mapping : GeneratedColumn OriginalSource OriginalLine OriginalColumn Name? (same as now)
Maybe can we also rename Mappings to SegmentList, or MappingsSegmentList? It's a bit weird that "mappings" is a list of segments, and "segment" is a list of mappings, even though that's how we always referred to it 😅 |
Thanks for the new grammar, I love it!
IMO it would be nice to have consistent naming of the goal symbols w.r.t. to how the field is named in the source map JSON. What about renaming Mappings to SegmentList but then add a goal symbol MappingsField : SegmentList? |
Sounds good 👍 |
Co-authored-by: Nicolò Ribaudo <[email protected]>
Changed the grammar as per your suggestion. Also renamed the SDO to |
spec.emu
Outdated
1. Perform DecodeMappingsField of |OriginalSource| with arguments _state_, _mappings_, _names_ and _sources_. | ||
1. Perform DecodeMappingsField of |OriginalLine| with arguments _state_, _mappings_, _names_ and _sources_. | ||
1. Perform DecodeMappingsField of |OriginalColumn| with arguments _state_, _mappings_, _names_ and _sources_. | ||
1. Let _source_ be _sources_[_state_.[[SourceIndex]]]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's only in an error case, but there's a small semantic difference between the old version and this I think. If the source index is invalid, the old algorithm kept the source as null
while this returns undefined
in JS or out-of-bounds lookup in general. (original line/column are also kept null
when they are negative)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah there is some bound checks missing. Same for names
.
I'm not sure about undefined
though. We are using the List specification type here which does not actually spell out what is returned for out-of-bounds accesses, so it's even worse since its undefined behavior.
I'll add some bound checks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I rewrote the algorithm here to do the actual validation and set the fields explicitly to null. PTAL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also filed #184. It seems some implementations (e.g. Chrome) behave as I wrote the spec initially but others (e.g. Firefox) behave as the existing spec says.
1. For each child node _child_ of this Parse Node, do | ||
1. If _child_ is an instance of a nonterminal, then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no idea what this means.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I adapted this from the ECMAScript spec that has a couple instances of this (e.g. Contains).
Since we only have two productions, we could spell them also out explicitly:
<emu-grammar>
SegmentList :
Segment `;` SegmentList
</emu-grammar>
<emu-alg>
1. Perform DecodeMappingsField of |Segment| ...
1. Perform DecodeMappingsFIeld of |SegmentList| ...
</eum-alg>
And same for the MappingList
production.
MappingList? | ||
</emu-grammar> | ||
<emu-alg> | ||
1. Set _state_.[[GeneratedLine]] to _state_.[[GeneratedLine]] + 1. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This means the GeneratedLine
starts at 1 instead of 0? But originalLine
starts at 0. And [[GeneratedLine]]
is "a non-negative integer", implying 0 is valid and 1 is actually the second line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right.
I think it might be sufficient to switch the statements around. Perform DeocdeMappingsField
first, then increment the line.
Alternatively we could also move the increment to the SegemtnList
production. I'd find that even clearer:
<emu-grammar>
SegmentList :
Segment `;` SegmentList
</emu-grammar>
<emu-alg>
1. Perform DecodeMappingsField on |Segment| ...
1. Set _state_.[[GeneratedLine]] to _state_.[[GeneratedLine]] + 1.
1. Set _state_.[[GeneratedColumn]] to 0.
1. Perform DecodeMappingsField on |SegmentList| ...
</emu-alg>
@nicolo-ribaudo any preference?
Co-authored-by: Justin Ridgewell <[email protected]>
Co-authored-by: Justin Ridgewell <[email protected]>
Draft that changes VLQ and mappings to use grammar plus "syntax-directed operations" for decoding rather than a pure algorithmic approach.
Only uploaded so we have a preview for discussion.
Preview: https://szuend.github.io/source-map/branch/mappings-grammar/#sec-mappings