Editorial: Use grammar for VLQ and Mappings and decode via SDO #180

szuend · 2025-03-10T12:36:36Z

Draft that changes VLQ and mappings to use grammar plus "syntax-directed operations" for decoding rather than a pure algorithmic approach.

Only uploaded so we have a preview for discussion.

Preview: https://szuend.github.io/source-map/branch/mappings-grammar/#sec-mappings

spec.emu

nicolo-ribaudo · 2025-03-10T13:16:02Z

I attempted to re-express the DecodeBase64VLQ operation without using an accumulator, since the way fields are mutated in the accumulator is not super easy to keep track of (recursion + mutable state). Do you think this version would work, or am I missing something about shifting the values?

And then you use it as "let value be the VLQSignedValue of Vlq".

VLQSignedValue ( )

Vlq :: VlqDigitList

Let unsigned be the VLQUnsignedValue of VlqDigitList.
If unsigned modulo 2 is 1, let sign be -1.
Else, let sign be 1.
Return sign × floor(unsigned / 2).

VLQUnsignedValue ( )

VlqDigitList :: DigitWithoutContinuationBit

Return DecodeBase64Digit(DigitWithoutContinuationBit).

VlqDigitList :: DigitWithContinuationBit VlqDigitList

Let left be DecodeBase64Digit(DigitWithContinuationBit).
Let right be the VLQUnsignedValue of VlqDigitList.
Return (right - 32) × 2⁵ + left.

szuend · 2025-03-10T13:28:54Z

Very nice! That looks much simpler. Although I think there is a bug: we need to slice the continuation bit off _left_, not off _right_:

3. Return _right_ × 2 ** 5 + (_left_ - 32).

Alternatively we could define DecodeBase64Digit as an SDO that handles the continuation bit.
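For readers following the arithmetic, here is a minimal Python sketch of the accumulator-free decoding with the corrected step 3. The function names mirror the proposed VLQSignedValue and VLQUnsignedValue operations but are otherwise hypothetical; string slicing stands in for the VlqDigitList recursion, and no 32-bit range checks are modeled.

```python
# Standard base64 alphabet; a character's index is its 6-bit digit value.
BASE64_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def decode_base64_digit(ch: str) -> int:
    """Return the 6-bit value of one base64 character."""
    return BASE64_ALPHABET.index(ch)


def vlq_unsigned_value(digits: str) -> int:
    """Recursive counterpart of the proposed VLQUnsignedValue (a sketch)."""
    if not digits:
        raise ValueError("empty VLQ")
    left = decode_base64_digit(digits[0])
    if left < 32:
        # DigitWithoutContinuationBit: must be the final digit.
        if len(digits) != 1:
            raise ValueError("digits after the terminating digit")
        return left
    # DigitWithContinuationBit: strip the continuation bit (32) from left,
    # and shift the value of the remaining digits up by five bits.
    right = vlq_unsigned_value(digits[1:])
    return right * 2**5 + (left - 32)


def vlq_signed_value(digits: str) -> int:
    """Counterpart of the proposed VLQSignedValue: lowest bit is the sign."""
    unsigned = vlq_unsigned_value(digits)
    sign = -1 if unsigned % 2 == 1 else 1
    return sign * (unsigned // 2)
```

As a check, `gB` has digit values 32 and 1, so the unsigned value is 1 × 2⁵ + (32 - 32) = 32 and the signed value is 16. With the `- 32` applied to _right_ instead, the same input would give (1 - 32) × 2⁵ + 32, which is negative.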

nicolo-ribaudo · 2025-03-10T13:34:39Z

You are right, the - 32 is in the wrong place 👍

I think I have a slight preference for the SDO approach, but only with a name that implies "this isn't actually the base64-decoded value, but the base64-decoded value after trimming the VLQ continuation bit" :)

Or, we remove DecodeBase64Digit, and we just add cases for the two digit types directly in the VLQUnsignedValue SDO.

szuend · 2025-03-12T06:24:32Z

With tc39/ecmarkup#637 still in flight, I applied Nicolò's idea but via an additional nonterminal indirection. Should be good enough to discuss this in tonight's meeting.

spec.emu

nicolo-ribaudo

Second review pass. The VLQs look good now, and I finally reviewed the mappings definitions.

I would prefer this slightly different approach, based on the "informal" understanding that mappings contains one or more segments, one per line, separated by semicolons. A segment contains zero or more mappings, separated by commas. This avoids having to think about the "what if there are consecutive semicolons?" case, since they are just consecutive segments.

Mappings :
  Segment
  Segment `;` Mappings

Segment :
  MappingList?

MappingList :
  Mapping
  Mapping `,` MappingList

Note that the above definitions make "mappings": "" valid, which matches the current spec.
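To illustrate the shape this grammar accepts, here is a small Python sketch that splits a mappings string the same way: one Segment per `;`-separated line, each holding zero or more `,`-separated mappings. The function name is hypothetical, and individual mappings are left as undecoded strings.

```python
def split_mappings_field(mappings: str) -> list[list[str]]:
    """Split a mappings string into segments (lines) of raw mappings."""
    segments = []
    for segment in mappings.split(";"):
        if segment == "":
            # Segment : MappingList? with the MappingList absent.
            segments.append([])
            continue
        mapping_list = segment.split(",")
        # MappingList needs a Mapping on both sides of every comma.
        if any(mapping == "" for mapping in mapping_list):
            raise ValueError("empty mapping in segment: " + repr(segment))
        segments.append(mapping_list)
    return segments
```

Under this reading, `""` yields a single empty segment, and `";;"` yields three empty segments rather than needing a special case for consecutive semicolons.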

DecodeMappingsSdo would then become
DecodeMappingsSdo ( ... )

Mappings : Segment ; Mappings
MappingList : Mapping , MappingList

For each child node child of this Parse Node, do ... (same as now)

Segment : MappingList?

Set state.[[GeneratedLine]] to state.[[GeneratedLine]] + 1.
Set state.[[GeneratedColumn]] to 0.
If MappingList is present, perform DecodeMappingsSdo of MappingList with arguments ...

Mapping : GeneratedColumn (same as now)
Mapping : GeneratedColumn OriginalSource OriginalLine OriginalColumn Name? (same as now)

spec.emu

nicolo-ribaudo · 2025-03-12T11:20:10Z

Maybe can we also rename Mappings to SegmentList, or MappingsSegmentList? It's a bit weird that "mappings" is a list of segments, and "segment" is a list of mappings, even though that's how we always referred to it 😅

szuend · 2025-03-12T11:44:18Z

Thanks for the new grammar, I love it!

> Maybe can we also rename Mappings to SegmentList, or MappingsSegmentList? It's a bit weird that "mappings" is a list of segments, and "segment" is a list of mappings, even though that's how we always referred to it 😅

IMO it would be nice to have consistent naming of the goal symbols w.r.t. how the field is named in the source map JSON. What about renaming Mappings to SegmentList but then adding a goal symbol MappingsField : SegmentList?

nicolo-ribaudo · 2025-03-12T11:47:01Z

> What about renaming Mappings to SegmentList but then add a goal symbol MappingsField : SegmentList?

Sounds good 👍

Co-authored-by: Nicolò Ribaudo <[email protected]>

szuend · 2025-03-12T12:23:49Z

Changed the grammar as per your suggestion. Also renamed the SDO to DecodeMappingsField, which makes more sense. The "Sdo" suffix was only there because I couldn't think of anything better at the time.

spec.emu

takikawa · 2025-03-13T23:52:36Z

spec.emu

+          1. Perform DecodeMappingsField of |OriginalSource| with arguments _state_, _mappings_, _names_ and _sources_.
+          1. Perform DecodeMappingsField of |OriginalLine| with arguments _state_, _mappings_, _names_ and _sources_.
+          1. Perform DecodeMappingsField of |OriginalColumn| with arguments _state_, _mappings_, _names_ and _sources_.
+          1. Let _source_ be _sources_[_state_.[[SourceIndex]]].


It's only in an error case, but I think there's a small semantic difference between the old version and this one. If the source index is invalid, the old algorithm kept the source as null, while this returns undefined in JS, or performs an out-of-bounds lookup in general. (The original line/column are also kept null when they are negative.)

Yeah, there are some bounds checks missing. Same for names.

I'm not sure about undefined though. We are using the List specification type here, which does not actually spell out what is returned for out-of-bounds accesses, so it's even worse since it's undefined behavior.

I'll add some bounds checks.

I rewrote the algorithm here to do the actual validation and set the fields explicitly to null. PTAL.

I also filed #184. It seems some implementations (e.g. Chrome) behave as I wrote the spec initially but others (e.g. Firefox) behave as the existing spec says.
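The validation discussed above could look roughly like the following Python sketch. Both helper names are hypothetical, `None` stands in for the spec's *null*, and the sketch only illustrates the "keep the field null when invalid" behavior; it does not settle which behavior #184 should pick.

```python
from typing import Optional, Sequence


def lookup_or_none(items: Sequence[str], index: int) -> Optional[str]:
    """Return items[index], or None when the index is out of bounds.

    The explicit lower bound matters in Python, where a negative index
    would otherwise silently count from the end of the list.
    """
    if 0 <= index < len(items):
        return items[index]
    return None


def non_negative_or_none(value: int) -> Optional[int]:
    """Return value unchanged, or None when it is negative."""
    return value if value >= 0 else None
```

The same `lookup_or_none` check would apply to both the sources and the names lists.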

spec.emu

jridgewell · 2025-03-14T02:52:04Z

spec.emu

+          1. For each child node _child_ of this Parse Node, do
+            1. If _child_ is an instance of a nonterminal, then


I have no idea what this means.

I adapted this from the ECMAScript spec that has a couple instances of this (e.g. Contains).

Since we only have two productions, we could spell them also out explicitly:

<emu-grammar>SegmentList : Segment `;` SegmentList</emu-grammar>
<emu-alg>
  1. Perform DecodeMappingsField of |Segment| ...
  1. Perform DecodeMappingsField of |SegmentList| ...
</emu-alg>

And same for the MappingList production.

jridgewell · 2025-03-14T02:59:16Z

spec.emu

+            MappingList?
+        </emu-grammar>
+        <emu-alg>
+          1. Set _state_.[[GeneratedLine]] to _state_.[[GeneratedLine]] + 1.


This means the GeneratedLine starts at 1 instead of 0? But originalLine starts at 0. And [[GeneratedLine]] is "a non-negative integer", implying 0 is valid and 1 is actually the second line.

You are right.

I think it might be sufficient to switch the statements around: perform DecodeMappingsField first, then increment the line.

Alternatively, we could also move the increment to the SegmentList production. I'd find that even clearer:

<emu-grammar>SegmentList : Segment `;` SegmentList</emu-grammar>
<emu-alg>
  1. Perform DecodeMappingsField on |Segment| ...
  1. Set _state_.[[GeneratedLine]] to _state_.[[GeneratedLine]] + 1.
  1. Set _state_.[[GeneratedColumn]] to 0.
  1. Perform DecodeMappingsField on |SegmentList| ...
</emu-alg>

@nicolo-ribaudo any preference?
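As an illustration of the second option, here is a simplified Python sketch in which the line increment and column reset happen between segments, so the first segment is decoded at generated line 0. The function names are hypothetical, the VLQ decoder is a plain iterative one rather than the grammar-based SDO, and only the generated position is tracked (source, original position, and name fields are decoded but ignored).

```python
BASE64_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def decode_vlqs(mapping: str) -> list[int]:
    """Decode every base64 VLQ in one mapping into signed integers."""
    values, value, shift = [], 0, 0
    for ch in mapping:
        digit = BASE64_ALPHABET.index(ch)
        value += (digit & 31) << shift   # low five bits carry data
        if digit & 32:                   # continuation bit set
            shift += 5
        else:
            magnitude = value >> 1       # lowest bit is the sign
            values.append(-magnitude if value & 1 else magnitude)
            value, shift = 0, 0
    if shift:
        raise ValueError("truncated VLQ in mapping: " + repr(mapping))
    return values


def decode_generated_positions(mappings: str) -> list[tuple[int, int]]:
    """Return (generated line, generated column) for each mapping."""
    positions = []
    line, column = 0, 0
    for i, segment in enumerate(mappings.split(";")):
        if i > 0:
            # Corresponds to the steps between |Segment| and |SegmentList|.
            line += 1
            column = 0
        if not segment:
            continue
        for mapping in segment.split(","):
            fields = decode_vlqs(mapping)
            column += fields[0]          # generated column is relative
            positions.append((line, column))
    return positions
```

With the increment placed before decoding each segment instead, every line number in the result would be one too high, which is the off-by-one pointed out in the review.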

spec.emu

Co-authored-by: Justin Ridgewell <[email protected]>

Editorial: Use grammar for VLQ and Mappings and decode via SDO

aeeea95

nicolo-ribaudo self-requested a review March 10, 2025 12:40

nicolo-ribaudo reviewed Mar 10, 2025

Simplify VLQ decoding as proposed by Nicolo

578433f

nicolo-ribaudo reviewed Mar 12, 2025

nicolo-ribaudo reviewed Mar 12, 2025

nicolo-ribaudo reviewed Mar 12, 2025

Add VLQ bounds checks

908f803

nicolo-ribaudo reviewed Mar 12, 2025

szuend and others added 2 commits March 12, 2025 13:02

Apply suggestions from code review

d779ecf

Co-authored-by: Nicolò Ribaudo <[email protected]>

Simplify mappings grammar

e79f562

szuend added 2 commits March 12, 2025 13:26

Merge remote-tracking branch 'origin/main' into mappings-grammar

cc88c19

Merge branch 'main' into mappings-grammar

f693f32

nicolo-ribaudo mentioned this pull request Mar 12, 2025

Spec bug: What happens with a g,C mapping? #181

Open

szuend mentioned this pull request Mar 13, 2025

Editorial: add a 'decode a list of base64 VLQ' algorithm #130

Closed

takikawa reviewed Mar 13, 2025

takikawa reviewed Mar 13, 2025

takikawa reviewed Mar 13, 2025

jridgewell approved these changes Mar 14, 2025

szuend and others added 3 commits March 14, 2025 09:25

Add missing dot.

37cd337

Co-authored-by: Justin Ridgewell <[email protected]>

Fix typos found bye reviewers

1ac62b4

Set invalid mapping fields explicitly to *null*

7e531e0

Fix unsigned value truncation to 32 bits.

7ef5a57

Co-authored-by: Justin Ridgewell <[email protected]>
