Skip to content
This repository has been archived by the owner on Aug 31, 2023. It is now read-only.

LosslessTreeSinker attaching trivia to tokens #1801

Merged
merged 21 commits into from
Nov 22, 2021

Conversation

xunilrj
Copy link
Contributor

@xunilrj xunilrj commented Nov 18, 2021

part of: #1720
Improved changes of #1738

Tasks

  • LosslessTreeSink
  • LossyTreeSink

I also needed to make some formatter changes.

  • Trim leading and trailing trivia from SynyaxNode with error. See if_stmt_err.js

@cloudflare-workers-and-pages
Copy link

cloudflare-workers-and-pages bot commented Nov 18, 2021

Deploying with  Cloudflare Pages  Cloudflare Pages

Latest commit: f7406ca
Status: ✅  Deploy successful!
Preview URL: https://7fe15397.tools-8rn.pages.dev

View logs

@github-actions
Copy link

Test262 comparison coverage results on ubuntu-latest

Test result main count This PR count Difference
Total 17608 17608 0
Passed 16771 16771 0
Failed 836 836 0
Panics 1 1 0
Coverage 95.25% 95.25% 0.00%

@github-actions
Copy link

Test262 comparison coverage results on windows-latest

Test result main count This PR count Difference
Total 17608 17608 0
Passed 16771 16771 0
Failed 836 836 0
Panics 1 1 0
Coverage 95.25% 95.25% 0.00%

@xunilrj xunilrj mentioned this pull request Nov 18, 2021
4 tasks
@xunilrj xunilrj requested a review from ematipico November 18, 2021 14:32
@MichaReiser
Copy link
Contributor

Can we delete these now?

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Whitespace {
pub(crate) syntax: SyntaxToken,
}
impl std::fmt::Display for Whitespace {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
std::fmt::Display::fmt(&self.syntax, f)
}
}
impl AstToken for Whitespace {
fn can_cast(kind: SyntaxKind) -> bool { kind == WHITESPACE }
fn cast(syntax: SyntaxToken) -> Option<Self> {
if Self::can_cast(syntax.kind()) {
Some(Self { syntax })
} else {
None
}
}
fn syntax(&self) -> &SyntaxToken { &self.syntax }
}
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Comment {
pub(crate) syntax: SyntaxToken,
}
impl std::fmt::Display for Comment {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
std::fmt::Display::fmt(&self.syntax, f)
}
}
impl AstToken for Comment {
fn can_cast(kind: SyntaxKind) -> bool { kind == COMMENT }
fn cast(syntax: SyntaxToken) -> Option<Self> {
if Self::can_cast(syntax.kind()) {
Some(Self { syntax })
} else {
None
}
}
fn syntax(&self) -> &SyntaxToken { &self.syntax }
}

@xunilrj
Copy link
Contributor Author

xunilrj commented Nov 18, 2021

Can we delete these now?

I think we can. I will create another PR to clean everything we don't need anymore.

Copy link
Contributor

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really love this change, excellent work!

There are two last things that I think we should improve before moving forward.

Debug Format:

I find it a bit misleading. It looks as if an identifier was parsed as foo or keywords/punctuations as \n!.

Would it make sense to format the trivia more explicitly, for example

[email protected] text: (trivia) (text="foo") (trivia whitespace: " ")

Swift actually prints tokens more like nodes since they now can contain rather bulky content like comments

[email protected] 
  (trivia newline "\n")
  (trivia whitespace "    ")
  (text: "foo")
  (trivia whitespace: " ")

Printing the trivia pieces one by one has the benefit that changes to the trivia parsing are tested as well. In that spirit. Would you mind adding an integration test for some trivia (single line comment, mutliline comment etc.)

EOF

I believe we shoulnd't append the EOF token conditionally, only when needed. Authors would see, oh, there's an EOF token I can test for and then bum, it isn't there in some cases.

We should also make it clear where EOF tokens can appear in our grammar. I think it's sufficient to add it to the JsRoot rule for now. We can figure out the parse_text use case later (probably best to introduce a "fake" node for parsed text.

Comment on lines 152 to 154
if self.needs_eof && !self.next_token_leading_trivia.1.is_empty() {
self.do_token(SyntaxKind::EOF, 0.into());
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I understand this correctly that the sink will insert the EOF only if there was no EOF event coming from the parser? So all files will always have an EOF token?

We need to make sure to update our grammar to reflect this (another reason why I believe it shouldn't be conditional). Can you add the EOF to the JsRoot for now. We need to think through what that means when we parse expressions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am forcing a EOF at the tree sink for now. This way is easier and we never loose the last trivia of the file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This format is a bit misleading. It reads as if the identifier name is foo (with trailing space). Would it make sense to format the trivia more explicitly, for example

[email protected] text: (trivia) (text="foo") (trivia whitespace: " ")

Printing the trivia pieces one by one has the benefit that changes to the trivia parsing are tested as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about now?

[email protected]
        [email protected] "let" [Comments("/**/")] [Whitespace(" ")]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works for me. It might be helpful to be a bit more descriptive, especially because the leading/text are now out of order

        [email protected] "let" (leading: COMMENTS("/**/")) (trailing: WHITESPACE(" "))

I would be OK not printing the trivia if it's empty.

Anyhow, that's more my personal taste now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we see the EOF token as part of the test files?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...and now you can see.

@xunilrj xunilrj force-pushed the feature/treesinkers-using-attached-trivia branch from ff20f0a to 6fecb78 Compare November 20, 2021 17:34
Copy link
Contributor

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Can we remove the hidden from EOF.

#[doc(hidden)]
EOF,

As this is now a public visible token.

@xunilrj xunilrj merged commit 4045fa9 into main Nov 22, 2021
@xunilrj xunilrj deleted the feature/treesinkers-using-attached-trivia branch November 22, 2021 09:22
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants