parse expressions outside of functions #20
Sorry for my delayed response! Generally speaking, I'd like to follow the example of the first-party tree-sitter grammars, as they're probably the best "documentation" for tree-sitter grammars. As far as those go, Ruby, JavaScript, and Elixir all actually allow top-level expressions. I think I'll have to look at Rust and/or Go to see if either A) top level expressions are allowed in the language (I don't think so?) and B) if the tree-sitter grammar allows top-level expressions. This is on my to-do list 😅
At some point I thought it was a good idea to test the error cases to ensure that the errors made sense or were helpful. Unfortunately tree-sitter gives us very little control over this, and it's probably a bad idea after all. All this to say, that test (or tests) should be fine to remove 👍
I did some looking around and it looks like
For Rust, it looks like tree-sitter-rust is permissive about top-level expressions where the Rust language is not. E.g.

    let x = 2 + 3;
    fn main() {
        println!("Hello, world!");
    }

gives an error but is parsed successfully by tree-sitter-rust...
I don't have a Go compiler toolchain set up, though, so I'm not sure about that one 😅
This change allows the parser to return valid nodes for expressions on the "top-level" of a document. Here "top-level" is read as "not within a function." This is actually invalid Gleam code: for example, you cannot write a `case/2` statement outside of a function body.

This is desirable for the tree-sitter parser, though, because the parser will end up being used in flexible situations, such as one-off highlights in fenced markdown blocks, e.g.:

```gleam
<<code:int-size(8)-unit(2), reason:utf8>>
```

Which is a common usage in an editor, or on GitHub.
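As a rough illustration of what "allowing top-level expressions" means at the grammar level, here is a minimal sketch. The rule names (`_statement`, `_expression`) are hypothetical and are not copied from this repository's `grammar.js`; the `repeat`, `choice`, and `$` definitions are small stand-ins for the tree-sitter DSL so that the snippet runs under plain Node.

```javascript
// Stand-ins for the tree-sitter grammar DSL helpers (assumption: simplified
// shapes; the real helpers are provided by the tree-sitter CLI).
const repeat = (content) => ({ type: "REPEAT", content });
const choice = (...members) => ({ type: "CHOICE", members });

// Hypothetical rule names, for illustration only.
const rules = {
  // Stricter variant: only module-level statements may appear at the top level.
  //   source_file: ($) => repeat($._statement),

  // Permissive variant: expressions are also accepted directly under source_file.
  source_file: ($) => repeat(choice($._statement, $._expression)),
};

// `$` turns any property access into a named rule reference.
const $ = new Proxy({}, { get: (_, name) => ({ type: "SYMBOL", name }) });

const sourceFile = rules.source_file($);
console.log(JSON.stringify(sourceFile, null, 2));
```

In a real grammar, a change like this can introduce parse conflicts between statements and expressions, so it may need more than a one-line edit.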
This looks good to me! 🎉 I'm leaving this open for the moment in case you just want to remove those error tests which are likely not useful 😁
test/corpus/imports.txt
Outdated
module: (module)
(ERROR)
imports: (unqualified_imports
  (unqualified_import
    name: (identifier))
  (unqualified_import
    name: (identifier)))))
We should probably just remove this test 😁
test/corpus/imports.txt
Outdated
(ERROR)
(record
  name: (type_identifier))
This one also 😅
I wish there were a more stable way to test that bad syntax produces errors. I know
This PR is actually more of a question than a solution 😅
Parsing expressions in the top-level (under source_file, outside of functions where they would normally be parsed) is advantageous when injecting Gleam, as you might do with markdown. This is pertinent to #14 because as far as I know, GitHub uses the tree-sitter parser for markdown's fenced code blocks. For example in #19, I have a code block
Which is not a valid Gleam program but is a valid Gleam "snippet", so to speak.
What would you think about allowing expressions in the top-level like this?
(I didn't really look at those changed import tests yet, looks like some error nodes made their way into the tests and this PR ends up changing the error behavior.)