Add @injection.shebang for setting the language in injections.scm #26939
Comments
I think that's way too niche. What would be the benefit of having this as a general "magic capture" rather than an explicit, language-specific directive (like we already support)?
It's strictly for convenience and reusability for any file types with embedded code. Parsing the shebang to extract the right information can be tricky, and it doesn't make much sense to repeat its logic and internal grammar across a variety of file formats. Conversely, it is easy to find a simple But yes, it is just a convenience. Maybe it would be better to have a regex
This just shifts the burden from the language maintainers to the Neovim devs without any net reduction or performance gain. Also note that there could be variations among file formats, so the generality is dubious. (For example, TeX has a very different "shebang".)
We already have that:
Fair enough, thanks for the feedback
Perhaps nvim-treesitter/nvim-treesitter#3944 will help with this at some point.
In lieu of this, does anyone have an idea for how to reliably extract the executable from a shebang in TS? I have tried various variants of this slightly messy grammar:

shebang: ($) =>
  prec.left(
    seq(
      choice($.shebang_executable, field("unrecognized_shebang", /#!.*/)),
      optional($._newline),
    ),
  ),
shebang_executable: ($) =>
  token.immediate(seq(
    "#!",
    /\S*[/ ]/,
    field("cmd", /\S+/),
    /.*/,
  )),

Which is an attempt to extract the capture group (shebang_executable cmd: _) since TS complains that My fallback is to generate two different
Problem
Parsing shebangs in tree-sitter is possible, but extremely difficult to get correct without a fairly complex external scanner. Being able to do this is useful for nested languages if a direct injection specifier isn't available.
Expected behavior
Helix provides an @injection.shebang capture: the editor parses the captured shebang and extracts the language from it. Adding this to treesitter would be great!
Docs: https://docs.helix-editor.com/guides/injection.html#capture-types and the related discussion in helix-editor/helix#3970
Based on a quick search, they use this for Nix, Markdown (as a fallback), and typst.
One option is adding a field to parsers.lua for common shebanged languages, but a simple fallback that extracts #!\S*bin\S*[/ ](?P<lang>[^-'"]\S*) would probably work for most cases.
I originally opened this in nvim-treesitter but I learned that this is the correct repo.
I also opened an issue for upstream tree-sitter: tree-sitter/tree-sitter#2851
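To make the fallback pattern above easier to evaluate, here is a minimal sketch that applies it to a few shebang lines. It is written in Python only because the pattern uses the `(?P<lang>...)` named-group syntax that Python's `re` module accepts as written. The function name `shebang_language` and the sample lines are hypothetical, and this is not how Helix or Neovim implement anything.

```python
import re
from typing import Optional

# Fallback pattern quoted verbatim from the proposal above.
SHEBANG_RE = re.compile(r"""#!\S*bin\S*[/ ](?P<lang>[^-'"]\S*)""")


def shebang_language(first_line: str) -> Optional[str]:
    """Return the `lang` group for a shebang line, or None if it does not match.

    Hypothetical helper for illustration; the result is the raw executable
    name and would still need mapping to a parser name (e.g. python3 -> python).
    """
    match = SHEBANG_RE.match(first_line)
    return match.group("lang") if match else None


if __name__ == "__main__":
    samples = [
        "#!/bin/bash",
        "#!/usr/bin/env python3",
        "#!/usr/local/bin/node",
        "#!/usr/bin/env -S deno run",
        "echo 'no shebang here'",
    ]
    for line in samples:
        print(f"{line!r} -> {shebang_language(line)!r}")
```

With this pattern, `#!/usr/bin/env python3` yields `python3`, while `#!/usr/bin/env -S deno run` yields `env`, because the `-S` flag is rejected by `[^-'"]` and the match falls back to the path component after `bin/`. That suggests the simple fallback covers the common forms but not `env` invocations with flags.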