Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update code to work with new embed syntax #124

Merged
merged 18 commits into from
Dec 9, 2017
Merged
Show file tree
Hide file tree
Changes from 11 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@ serde_json = "1.0"
rayon = "0.8.0"
regex = "0.2.1"
getopts = "0.2"
pretty_assertions = "0.4.0"
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I approve 👍


[features]

Expand Down
Binary file modified assets/default_newlines.packdump
Binary file not shown.
Binary file modified assets/default_nonewlines.packdump
Binary file not shown.
3 changes: 3 additions & 0 deletions src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,9 @@ extern crate serde;
#[macro_use]
extern crate serde_derive;
extern crate serde_json;
#[cfg(test)]
#[macro_use]
extern crate pretty_assertions;
pub mod highlighting;
pub mod parsing;
pub mod util;
Expand Down
46 changes: 44 additions & 2 deletions src/parsing/parser.rs
Original file line number Diff line number Diff line change
Expand Up @@ -451,8 +451,26 @@ impl ParseState {
MatchOperation::None => return false,
};
for (i, r) in ctx_refs.iter().enumerate() {
let proto = if i == 0 {
pat.with_prototype.clone()
let proto = if i == ctx_refs.len() - 1 { // https://forum.sublimetext.com/t/dev-build-3111/19240/17
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have time to fully review this yet (I will soon), but I just want to mention that this big deeply nested block looks scary. The link to the forum post is definitely helpful, but if there's some way you could make this less scary that would be great, or at least comment it more heavily. Maybe explain how it is implementing the behaviour described in that post in terms of syntect's data structures.

if pat.with_prototype.is_some() {
let p = pat.with_prototype.clone().unwrap();
{
let mut b = p.borrow_mut();
for pattern in b.patterns.iter_mut() {
match pattern {
&mut Pattern::Match(ref mut match_pat) => {
if match_pat.has_captures {
match_pat.regex = Some(match_pat.compile_with_refs(regions, line));
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this doesn't work properly if there's recursion that allows the same with_context to be pushed twice on the stack, when the top one is popped it will still have the old captures compiled into it. This may never happen in practice with a with_prototype that has captures but it's possible in principle.

I'm also looking at this code again and thinking that I might have really screwed up the implementation of with_prototype and that I can redo it completely to be cleaner.

Don't try and fix this yet, I want to take a crack at simplifying a bunch of the related code, which may just outdate whatever fix you try.

}
},
&mut Pattern::Include(_) => {}, // TODO: included contexts can also contain backrefs in their match patterns - see https://github.com/trishume/syntect/issues/104
}
}
}
Some(p)
} else {
None
}
} else {
None
};
Expand Down Expand Up @@ -719,4 +737,28 @@ contexts:
debug_print_ops(line, &ops);
ops
}

#[test]
fn can_parse_issue120() {
let ps = SyntaxSet::load_from_folder("testdata").unwrap();
let syntax = ps.find_syntax_by_name("Embed_Escape Used by tests in src/parsing/parser.rs").unwrap();

let line1 = "\"abctest\" foobar";
let expect1 = [
"<meta.attribute-with-value.style.html>, <string.quoted.double>, <punctuation.definition.string.begin.html>",
"<meta.attribute-with-value.style.html>, <source.css>",
"<meta.attribute-with-value.style.html>, <string.quoted.double>, <punctuation.definition.string.end.html>",
"<meta.attribute-with-value.style.html>, <source.css>, <test.embedded>",
"<top-level.test>",
];
expect_scope_stacks_with_syntax(&line1, &expect1, syntax.to_owned());

let line2 = ">abctest</style>foobar";
let expect2 = [
"<meta.tag.style.begin.html>, <punctuation.definition.tag.end.html>",
"<source.css.embedded.html>, <test.embedded>",
"<top-level.test>",
];
expect_scope_stacks_with_syntax(&line2, &expect2, syntax.to_owned());
}
}
14 changes: 14 additions & 0 deletions src/parsing/syntax_definition.rs
Original file line number Diff line number Diff line change
Expand Up @@ -57,6 +57,20 @@ pub struct Context {
pub patterns: Vec<Pattern>,
}

impl Context {
pub fn new(meta_include_prototype: bool) -> Context {
Context {
meta_scope: Vec::new(),
meta_content_scope: Vec::new(),
meta_include_prototype: meta_include_prototype,
clear_scopes: None,
uses_backrefs: false,
patterns: Vec::new(),
prototype: None,
}
}
}

#[derive(Debug, Eq, PartialEq, Serialize, Deserialize)]
pub enum Pattern {
Match(MatchPattern),
Expand Down
1 change: 0 additions & 1 deletion src/parsing/syntax_set.rs
Original file line number Diff line number Diff line change
Expand Up @@ -95,7 +95,6 @@ impl SyntaxSet {
for entry in WalkDir::new(folder).sort_by(|a, b| a.cmp(b)) {
let entry = entry.map_err(LoadingError::WalkDir)?;
if entry.path().extension().map_or(false, |e| e == "sublime-syntax") {
// println!("{}", entry.path().display());
let syntax = load_syntax_file(entry.path(), lines_include_newline)?;
if let Some(path_str) = entry.path().to_str() {
self.path_syntaxes.push((path_str.to_string(), self.syntaxes.len()));
Expand Down
178 changes: 165 additions & 13 deletions src/parsing/yaml_load.rs
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,7 @@ use std::ops::DerefMut;
pub enum ParseSyntaxError {
/// Invalid YAML file syntax, or at least something yaml_rust can't handle
InvalidYaml(ScanError),
/// The file must contain at least on YAML document
/// The file must contain at least one YAML document
EmptyFile,
/// Some keys are required for something to be a valid `.sublime-syntax`
MissingMandatoryKey(&'static str),
Expand Down Expand Up @@ -152,15 +152,8 @@ impl SyntaxDefinition {
state: &mut ParserState,
is_prototype: bool)
-> Result<ContextPtr, ParseSyntaxError> {
let mut context = Context {
meta_scope: Vec::new(),
meta_content_scope: Vec::new(),
meta_include_prototype: !is_prototype,
clear_scopes: None,
uses_backrefs: false,
patterns: Vec::new(),
prototype: None,
};
let mut context = Context::new(!is_prototype);

for y in vec.iter() {
let map = y.as_hash().ok_or(ParseSyntaxError::TypeMismatch)?;

Expand Down Expand Up @@ -265,12 +258,13 @@ impl SyntaxDefinition {
.map(|s| str_to_scopes(s, state.scope_repo))
.unwrap_or_else(|| Ok(vec![]))?;


let captures = if let Ok(map) = get_key(map, "captures", |x| x.as_hash()) {
let mut res_map = Vec::new();
for (key, value) in map.iter() {
if let (Some(key_int), Some(val_str)) = (key.as_i64(), value.as_str()) {
res_map.push((key_int as usize,
str_to_scopes(val_str, state.scope_repo)?));
str_to_scopes(val_str, state.scope_repo)?));
}
}
Some(res_map)
Expand All @@ -287,13 +281,59 @@ impl SyntaxDefinition {
MatchOperation::Push(SyntaxDefinition::parse_pushargs(y, state)?)
} else if let Ok(y) = get_key(map, "set", Some) {
MatchOperation::Set(SyntaxDefinition::parse_pushargs(y, state)?)
} else if let Ok(y) = get_key(map, "embed", Some) {
let mut embed_context = vec!();
// Same as push so we translate it to what it would be
let mut commands = Hash::new();
if let Ok(s) = get_key(map, "embed_scope", Some) {
commands.insert(Yaml::String("meta_content_scope".to_string()), s.clone());
embed_context.push(Yaml::Hash(commands));
commands = Hash::new();
}
commands.insert(Yaml::String("include".to_string()), y.clone());
embed_context.push(Yaml::Hash(commands));
let embedded_context = SyntaxDefinition::parse_context(
&embed_context,
state,
false
)?;

if let Ok(v) = get_key(map, "escape", Some) {
let mut match_map = Hash::new();
match_map.insert(Yaml::String("match".to_string()), v.clone());
match_map.insert(Yaml::String("pop".to_string()), Yaml::Boolean(true));
if let Ok(y) = get_key(map, "escape_captures", Some) {
match_map.insert(Yaml::String("captures".to_string()), y.clone());
}
let escape_context = SyntaxDefinition::parse_context(
&[Yaml::Hash(match_map)],
state,
false
)?;
MatchOperation::Push(vec![ContextReference::Inline(escape_context), ContextReference::Inline(embedded_context)])
} else {
return Err(ParseSyntaxError::MissingMandatoryKey("escape"));
}

} else {
MatchOperation::None
};

let with_prototype = if let Ok(v) = get_key(map, "with_prototype", |x| x.as_vec()) {
// should a with_prototype include the prototype? I don't think so.
Some(SyntaxDefinition::parse_context(v, state, true)?)
Some(Self::parse_context(v, state, true)?)
} else if let Ok(v) = get_key(map, "escape", Some) {
let mut context = Context::new(false);
let mut match_map = Hash::new();
match_map.insert(Yaml::String("match".to_string()), Yaml::String(format!("(?={})", v.as_str().unwrap())));
match_map.insert(Yaml::String("pop".to_string()), Yaml::Boolean(true));
let pattern = SyntaxDefinition::parse_match_pattern(&match_map, state)?;
if pattern.has_captures {
context.uses_backrefs = true;
}
context.patterns.push(Pattern::Match(pattern));

Some(Rc::new(RefCell::new(context)))
} else {
None
};
Expand All @@ -307,14 +347,15 @@ impl SyntaxDefinition {
operation: operation,
with_prototype: with_prototype,
};

Ok(pattern)
}

fn parse_pushargs(y: &Yaml,
state: &mut ParserState)
-> Result<Vec<ContextReference>, ParseSyntaxError> {
// check for a push of multiple items
if y.as_vec().map_or(false, |v| !v.is_empty() && v[0].as_str().is_some()) {
if y.as_vec().map_or(false, |v| !v.is_empty() && (v[0].as_str().is_some() || (v[0].as_vec().is_some() && v[0].as_vec().unwrap()[0].as_hash().is_some()))) {
// this works because Result implements FromIterator to handle the errors
y.as_vec()
.unwrap()
Expand Down Expand Up @@ -621,6 +662,117 @@ mod tests {
}
}

#[test]
fn can_parse_embed_as_with_prototypes() {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like this test. Very clearly shows what the rewrite does.

let old_def = SyntaxDefinition::load_from_str(r#"
name: C
scope: source.c
file_extensions: [c, h]
variables:
ident: '[QY]+'
contexts:
main:
- match: '(>)\s*'
captures:
1: meta.tag.style.begin.html punctuation.definition.tag.end.html
push:
- [{ match: '(?i)(?=</style)', pop: true }]
- [{ meta_content_scope: 'source.css.embedded.html'}, { include: 'scope:source.css' }]
with_prototype:
- match: (?=(?i)(?=</style))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The double-lookahead does exactly the same as the single lookahead right? It's just that the escape regex might not be a lookahead and in order to emulate it correctly we have to ensure this one is a lookahead?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct 👍

pop: true
"#,false).unwrap();

let def_with_embed = SyntaxDefinition::load_from_str(r#"
name: C
scope: source.c
file_extensions: [c, h]
variables:
ident: '[QY]+'
contexts:
main:
- match: '(>)\s*'
captures:
1: meta.tag.style.begin.html punctuation.definition.tag.end.html
embed: scope:source.css
embed_scope: source.css.embedded.html
escape: (?i)(?=</style)
"#,false).unwrap();

assert_eq!(old_def.contexts["main"], def_with_embed.contexts["main"]);
}

#[test]
fn errors_on_embed_without_escape() {
let def = SyntaxDefinition::load_from_str(r#"
name: C
scope: source.c
file_extensions: [c, h]
variables:
ident: '[QY]+'
contexts:
main:
- match: '(>)\s*'
captures:
1: meta.tag.style.begin.html punctuation.definition.tag.end.html
embed: scope:source.css
embed_scope: source.css.embedded.html
"#,false);
assert!(def.is_err());
match def.unwrap_err() {
ParseSyntaxError::MissingMandatoryKey(key) => assert_eq!(key, "escape"),
_ => assert!(false, "Got unexpected ParseSyntaxError"),
}
}

#[test]
fn can_parse_ugly_yaml() {
let defn: SyntaxDefinition =
SyntaxDefinition::load_from_str("
name: LaTeX
scope: text.tex.latex
contexts:
main:
- match: '((\\\\)(?:framebox|makebox))\\b'
captures:
1: support.function.box.latex
2: punctuation.definition.backslash.latex
push:
- [{meta_scope: meta.function.box.latex}, {match: '', pop: true}]
- argument
- optional-arguments
argument:
- match: '\\{'
scope: punctuation.definition.group.brace.begin.latex
- match: '(?=\\S)'
pop: true
optional-arguments:
- match: '(?=\\S)'
pop: true
",
false)
.unwrap();
assert_eq!(defn.name, "LaTeX");
let top_level_scope = Scope::new("text.tex.latex").unwrap();
assert_eq!(defn.scope, top_level_scope);

let first_pattern: &Pattern = &defn.contexts["main"].borrow().patterns[0];
match first_pattern {
&Pattern::Match(ref match_pat) => {
let m: &CaptureMapping = match_pat.captures.as_ref().expect("test failed");
assert_eq!(&m[0], &(1,vec![Scope::new("support.function.box.latex").unwrap()]));

//use parsing::syntax_definition::ContextReference::*;
// TODO: check the first pushed reference is Inline(...) and has a meta_scope of meta.function.box.latex
// TODO: check the second pushed reference is Named("argument".to_owned())
// TODO: check the third pushed reference is Named("optional-arguments".to_owned())

assert!(match_pat.with_prototype.is_none());
}
_ => assert!(false),
}
}

#[test]
fn can_rewrite_regex() {
fn rewrite(s: &str) -> String {
Expand Down
2 changes: 1 addition & 1 deletion testdata/Packages
Submodule Packages updated 96 files
+17 −5 ASP/ASP.sublime-syntax
+36 −33 ASP/syntax_test_asp.asp
+14 −2 Batch File/Batch File.sublime-syntax
+10 −4 Batch File/syntax_test_batch_file.bat
+84 −49 C#/C#.sublime-syntax
+264 −0 C#/tests/syntax_test_C#7.cs
+31 −1 C#/tests/syntax_test_GeneralStructure.cs
+1 −1 C#/tests/syntax_test_Generics.cs
+42 −6 C#/tests/syntax_test_Using.cs
+38 −0 C++/C Standard Includes.sublime-completions
+218 −0 C++/C++ Standard Includes.sublime-completions
+126 −38 C++/C++.sublime-syntax
+33 −5 C++/C.sublime-syntax
+41 −0 C++/Default.sublime-keymap
+52 −1 C++/syntax_test_c.c
+245 −5 C++/syntax_test_cpp.cpp
+2 −2 CSS/CSS.sublime-syntax
+5 −1 CSS/syntax_test_css.css
+172 −209 Erlang/Erlang.sublime-syntax
+72 −0 Erlang/syntax_test_erlang.erl
+8 −2 Go/Go.sublime-syntax
+7 −4 Go/syntax_test_go.go
+33 −0 Graphviz/Comments.tmPreferences
+75 −8 Graphviz/DOT.sublime-syntax
+23 −0 Graphviz/Indentation Rules.tmPreferences
+18 −0 Graphviz/Symbol List.tmPreferences
+44 −1 Graphviz/syntax_test_dot.dot
+5 −5 Groovy/Groovy.sublime-syntax
+49 −0 Groovy/syntax_test_groovy.groovy
+32 −73 HTML/HTML.sublime-syntax
+2 −8 HTML/Miscellaneous.tmPreferences
+2 −2 HTML/html_completions.py
+4 −1 HTML/syntax_test_html.html
+14 −23 Java/Java Server Pages (JSP).sublime-syntax
+224 −0 Java/Java.sublime-completions
+472 −382 Java/Java.sublime-syntax
+17 −0 Java/Symbol List%3A Constants.tmPreferences
+1,028 −76 Java/syntax_test_java.java
+2 −2 Java/syntax_test_jsp.jsp
+1 −0 JavaScript/JSON.sublime-syntax
+729 −688 JavaScript/JavaScript.sublime-syntax
+119 −54 JavaScript/syntax_test_js.js
+283 −511 LaTeX/LaTeX.sublime-syntax
+12 −0 LaTeX/syntax_test_latex.tex
+1 −0 Lisp/Lisp.sublime-syntax
+19 −0 Makefile/Comments.tmPreferences
+1 −1 Makefile/Makefile.sublime-settings
+590 −144 Makefile/Makefile.sublime-syntax
+4 −13 Makefile/Miscellaneous.tmPreferences
+830 −32 Makefile/syntax_test_makefile.mak
+110 −16 Markdown/Markdown.sublime-syntax
+18 −0 Markdown/Symbol List - Reference Link.tmPreferences
+367 −76 Markdown/syntax_test_markdown.md
+123 −101 Matlab/Matlab.sublime-syntax
+5 −2 Matlab/Miscellaneous.tmPreferences
+114 −15 Matlab/syntax_test_matlab.m
+41 −0 Objective-C/Default.sublime-keymap
+153 −56 Objective-C/Objective-C++.sublime-syntax
+57 −18 Objective-C/Objective-C.sublime-syntax
+255 −10 Objective-C/syntax_test_objc++.mm
+58 −0 Objective-C/syntax_test_objc.m
+16 −0 PHP/Indentation Rules - heredoc end.tmPreferences
+23 −5 PHP/Indentation Rules.tmPreferences
+126 −43 PHP/PHP Source.sublime-syntax
+1 −1 PHP/PHP.sublime-completions
+6 −2 PHP/PHP.sublime-syntax
+98 −7 PHP/syntax_test_php.php
+183 −116 Python/Python.sublime-syntax
+1 −1 Python/Snippets/New-Property.sublime-snippet
+141 −3 Python/syntax_test_python.py
+18 −6 Python/syntax_test_python_strings.py
+1 −0 R/R.sublime-syntax
+7 −0 R/syntax_test_r.R
+1 −1 RestructuredText/reStructuredText.sublime-syntax
+62 −0 RestructuredText/syntax_test_restructuredtext.rst
+1 −1 Ruby/Miscellaneous.tmPreferences
+2 −2 Ruby/Ruby.sublime-syntax
+11 −0 Ruby/syntax_test_ruby.rb
+57 −26 Scala/Scala.sublime-syntax
+91 −18 Scala/syntax_test_scala.scala
+1,347 −0 ShellScript/Bash.sublime-syntax
+10 −2 ShellScript/Indentation.tmPreferences
+15 −0 ShellScript/Makefile
+3 −562 ShellScript/Shell-Unix-Generic.sublime-syntax
+44 −0 ShellScript/ShellScript.py
+4 −0 ShellScript/ShellScript.sublime-build
+1 −1 ShellScript/Snippets/#!-usr-bin-env-(!env).sublime-snippet
+18 −0 ShellScript/Symbol List - Aliases.tmPreferences
+18 −0 ShellScript/Symbol List - Expansions.tmPreferences
+18 −0 ShellScript/Symbol List - Functions.tmPreferences
+18 −0 ShellScript/Symbol List - Variables.tmPreferences
+1,223 −0 ShellScript/commands-builtin-shell-bash.sublime-syntax
+155 −0 ShellScript/commands-builtin-shell-bash.yml
+2,112 −0 ShellScript/test/syntax_test_bash.sh
+150 −0 ShellScript/tools/update-commands.py
+1 −1 XML/XML.sublime-settings
33 changes: 33 additions & 0 deletions testdata/embed_escape_test.sublime-syntax
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
%YAML 1.2
---
name: Embed_Escape Used by tests in src/parsing/parser.rs
scope: source.embed-test
contexts:
main:
- match: (")
scope: meta.attribute-with-value.style.html string.quoted.double punctuation.definition.string.begin.html
embed: embedded_context
embed_scope: meta.attribute-with-value.style.html source.css
escape: '\1'
escape_captures:
0: meta.attribute-with-value.style.html string.quoted.double punctuation.definition.string.end.html
- match: '(>)\s*'
captures:
1: meta.tag.style.begin.html punctuation.definition.tag.end.html
embed: embedded_context
embed_scope: source.css.embedded.html
escape: (?i)(?=</style)
- match: '</style>'
- match: foobar
scope: top-level.test

embedded_context:
- match: a
scope: a
push: # prove that multiple context levels can be "escape"d
- match: b
push:
- match: c
push:
- match: 'test'
scope: test.embedded
2 changes: 1 addition & 1 deletion testdata/minimized_tests/syntax_test_aspmini.asp
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
<%
Class TestClass2 Public Sub TestSub () Response.Write("wow") End Sub End Class
'^^^^^ meta.class.asp meta.class.identifier.asp storage.type.asp
' ^ meta.class.asp meta.class.body.asp meta.class.asp meta.class.identifier.asp
' ^ meta.class.asp meta.class.identifier.asp
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought I copied this from the Sublime syntax test, did the Sublime one change?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

' ^ meta.class.asp meta.class.body.asp
%>
<p>foobar</p>
Expand Down