-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[red-knot] Hand-written MDTest parser #15926
Conversation
crates/red_knot_test/src/parser.rs
Outdated
if code.ends_with('\n') { | ||
code = &code[..code.len() - 1]; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is something that we could remove. It seems strange that we trim trailing newlines of code blocks. But I didn't want to change the behavior in this PR.
1aee7ac
to
38321c9
Compare
crates/red_knot_test/src/parser.rs
Outdated
self.cursor.bump(); | ||
|
||
if self.preceding_blank_lines < 1 && self.explicit_path.is_none() { | ||
bail!("Code blocks must start on a new line and be preceded by at least one blank line."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This message could be improved/adjusted according to the logic in the if
condition, but I didn't want to introduce and behavioral changes here.
38321c9
to
987ea75
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great. No more regex :)
|
@@ -92,6 +92,18 @@ impl<'a> Cursor<'a> { | |||
} | |||
} | |||
|
|||
/// Eats the next two characters if they are `c1` and `c2`. Does not | |||
/// consume any input otherwise, even if the first character matches. | |||
pub fn eat_char2(&mut self, c1: char, c2: char) -> bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I wasn't aware that eat_char2
is only available on the ruff_python_parser Cursor
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh. And I didn't notice that eat_char2
was available somewhere. My implementation is slightly different:
pub fn eat_char2(&mut self, c1: char, c2: char) -> bool {
if self.first() == c1 && self.second() == c2 {
self.bump();
self.bump();
true
} else {
false
}
}
vs the existing:
pub(super) fn eat_char2(&mut self, c1: char, c2: char) -> bool {
let mut chars = self.chars.clone();
if chars.next() == Some(c1) && chars.next() == Some(c2) {
self.bump();
self.bump();
true
} else {
false
}
}
Should I do anything about this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine. The second might be slightly faster but then, it's used rarely.
…omment test (#15933) ## Summary I experimented with [not trimming trailing newlines in code snippets](#15926 (comment)), but since came to the conclusion that the current behavior is better because otherwise, there is no way to write snippets without a trailing newline at all. And when you copy the code from a Markdown snippet in GitHub, you also don't get a trailing newline. I was surprised to see some test failures when I played with this though, and decided to make this test independent from this implementation detail.
* main: (66 commits) [red-knot] Use ternary decision diagrams (TDDs) for visibility constraints (#15861) [`pyupgrade`] Rename private type parameters in PEP 695 generics (`UP049`) (#15862) Simplify the `StringFlags` trait (#15944) [`flake8-pyi`] Make `PYI019` autofixable for `.py` files in preview mode as well as stubs (#15889) Docs (`linter.md`): clarify that Python files are always searched for in subdirectories (#15882) [`flake8-pyi`] Make PEP-695 functions with multiple type parameters fixable by PYI019 again (#15938) [red-knot] Use unambiguous invalid-syntax-construct for suppression comment test (#15933) Make `Binding::range()` point to the range of a type parameter's name, not the full type parameter (#15935) Update black deviations (#15928) [red-knot] MDTest: Fix line numbers in error messages (#15932) Preserve triple quotes and prefixes for strings (#15818) [red-knot] Hand-written MDTest parser (#15926) [`pylint`] Fix missing parens in unsafe fix for `unnecessary-dunder-call` (`PLC2801`) (#15762) nit: docs for ignore & select (#15883) [airflow] `BashOperator` has been moved to `airflow.providers.standard.operators.bash.BashOperator` (AIR302) (#15922) [`flake8-logging`] `.exception()` and `exc_info=` outside exception handlers (`LOG004`, `LOG014`) (#15799) [red-knot] Enforce specifying paths for mdtest code blocks in a separate preceding line (#15890) [red-knot] Internal refactoring of visibility constraints API (#15913) [red-knot] Implicit instance attributes (#15811) [`flake8-comprehensions`] Handle extraneous parentheses around list comprehension (`C403`) (#15877) ...
Summary
Replaces our existing Markdown test parser with a fully hand-written parser. I tried to fix this bug using the old approach and kept running into problems. Eventually this seemed like the easier way. It's more code (+50 lines, excluding the new test), but I hope it's relatively straightforward to understand, compared to the complex interplay between the byte-stream-manipulation and regex-parsing that we had before.
I did not really focus on performance, as the parsing time does not dominate the test execution time, but this seems to be slightly faster than what we had before (executing all MD tests; debug):
main
closes #15923
Test Plan
One new regression test.