regular_expression: implement visitor pattern for regex AST #5977

Closed
camchenry opened this issue Sep 22, 2024 · 0 comments · Fixed by #6055
Labels
C-enhancement Category - New feature or request

Comments

@camchenry
Contributor

With the regex parser completed, we have now started using oxc_regular_expression in the linter for parsing patterns instead of handwritten pattern matching and parsing code.

However, it is difficult to traverse the regex AST and visit each node (term/disjunction/alternative/etc.). We should implement the visitor pattern for the regex AST so that code which needs to visit every node is easier to write.

Ideally, we should be able to replace code like this:

visit_terms(pattern, &mut |term| {
    if let Term::Quantifier(_) = term {
        in_quantifier = true;
        return;
    }
    let Term::Character(ch) = term else {
        return;
    };
    if in_quantifier {
        in_quantifier = false;
        return;
    }
    if ch.value != u32::from(b' ') {
        return;
    }
    if let Some(ref mut space_span) = last_space_span {
        // If this is consecutive with the last space, extend it
        if space_span.end == ch.span.start {
            space_span.end = ch.span.end;
        }
        // If it is not consecutive, and the last space is only one space, move it up
        else if space_span.size() == 1 {
            last_space_span.replace(ch.span);
        }
    } else {
        last_space_span = Some(ch.span);
    }
});

/// Calls the given closure on every [`Term`] in the [`Pattern`].
fn visit_terms<'a, F: FnMut(&'a Term<'a>)>(pattern: &'a Pattern, f: &mut F) {
    visit_terms_disjunction(&pattern.body, f);
}
/// Calls the given closure on every [`Term`] in the [`Disjunction`].
fn visit_terms_disjunction<'a, F: FnMut(&'a Term<'a>)>(disjunction: &'a Disjunction, f: &mut F) {
    for alternative in &disjunction.body {
        visit_terms_alternative(alternative, f);
    }
}
/// Calls the given closure on every [`Term`] in the [`Alternative`].
fn visit_terms_alternative<'a, F: FnMut(&'a Term<'a>)>(alternative: &'a Alternative, f: &mut F) {
    for term in &alternative.body {
        match term {
            Term::LookAroundAssertion(lookaround) => {
                f(term);
                visit_terms_disjunction(&lookaround.body, f);
            }
            Term::Quantifier(quant) => {
                f(term);
                f(&quant.body);
            }
            Term::CapturingGroup(group) => {
                f(term);
                visit_terms_disjunction(&group.body, f);
            }
            Term::IgnoreGroup(group) => {
                f(term);
                visit_terms_disjunction(&group.body, f);
            }
            _ => f(term),
        }
    }
}

@camchenry camchenry added the C-enhancement Category - New feature or request label Sep 22, 2024
@Boshen Boshen changed the title implement visitor pattern for regex AST regular_expression: implement visitor pattern for regex AST Sep 23, 2024
@camchenry camchenry self-assigned this Sep 25, 2024
Boshen pushed a commit that referenced this issue Sep 26, 2024
…ST (#6055)

- resolves #5977
- supersedes #5951

To facilitate easier traversal of the Regex AST, this PR defines a `Visit` trait with default implementations that will walk the entirety of the Regex AST. Methods in the `Visit` trait can be overridden with custom implementations to do things like analyzing only certain nodes in a regular expression, which will be useful for regex-related `oxc_linter` rules.

In the future, we should consider automatically generating this code as it is very repetitive, but for now a handwritten visitor is sufficient.
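
To make the idea of a `Visit` trait with default walking implementations concrete, here is a minimal sketch. It is not the code from #6055: the `Pattern`, `Disjunction`, `Alternative`, and `Term` types below are simplified stand-ins defined only for this example (the real `oxc_regular_expression` AST has more node kinds, spans, and lifetimes), and the `walk_*` helpers, `SpaceCounter`, `count_spaces`, and `sample_pattern` are hypothetical names.

```rust
// Simplified stand-in AST for illustration only; these are NOT the real
// oxc_regular_expression types.
enum Term {
    Character(char),
    Quantifier(Box<Term>),
    CapturingGroup(Disjunction),
}
struct Alternative {
    body: Vec<Term>,
}
struct Disjunction {
    body: Vec<Alternative>,
}
struct Pattern {
    body: Disjunction,
}

/// Every method has a default implementation that keeps walking, so an
/// implementor only overrides the node kinds it cares about.
trait Visit {
    fn visit_pattern(&mut self, pattern: &Pattern) {
        walk_pattern(self, pattern);
    }
    fn visit_disjunction(&mut self, disjunction: &Disjunction) {
        walk_disjunction(self, disjunction);
    }
    fn visit_alternative(&mut self, alternative: &Alternative) {
        walk_alternative(self, alternative);
    }
    fn visit_term(&mut self, term: &Term) {
        walk_term(self, term);
    }
    /// Leaf node: nothing to descend into by default.
    fn visit_character(&mut self, _ch: char) {}
}

fn walk_pattern<V: Visit + ?Sized>(v: &mut V, pattern: &Pattern) {
    v.visit_disjunction(&pattern.body);
}
fn walk_disjunction<V: Visit + ?Sized>(v: &mut V, disjunction: &Disjunction) {
    for alternative in &disjunction.body {
        v.visit_alternative(alternative);
    }
}
fn walk_alternative<V: Visit + ?Sized>(v: &mut V, alternative: &Alternative) {
    for term in &alternative.body {
        v.visit_term(term);
    }
}
fn walk_term<V: Visit + ?Sized>(v: &mut V, term: &Term) {
    match term {
        Term::Character(ch) => v.visit_character(*ch),
        // Recurse through visit_term so nested groups are reached too.
        Term::Quantifier(body) => v.visit_term(body),
        Term::CapturingGroup(group) => v.visit_disjunction(group),
    }
}

/// Hypothetical visitor: counts space characters anywhere in the pattern.
struct SpaceCounter {
    spaces: usize,
}
impl Visit for SpaceCounter {
    fn visit_character(&mut self, ch: char) {
        if ch == ' ' {
            self.spaces += 1;
        }
    }
}

fn count_spaces(pattern: &Pattern) -> usize {
    let mut counter = SpaceCounter { spaces: 0 };
    counter.visit_pattern(pattern);
    counter.spaces
}

/// Hand-built AST roughly corresponding to the pattern `a (b )+`.
fn sample_pattern() -> Pattern {
    let group = Disjunction {
        body: vec![Alternative { body: vec![Term::Character('b'), Term::Character(' ')] }],
    };
    let terms = vec![
        Term::Character('a'),
        Term::Character(' '),
        Term::Quantifier(Box::new(Term::CapturingGroup(group))),
    ];
    Pattern { body: Disjunction { body: vec![Alternative { body: terms }] } }
}
```

With this shape, a lint rule only overrides the one method it needs, and the default `walk_*` functions handle the recursion that the closure-based `visit_terms*` helpers had to spell out by hand.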
@Boshen Boshen closed this as completed Sep 27, 2024