Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add range to nodes and fix whitespace issue (fixes #137) #138

Merged
merged 3 commits into from
Jul 13, 2021

Conversation

nonara
Copy link
Collaborator

@nonara nonara commented Jul 12, 2021

Abstract

It turns out that other AST parsers do create a TextNode for all whitespace (see https://astexplorer.net) The only which does not is the parser for angular templates, which it likely does because it's used in a very narrow context.

The issue ended up being that node-html-parser actually does create a TextNode for whitespace, but there were some cases in which that was failing to happen. This caused the behaviour to differ from other parsers and is the root of the issue laid out in #137.

Features / Fixes

Code Changes

  • Cleaned up parser code to make it understandable and slightly more efficient
  • Added comprehensive test for node ranges
  • Added basic whitespace parser test

Version

v4.1.0

Because a feature was added, I recommend bumping minor version.

Related

@@ -1012,88 +1018,109 @@ export function base_parse(data: string, options = { lowerCaseTagName: false, co
return it.test(tag);
});
}
const root = new HTMLElement(null, {}, '', null);
Copy link
Collaborator Author

@nonara nonara Jul 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parser: The logic is the same, albeit cleaned up a bit. I also fixed the whitespace issue, added some comments, and used labels which made it more obvious what's happening. Some efficiency should be gained as well.

writable: true,
configurable: true,
value: range ?? [ -1, -1 ]
});
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We define as non-enumerable, to preserve existing functionality for equality tests between a synthetic node (one created manually) and a parsed node (one created during parse()). It's unlikely that this has a broad impact scope outside of this library's own tests, but it seems a good idea to make it non-enumerable anyway. It also allows us to not have to change any of our existing tests to exclude range property from equality testing.

@@ -0,0 +1,85 @@
const { parse, HTMLElement, TextNode, CommentNode } = require('../dist');
Copy link
Collaborator Author

@nonara nonara Jul 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test parses with htmlparser2 and node-html-parser and compares the output ranges and nodes. Although it serves as a test for range comparison, it will also be helpful in catching a broad range of other issues with the parser that other tests may not cover. This is because it ensures the AST structures are the same for all fundamental node types (blockText, comment, self-closing, etc) parsed by the parser.

@nonara
Copy link
Collaborator Author

nonara commented Jul 13, 2021

@taoqf Ready for review!

@taoqf taoqf merged commit f052ba3 into taoqf:main Jul 13, 2021
@taoqf
Copy link
Owner

taoqf commented Jul 13, 2021

Thank you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants