feat(css_parser,css_formatter): start parsing exact properties and values #1419

Merged
faultyserver merged 1 commit into main from faulty/css-property-parsing on Jan 5, 2024

Conversation

faultyserver
Contributor

@faultyserver faultyserver commented Jan 4, 2024

Summary

Closes #1445. Related to #268.

This PR implements the infrastructure needed to parse properties and their values exactly. With this, we can represent every property defined in CSS using an exact structure that perfectly matches the grammar definition for that property, exposing all of the known information directly in the syntax tree, rather than having to compute rules about value structures elsewhere, like in an analyzer.

The primary complication is that the CSS spec defines explicit grammars for each property, saying what the "normal" valid values are, but then also includes all of the CSS-wide keywords as valid values for a property, so long as they are the only thing in the value. On top of that, we also want to be able to represent "incorrect" values as syntactically correct, rather than just immediately falling back to Bogus. This will allow other tools to trust the syntax of the value and attempt to interpret it more easily with structured values rather than just a stream of bare tokens.

To handle these fallback cases, this implementation uses the new rewind support from #1417: it attempts to parse a value with the property's own grammar and, if that value turns out to be incorrect, rewinds and re-parses it entirely. A value is only "approved" once its definition has been fully consumed, so extraneous content in the value is handled as well. The fallback path also checks for CSS-wide keyword values and, if no other pattern matches, finally falls back to Bogus, meaning there truly must be a syntax error in the value definition.

All of this behavior is abstracted away into a parse_property_value_with_fallbacks function, which lets you provide a parsing function to try to parse the correct value from the input, then handles rolling back and re-parsing if it fails. The result is a really clean implementation for most properties that can just focus on the actual value grammar without needing to handle error cases directly:

pub(crate) fn parse_z_index_property(p: &mut CssParser) -> ParsedSyntax {
    // Assumes the parent has confirmed we're at the `z-index` identifier.
    let m = p.start();
    parse_regular_identifier(p).ok();
    p.expect(T![:]);

    parse_property_value_with_fallbacks(p, |p| {
        parse_css_auto(p).or_else(|| parse_regular_number(p))
    })
    .ok();

    Present(m.complete(p, CSS_Z_INDEX_PROPERTY))
}
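
The fallback order described above can be illustrated with a small, self-contained sketch. This is not Biome's implementation: the `Toy` parser, its `checkpoint`/`rewind` methods, the `Value` enum, and the "well-formed token" rule are all hypothetical stand-ins, tokens are plain whitespace-separated strings, and a value is simply everything up to `;`.

```rust
/// Hypothetical result of parsing one property value (not Biome's node types).
#[derive(Debug, PartialEq)]
enum Value {
    Exact(String),        // matched the property's own grammar
    WideKeyword(String),  // a lone CSS-wide keyword such as `inherit`
    Unknown(Vec<String>), // well-formed tokens that don't fit the grammar
    Bogus,                // nothing matched: a genuine syntax error
}

/// Toy parser over whitespace-separated tokens; a checkpoint is just an index.
struct Toy<'a> {
    tokens: Vec<&'a str>,
    pos: usize,
}

impl<'a> Toy<'a> {
    fn new(src: &'a str) -> Self {
        Toy { tokens: src.split_whitespace().collect(), pos: 0 }
    }
    fn checkpoint(&self) -> usize {
        self.pos
    }
    fn rewind(&mut self, checkpoint: usize) {
        self.pos = checkpoint;
    }
    fn current(&self) -> Option<&'a str> {
        self.tokens.get(self.pos).copied()
    }
    fn bump(&mut self) {
        self.pos += 1;
    }
    fn at_value_end(&self) -> bool {
        matches!(self.current(), None | Some(";"))
    }
}

const WIDE_KEYWORDS: [&str; 5] = ["initial", "inherit", "unset", "revert", "revert-layer"];

fn parse_value_with_fallbacks(
    p: &mut Toy<'_>,
    exact: impl Fn(&mut Toy<'_>) -> Option<String>,
) -> Value {
    let start = p.checkpoint();

    // 1. Try the property's grammar; approve only if the whole value was consumed.
    if let Some(value) = exact(&mut *p) {
        if p.at_value_end() {
            return Value::Exact(value);
        }
    }
    p.rewind(start);

    // 2. A CSS-wide keyword counts only when it is the entire value.
    if let Some(tok) = p.current() {
        if WIDE_KEYWORDS.contains(&tok) {
            p.bump();
            if p.at_value_end() {
                return Value::WideKeyword(tok.to_string());
            }
            p.rewind(start);
        }
    }

    // 3. Re-parse as a generic token list. Toy rule (an assumption): any token
    //    with characters outside this small set makes the whole value bogus.
    let mut items = Vec::new();
    let mut well_formed = true;
    while let Some(tok) = p.current() {
        if tok == ";" {
            break;
        }
        well_formed &= tok.chars().all(|c| c.is_ascii_alphanumeric() || "-+.%".contains(c));
        items.push(tok.to_string());
        p.bump();
    }
    if well_formed && !items.is_empty() {
        Value::Unknown(items)
    } else {
        Value::Bogus
    }
}

/// Toy `z-index` grammar: `auto` or an integer token.
fn z_index_value(p: &mut Toy<'_>) -> Option<String> {
    let tok = p.current()?;
    if tok == "auto" || tok.parse::<i64>().is_ok() {
        p.bump();
        Some(tok.to_string())
    } else {
        None
    }
}

fn parse_z_index(src: &str) -> Value {
    let mut p = Toy::new(src);
    parse_value_with_fallbacks(&mut p, z_index_value)
}
```

In this sketch, `parse_z_index("10 20 ;")` first matches `10` with the exact grammar, notices the value was not fully consumed, rewinds, and ends up with the generic `Unknown` variant instead of an error.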

Additionally, the `parse_any_property` function, which is the entry point for parsing all property declarations, now uses a `match` expression to dispatch to the appropriate property parser based on the name of the property:

pub(crate) fn parse_any_property(p: &mut CssParser) -> ParsedSyntax {
    if !is_at_generic_property(p) {
        return Absent;
    }

    match p.cur_text() {
        "all" => parse_all_property(p),
        "z-index" => parse_z_index_property(p),
        _ => parse_generic_property(p),
    }
}

Using a branch here provides a guarantee that the parser is at the expected identifier name before entering the property parser. This is more efficient than trying to parse each type of property in order, especially since there will be hundreds of possible properties once the parser is complete and checking each one will take a non-negligible amount of time on average for every declaration in the source content. The branch also ensures that we don't accidentally try to parse a value as some other kind of value, since the name is checked before dispatching at all.

I also added a bunch of comments in css.ungram explaining how properties should be defined, both to ensure that the appropriate code is auto-generated and to hopefully let us do the other things that the CSS grammar defines, like embedding one property's grammar within another (e.g., the font shorthand explicitly includes <'font-size'>, where the quotes mean "any valid value for the font-size property except for the global keywords", rather than just a font-size token). There may be some additional work needed to ensure we don't accidentally accept global keywords or fallbacks there, but that will happen when we come to those properties.
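
To make the `<'font-size'>` point concrete, here is a rough sketch of keeping a property's grammar separate from the property-level wrapper that adds the CSS-wide keywords. Everything in it is an assumption for illustration: the function names are hypothetical, the `font-size` grammar is reduced to a few keywords and `px` lengths, and the `font` shorthand is reduced to `<'font-size'> <family-name>`, which is far simpler than the real grammar.

```rust
const CSS_WIDE_KEYWORDS: [&str; 5] = ["initial", "inherit", "unset", "revert", "revert-layer"];

/// Toy stand-in for the `<'font-size'>` grammar: a few keywords or a `px` length.
/// Deliberately does NOT accept CSS-wide keywords.
fn font_size_grammar(token: &str) -> bool {
    matches!(token, "small" | "medium" | "large")
        || token
            .strip_suffix("px")
            .map_or(false, |n| !n.is_empty() && n.chars().all(|c| c.is_ascii_digit()))
}

/// The `font-size` property: the grammar above, or a lone CSS-wide keyword.
fn font_size_property(value: &str) -> bool {
    let value = value.trim();
    CSS_WIDE_KEYWORDS.contains(&value) || font_size_grammar(value)
}

/// Heavily simplified `font` shorthand: `<'font-size'> <family-name>`,
/// or a lone CSS-wide keyword for the whole declaration.
fn font_shorthand(value: &str) -> bool {
    let mut parts = value.split_whitespace();
    match (parts.next(), parts.next(), parts.next()) {
        (Some(keyword), None, None) => CSS_WIDE_KEYWORDS.contains(&keyword),
        // The embedded grammar is reused without the keyword wrapper.
        (Some(size), Some(_family), None) => font_size_grammar(size),
        _ => false,
    }
}
```

With this split, `font_shorthand("inherit serif")` is rejected because the embedded grammar never sees the keyword wrapper, while `font_size_property("inherit")` and `font_shorthand("inherit")` are both accepted.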

I'm hopeful that all of this infrastructure will make it really simple and efficient to implement all of the hundreds of unique properties that CSS defines.

Test Plan

Added a bunch of spec tests to cover all of the variations here: valid properties (both all and z-index), known properties with unknown values, properties with bogus values, custom properties, and unknown properties, all of which should be covered in the snapshots.

I think parser tests should follow a very similar format for all property values: cover the CSS-wide keywords, then a consistent set of "generally incorrect" values that fall back to CssUnknownPropertyValue, and then all of the actually valid types, ordered by how they're defined in the grammar. The result is pretty similar to what MDN shows in its "Examples" sections, and for the most part those examples could just be copy-pasted into tests as a starting place, which is nice.
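
As a rough illustration of that layout, the sketch below assembles a fixture in the three groups described. The helper name, the `div` wrapper, and especially the contents of `GENERALLY_INCORRECT` are assumptions made up for this example; they are not the values used in the PR's snapshots.

```rust
const CSS_WIDE_KEYWORDS: [&str; 5] = ["initial", "inherit", "unset", "revert", "revert-layer"];

/// Hypothetical shared set of "generally incorrect" values.
const GENERALLY_INCORRECT: [&str; 3] = ["10s", "some-ident", "1 2 3"];

/// Builds a CSS test fixture for one property, grouped the same way for every
/// property: CSS-wide keywords, generally incorrect values, then valid values.
fn property_fixture(name: &str, valid_values: &[&str]) -> String {
    let groups = [
        ("CSS-wide keywords", &CSS_WIDE_KEYWORDS[..]),
        ("generally incorrect values", &GENERALLY_INCORRECT[..]),
        ("valid values, in grammar order", valid_values),
    ];

    let mut css = String::from("div {\n");
    for (label, values) in groups {
        css.push_str(&format!("  /* {} */\n", label));
        for value in values {
            css.push_str(&format!("  {}: {};\n", name, value));
        }
    }
    css.push_str("}\n");
    css
}
```

Calling `property_fixture("z-index", &["auto", "0", "10"])` yields a single rule whose declarations appear in the same order for every property, which keeps snapshots easy to compare.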

netlify bot commented Jan 4, 2024

Deploy Preview for biomejs canceled.

| Name | Link |
| --- | --- |
| 🔨 Latest commit | fc63830 |
| 🔍 Latest deploy log | https://app.netlify.com/sites/biomejs/deploys/65987be80881390008a0e339 |

@github-actions github-actions bot added A-Parser Area: parser A-Formatter Area: formatter A-Tooling Area: internal tools L-CSS Language: CSS labels Jan 4, 2024
Contributor

github-actions bot commented Jan 4, 2024

Parser conformance results on

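js/262

| Test result | main count | This PR count | Difference |
| --- | --- | --- | --- |
| Total | 49701 | 49701 | 0 |
| Passed | 48721 | 48721 | 0 |
| Failed | 980 | 980 | 0 |
| Panics | 0 | 0 | 0 |
| Coverage | 98.03% | 98.03% | 0.00% |

jsx/babel

| Test result | main count | This PR count | Difference |
| --- | --- | --- | --- |
| Total | 40 | 40 | 0 |
| Passed | 37 | 37 | 0 |
| Failed | 3 | 3 | 0 |
| Panics | 0 | 0 | 0 |
| Coverage | 92.50% | 92.50% | 0.00% |

symbols/microsoft

| Test result | main count | This PR count | Difference |
| --- | --- | --- | --- |
| Total | 6322 | 6322 | 0 |
| Passed | 2036 | 2036 | 0 |
| Failed | 4286 | 4286 | 0 |
| Panics | 0 | 0 | 0 |
| Coverage | 32.20% | 32.20% | 0.00% |

ts/babel

| Test result | main count | This PR count | Difference |
| --- | --- | --- | --- |
| Total | 662 | 662 | 0 |
| Passed | 592 | 592 | 0 |
| Failed | 70 | 70 | 0 |
| Panics | 0 | 0 | 0 |
| Coverage | 89.43% | 89.43% | 0.00% |

ts/microsoft

| Test result | main count | This PR count | Difference |
| --- | --- | --- | --- |
| Total | 17646 | 17646 | 0 |
| Passed | 13452 | 13452 | 0 |
| Failed | 4192 | 4192 | 0 |
| Panics | 2 | 2 | 0 |
| Coverage | 76.23% | 76.23% | 0.00% |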

    p.expect(T![:]);

    parse_property_value_with_fallbacks(p, |p| {
        parse_css_auto(p).or_else(|| parse_regular_number(p))
Contributor Author

This technically accepts any number right now, but I want to try to keep this PR "small"; we can add specific integer/range-bound handling later on.
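
For reference, the stricter check deferred here could look roughly like the sketch below. It is only an illustration on plain strings, with hypothetical function names, and is not tied to Biome's token types. It relies on the CSS `<integer>` syntax being an optional `+` or `-` sign followed by one or more decimal digits, and on `z-index` accepting `auto` or an `<integer>`.

```rust
/// Returns true if `text` matches the CSS `<integer>` syntax:
/// an optional `+` or `-` sign followed by one or more decimal digits.
fn is_css_integer(text: &str) -> bool {
    let digits = text
        .strip_prefix(|c: char| c == '+' || c == '-')
        .unwrap_or(text);
    !digits.is_empty() && digits.bytes().all(|b| b.is_ascii_digit())
}

/// Hypothetical value check for `z-index`: `auto` (case-insensitive) or an integer.
fn is_valid_z_index(value: &str) -> bool {
    value.eq_ignore_ascii_case("auto") || is_css_integer(value)
}
```

A value such as `1.5` is a valid number token but fails `is_css_integer`, so under this rule it would fall through to one of the fallback variants rather than being accepted as an exact `z-index` value.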

codspeed-hq bot commented Jan 4, 2024

CodSpeed Performance Report

Merging #1419 will improve performances by 30.53%

Comparing faulty/css-property-parsing (fc63830) with main (bd88d8f)

Summary

⚡ 1 improvements
✅ 92 untouched benchmarks

Benchmarks breakdown

| Benchmark | main | faulty/css-property-parsing | Change |
| --- | --- | --- | --- |
| big5-added.json[uncached] | 3.7 ms | 2.9 ms | +30.53% |

Contributor

@denbezrukov denbezrukov left a comment

I like the solution of handling properties with fallback!

@@ -463,6 +594,11 @@ CssGenericProperty =
// background: transparent center/1em auto no-repeat;
// ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
// }
//
// This node type is implicitly added to any Node definition with the name
Contributor

Do you mean that this node type is added to any value Node definition?

What do you think about having a fallback for the entire property node to CssGenericProperty?

Contributor Author

I think having the fallbacks available on each specific property is nice because it ensures you can always query for a property purely by the struct in the AST. Like every declaration of width, whether it's valid, invalid, a default, or even has a bogus, syntactically incorrect value, will always be a CssWidthProperty node, and you can be confident that you won't miss any when traversing the tree. So it'd be like:

width: 10px; /* CssWidthProperty, with CssRegularDimension as the value */
width: 10s; /* CssWidthProperty, with CssGenericComponentValueList as a value */
width: 10(\; /* CssWidthProperty, with BogusPropertyValue as a value */
unknown: 10s; /* CssGenericProperty, with CssGenericComponentValueList as a value */
--custom: 10s; /* CssGenericProperty, with CssGenericComponentValueList as a value */

I know we had CssCustomProperty for a little while that handled the --custom ones, and maybe that's worth bringing back to indicate that it's semantically valid and understood, separate from CssGenericProperty, which would generally mean "this looks valid, but we don't know what the property is, so we can't enforce any semantics".

Unfortunately, either way if we do just fall back to generic for invalid values, we probably still want to do the automatic code generation stuff with the union to add CssWideKeyword as a value, since that applies to all properties, too. I think that's why I'm really more okay with having this be another variant on every property type, since there will already be at least 2 or 3 variants no matter what.
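
The querying benefit can be sketched with plain Rust enums. The types below are hypothetical stand-ins rather than Biome's generated nodes (a real `CssWidthProperty` does not exist in this PR), and the `inherit` entry is added to the list above only to show the CSS-wide keyword variant.

```rust
/// Hypothetical union of value variants carried by every `width` declaration.
#[derive(Debug, PartialEq)]
enum WidthValue {
    Dimension(&'static str),   // valid for the property's grammar
    WideKeyword(&'static str), // e.g. `inherit`
    Generic(&'static str),     // well-formed, but not valid for `width`
    Bogus(&'static str),       // syntactically broken
}

#[derive(Debug, PartialEq)]
enum Declaration {
    Width(WidthValue),
    Generic { name: &'static str, value: &'static str },
}

/// Every `width` declaration is found by node type alone, whatever its value.
fn width_values(decls: &[Declaration]) -> Vec<&WidthValue> {
    decls
        .iter()
        .filter_map(|decl| match decl {
            Declaration::Width(value) => Some(value),
            Declaration::Generic { .. } => None,
        })
        .collect()
}

fn example_declarations() -> Vec<Declaration> {
    vec![
        Declaration::Width(WidthValue::Dimension("10px")),
        Declaration::Width(WidthValue::Generic("10s")),
        Declaration::Width(WidthValue::Bogus("10(\\")),
        Declaration::Width(WidthValue::WideKeyword("inherit")),
        Declaration::Generic { name: "unknown", value: "10s" },
        Declaration::Generic { name: "--custom", value: "10s" },
    ]
}
```

Here `width_values(&example_declarations())` returns all four `width` values, including the bogus one, without any string comparison on property names; if invalid values fell back to a generic property node instead, the second and third entries would be missed by a type-based query.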

@faultyserver faultyserver force-pushed the faulty/css-property-parsing branch from 0d3a8b0 to 0773ac5 Compare January 5, 2024 20:33
wip parsing properties

properly parse and format `all` and `z-index`

fix comment in ungram

feedback

clippy

more clippy

doctest
@faultyserver faultyserver force-pushed the faulty/css-property-parsing branch from 8c21e46 to fc63830 Compare January 5, 2024 22:00
Contributor

@denbezrukov denbezrukov left a comment

I'm so excited to see the new nodes in action.

@faultyserver faultyserver merged commit 86688e4 into main Jan 5, 2024
20 checks passed
@faultyserver faultyserver deleted the faulty/css-property-parsing branch January 5, 2024 22:37
@Conaclos Conaclos added the A-Changelog Area: changelog label Jan 7, 2024
Labels
A-Changelog Area: changelog A-Formatter Area: formatter A-Parser Area: parser A-Tooling Area: internal tools L-CSS Language: CSS
Projects
None yet
Development

Successfully merging this pull request may close these issues.

📎 CSS Parser: Parse all and z-index properties
3 participants