Reconcile difference between dynamic tags, literal tags, and manual creation #1269

Open
siku2 opened this issue May 28, 2020 · 12 comments
Labels
A-yew Area: The main yew crate A-yew-macro Area: The yew-macro crate proposal

Comments

@siku2
Member

siku2 commented May 28, 2020

Problem

With the introduction of dynamic tags in #1266 there are now three different ways of constructing HTML tags:

  1. Using literal tags in the html! macro (<span></span>)
  2. Dynamic tags in the macro (<@{"span"}></@>)
  3. Creating VTag instances manually (VTag::new("span"))

All three of these have different restrictions, with the third option (manual construction) having virtually none at all.
Dynamic tags try to have the same restrictions as literal ones using runtime checks but there are still a few differences:

  • dynamic tags are always converted to lower case. While <sPAN></sPAN> and <@{"sPAN"}></@> produce equivalent HTML, Yew currently stores the former as "sPAN" internally and the latter as "span".
  • literal tags must be valid Rust identifiers, with the exception that they may contain hyphens (-) anywhere in the middle. Dynamic tags can use any ASCII character at any position.
  • void elements must be self closing using literal tags <br/> but dynamic tags can do <@{"br"}></@>. I think this one is actually a good thing because it gives dynamic tags more flexibility without any downside.

There's another issue that currently plagues both macro methods. They only allow tag names containing characters in the ASCII range. This is too broad for normal tag names which only allow for alphanumerics[1] but it's also too narrow for custom elements which support a big range of characters outside of the ASCII range[2].

Somewhat related to this issue is that Yew currently preserves the casing for tag names. This can easily lead to bugs like this:

let vtag = VTag::new("dIV");
if vtag.tag == "div" {
  println!("it's a div!");
}

Proposal

All three methods should have exactly the same rules. For the literal case these rules should be enforced during compilation while the other two need runtime checks.

Valid Tag Names

The VTag constructor ensures the given tag name is valid.
It might be a good idea to use a wrapper type TagName internally. This type could be treated just like a normal string but it can only contain valid tag names. It would also make it easier to ignore casing when comparing names.
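
A minimal sketch of what such a wrapper could look like. Everything here is hypothetical: the names TagName and InvalidTagName, and the validity rule (ASCII alphanumerics plus inner hyphens), are assumptions standing in for whatever rule is finally agreed on, not Yew's actual API.

```rust
/// Hypothetical wrapper type; not part of Yew's API.
#[derive(Debug, Clone)]
pub struct TagName(String);

/// Hypothetical error carrying the rejected name.
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidTagName(pub String);

impl TagName {
    /// Placeholder rule (an assumption): the name starts with an ASCII letter,
    /// contains only ASCII alphanumerics or hyphens, and does not end with a hyphen.
    pub fn new(name: &str) -> Result<Self, InvalidTagName> {
        let starts_with_letter = name
            .chars()
            .next()
            .map_or(false, |c| c.is_ascii_alphabetic());
        let valid_chars = name
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '-');
        if starts_with_letter && valid_chars && !name.ends_with('-') {
            Ok(TagName(name.to_owned()))
        } else {
            Err(InvalidTagName(name.to_owned()))
        }
    }

    /// Returns the name with its original casing preserved.
    pub fn as_str(&self) -> &str {
        &self.0
    }
}

// Comparisons ignore ASCII case, so "dIV" and "div" are the same tag.
impl PartialEq for TagName {
    fn eq(&self, other: &Self) -> bool {
        self.0.eq_ignore_ascii_case(&other.0)
    }
}

impl PartialEq<&str> for TagName {
    fn eq(&self, other: &&str) -> bool {
        self.0.eq_ignore_ascii_case(*other)
    }
}
```

With a type like this, the comparison from the "dIV" example would succeed regardless of how the name was spelled at construction.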

Void Element

Void elements must not have any children. To guard against this we store an allow_children flag in VTag which is set in the constructor by comparing the tag name against the list of void elements.
The add_child method is modified to check this flag first.
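
Roughly, and only as a sketch: SketchTag below is a stand-in type, not Yew's VTag, and try_add_child is a hypothetical name. The void element list is the one from the HTML standard as I remember it, so double-check it before relying on it.

```rust
// Void elements per the HTML standard (listed from memory; verify before use).
const VOID_ELEMENTS: &[&str] = &[
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "link", "meta", "param", "source", "track", "wbr",
];

/// Stand-in for a virtual tag; children are plain strings to keep the sketch small.
#[derive(Debug)]
pub struct SketchTag {
    pub tag: String,
    pub allow_children: bool,
    pub children: Vec<String>,
}

impl SketchTag {
    /// The flag is computed once, in the constructor, ignoring ASCII case.
    pub fn new(tag: &str) -> Self {
        let allow_children = !VOID_ELEMENTS
            .iter()
            .any(|v| v.eq_ignore_ascii_case(tag));
        SketchTag {
            tag: tag.to_owned(),
            allow_children,
            children: Vec::new(),
        }
    }

    /// Checks the flag before accepting a child.
    pub fn try_add_child(&mut self, child: &str) -> Result<(), String> {
        if !self.allow_children {
            return Err(format!(
                "<{}> is a void element and cannot have children",
                self.tag
            ));
        }
        self.children.push(child.to_owned());
        Ok(())
    }
}
```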

Special Attributes

The attribute value needs to be handled differently depending on the tag name.
This can be done by adding a check to the set_value method which forwards the value to the attributes instead if the tag isn't input or textarea.
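
As a rough illustration of that forwarding (again with a stand-in type; the field names and the HashMap-based attribute storage are assumptions, not how Yew stores attributes):

```rust
use std::collections::HashMap;

/// Stand-in for a virtual tag with a dedicated `value` slot.
#[derive(Debug)]
pub struct SketchTag {
    pub tag: String,
    pub value: Option<String>,
    pub attributes: HashMap<String, String>,
}

impl SketchTag {
    pub fn new(tag: &str) -> Self {
        SketchTag {
            tag: tag.to_owned(),
            value: None,
            attributes: HashMap::new(),
        }
    }

    /// For input and textarea the value goes into the dedicated slot;
    /// for every other tag it is forwarded to the ordinary attributes.
    pub fn set_value(&mut self, value: &str) {
        let special = self.tag.eq_ignore_ascii_case("input")
            || self.tag.eq_ignore_ascii_case("textarea");
        if special {
            self.value = Some(value.to_owned());
        } else {
            self.attributes
                .insert("value".to_owned(), value.to_owned());
        }
    }
}
```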


In all these cases the literal tags in macros should already perform the checks during compilation. To avoid the performance penalty there should be "private" methods that skip the checks entirely and can then be used by the macro.

There should also be equivalent methods that return a Result instead of panicking.
This gives us a structure like this:

// used by normal users
// (`do` is a reserved keyword in Rust, so the placeholder is called `do_thing`)
fn do_thing() -> T {
  try_do_thing().expect("appropriate message")
}

// can be used by libraries and cases where an error is expected
fn try_do_thing() -> Result<T, E> {
  // perform the various checks

  Ok(unchecked_do_thing())
}

// used by macro
fn unchecked_do_thing() -> T {
  // do the thing
}
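
To make the three-layer structure above concrete, here is a small sketch applied to tag name normalisation. The function names, the TagError type and the checks themselves are made up for illustration; the point is only the layering of panicking, fallible and unchecked variants.

```rust
/// Hypothetical error type for the checked variants.
#[derive(Debug, PartialEq, Eq)]
pub enum TagError {
    Empty,
    NonAscii,
}

/// Panicking variant, intended for normal users.
pub fn normalize(name: &str) -> String {
    try_normalize(name).expect("invalid tag name")
}

/// Fallible variant, for libraries and cases where an error is expected.
pub fn try_normalize(name: &str) -> Result<String, TagError> {
    if name.is_empty() {
        return Err(TagError::Empty);
    }
    if !name.is_ascii() {
        return Err(TagError::NonAscii);
    }
    Ok(normalize_unchecked(name))
}

/// Unchecked variant, meant for the macro, which has already
/// validated the name at compile time.
pub fn normalize_unchecked(name: &str) -> String {
    name.to_ascii_lowercase()
}
```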
@philip-peterson
Contributor

Per the discussion we had on #1344, I would question why we need dynamic strings as tag values at all. For HTML, MathML, and SVG elements, these should be easy to represent as some built-in Yew enum which could be written as a &'static str in code but converted to an enum value through macros.

In the case of custom elements, maybe that could be declared via some configuration, possibly in Cargo.toml or just in Yew config, stored as a type inside App or in some other Yew config location. That way we can avoid checking every tag name is valid at runtime as well as needing to bundle a list of valid tags in the main bundle.

@philip-peterson
Contributor

philip-peterson commented Jun 28, 2020

That said, I could see a use case for this type of pattern as well which could be good to keep in mind when we are implementing a solution for this Issue:

let z = if some_condition {
  Box
} else {
  Circle
};

return <@{z}></@>

@philip-peterson
Contributor

philip-peterson commented Jun 28, 2020

After doing some more research, a couple of modifications to my stance:

  • The branching on types I suggested in the above comment seems unwise / out of scope, since it would be moving type resolution to runtime, escaping Rust's type safety. A generic could achieve the same result.
  • Sorry, I didn't realize dynamic tags are basically brand-new. I can understand the want for them if regular HTML-style tag instantiation isn't working for custom tags with non-ASCII names, although still not sure about the differences long-term between those two styles. Was access to custom elements the main motivation for creating them?
  • Although WhatWG specifies that custom tags should be lowercase, browsers still let you reference them with uppercase versions. However, not all upper/lower case variants are treated the same. For instance, Chrome lets you define a custom tag called image-a and then when you instantiate it as <image-A />, it works. However, if you define it as image-ê and reference it as image-Ê, it doesn't work. So we must be mindful of the different notions Unicode has of capitalization and make sure we don't stray beyond actual browser behavior. Interestingly, Firefox does not even allow a tag named image-ê to be defined.

One more observation: although SVG has some elements that are defined as having capital letters in them such as <linearGradient />, the browsers I tested (Chrome, Firefox, Safari) all respond to <lineargradient /> as well.

@siku2
Member Author

siku2 commented Jun 28, 2020

Using an enum which contains all possible tag names isn't a terrible idea. We can use it in conjunction with a tag_name! macro that returns the appropriate variant from a string. Custom element names could be stored in a special variant:

enum TagName {
  Div,
  // ...
  Custom(String),
}

With this, dynamic tag names would no longer accept a String but a variant of TagName instead.

That way we can avoid checking every tag name is valid at runtime as well as needing to bundle a list of valid tags in the main bundle.

In most cases this doesn't apply anyway. The most common case is creating a tag name through the macro where the names are verified at compile-time. The only case where we need to check tag names at runtime is for dynamic tag names (be it through the macro or using VTag::new). As long as we have the ability to create dynamic tags there's no way to avoid a runtime check. Even if we have an enum we would have to implement From<String> on it.
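
A sketch of what that conversion might look like, using only two built-in variants for brevity. The variant list is obviously incomplete, and the sketch deliberately skips validating custom names; that check is exactly the runtime work the paragraph above says cannot be avoided.

```rust
/// Abbreviated version of the enum; a real one would list every known tag.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TagName {
    Div,
    Span,
    // ...
    Custom(String),
}

impl From<String> for TagName {
    /// The lookup happens at runtime; known names are matched ignoring ASCII case,
    /// everything else falls through to `Custom` (without validation in this sketch).
    fn from(name: String) -> Self {
        match name.to_ascii_lowercase().as_str() {
            "div" => TagName::Div,
            "span" => TagName::Span,
            _ => TagName::Custom(name),
        }
    }
}
```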


Was access to custom elements the main motivation for creating them?

No, it sure is a nice side-effect but the primary reason for them is to create dynamic tag names. There are cases where using a match statement would be needlessly verbose. It also works for higher order components that allow you to pass the tag type.


To summarise: I'm not against the idea of using an enum for tag names. In fact I've thought about it a lot myself. However, I think we need to make a strong case as to why it's beneficial in the long run.
I feel like this is something @jstarry would need to weigh in on.


Although WhatWG specifies that custom tags should be lowercase, browsers still let you reference them with uppercase versions.

That's exactly what the standard dictates:

In the HTML syntax, tag names, even those for foreign elements, may be written with any mix of lower- and uppercase letters that, when converted to all-lowercase, matches the element's tag name; tag names are case-insensitive.


if you define it as image-ê and reference it as image-Ê, it doesn't work.

Custom element names aren't case-insensitive in the same way. They're not allowed to use ASCII upper alpha, but any other character is case-sensitive.
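
In Rust terms this is the difference between ASCII-only and Unicode-aware case handling. The helper name html_name_eq below is made up for illustration; the standard library methods it relies on are real.

```rust
/// Compares two tag names the way HTML tag lookup does: only ASCII letters
/// are folded, every other character must match exactly.
pub fn html_name_eq(a: &str, b: &str) -> bool {
    a.eq_ignore_ascii_case(b)
}

/// What a Unicode-aware comparison would do instead. This treats
/// "image-\u{CA}" (Ê) and "image-\u{EA}" (ê) as equal, which, going by the
/// Chrome behaviour reported above, is NOT what the browser does.
pub fn unicode_name_eq(a: &str, b: &str) -> bool {
    a.to_lowercase() == b.to_lowercase()
}
```

So if Yew normalises names, make_ascii_lowercase (or eq_ignore_ascii_case) is the safer choice over to_lowercase.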

One more observation- although SVG has some elements that are defined as having capital letters in them

Although tag names are case-insensitive the standard sometimes defines a canonical representation. This isn't really relevant to us because the representation doesn't matter for the lookup.

@philip-peterson
Contributor

Been thinking about this some more...

All three methods should have exactly the same rules

I'm not sure whether this is possible. It seems like macros are not able to parse arbitrary UTF8 sequences from files... which some custom tag names have (within restriction). That said, the capitalization issues are probably at least something we can fix.

Dynamic tags can use any ASCII character at any position.

This isn't quite true, dynamic tags have to begin with a non-hyphen. There is some question though about whether we could expand the definition of a dashed name to be more inclusive, although I suspect it would only be by a little, such as having a hyphen at the end perhaps.

We can use it in conjunction with a tag_name! macro that returns the appropriate variant from a string. Custom element names could be stored in a special variant

I assume by this it was meant that in the normal case, one would not have to use this macro?

I suppose we could have the HTML tag macro accept any of div, span, table, etc. which are actually Rust symbols imported from a prelude somewhere, corresponding to say yew::tags::{div, span, table}. And then, if someone wanted to specify a custom tag, they could import yew::tags::{Custom} and do

<Custom("my-custom-tag")>{ "foo" }</Custom("my-custom-tag")>

Assuming this works, would the tag_name! macro still be needed?

@siku2
Member Author

siku2 commented Jul 7, 2020

@philip-peterson Just a quick clarification:

Dynamic tags can use any ASCII character at any position.

This isn't quite true, dynamic tags have to begin with a non-hyphen. There is some question though about whether we could expand the definition of a dashed name to be more inclusive, although I suspect it would only be by a little, such as having a hyphen at the end perhaps.

I was talking about how it's currently implemented, not how it should be. You can see the runtime check that is inserted here:

TagName::Expr(name) => {
    let expr = &name.expr;
    let vtag_name = Ident::new("__yew_vtag_name", expr.span());
    // this way we get a nice error message (with the correct span) when the expression doesn't return a valid value
    quote_spanned! {expr.span()=> {
        let mut #vtag_name = ::std::borrow::Cow::<'static, str>::from(#expr);
        if !#vtag_name.is_ascii() {
            ::std::panic!("a dynamic tag returned a tag name containing non ASCII characters: `{}`", #vtag_name);
        }
        // convert to lowercase because the runtime checks rely on it.
        #vtag_name.to_mut().make_ascii_lowercase();
        #vtag_name
    }}
}
};


All three methods should have exactly the same rules

I'm not sure whether this is possible. It seems like macros are not able to parse arbitrary UTF8 sequences from files... which some custom tag names have (within restriction). That said, the capitalization issues are probably at least something we can fix.

You're right, I mentioned that literal tags need to consist of valid Rust identifiers. If you try to use invalid characters then compilation fails even before the macro comes into play, so there's really nothing we can do about this.
In these rare cases dynamic tag names should come in handy.


We can use it in conjunction with a tag_name! macro that returns the appropriate variant from a string. Custom element names could be stored in a special variant

I assume by this it was meant that in the normal case, one would not have to use this macro?

I suppose we could have the HTML tag macro accept any of div, span, table, etc. which are actually Rust symbols imported from a prelude somewhere, corresponding to say yew::tags::{div, span, table}. And then, if someone wanted to specify a custom tag, they could import yew::tags::{Custom} and do

<Custom("my-custom-tag")>{ "foo" }</Custom("my-custom-tag")>

Assuming this works, would the tag_name! macro still be needed?

The tag_name! macro I have in mind actually works to enhance the setup you mentioned.
Let's look at the syn crate. Syn has a Token! macro which is very similar to what I have in mind.
There is a module syn::token that contains various Rust tokens and keywords (very similar to your tags module). Instead of accessing the members of said module directly, syn recommends you use the Token macro.
Token![static] then resolves to syn::token::Static.

Here are a few benefits of doing it this way:

  • No need to bring the tags into scope or memorise the struct name. We can just use the tags exactly like we would write them in HTML.
  • We can easily detect custom element names and verify their validity at compile time (of course there's still the problem with Unicode characters, but this is easily solved by allowing literal strings in the macro).
  • Using the macro we can provide more detailed feedback to the developer.
  • We can integrate this macro directly into the html! macro. html! { <div/> } -> VTag::new(tag_name!(div)) -> VTag::new(yew::tags::Div).
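
A very rough macro_rules sketch of the idea, modelled loosely on how syn's Token! maps tokens to types. The tags module here is a local stand-in for the hypothetical yew::tags, with only two built-in tags. A declarative macro like this cannot validate the custom name at compile time; that part would need a procedural macro.

```rust
/// Local stand-in for the hypothetical `yew::tags` module.
pub mod tags {
    #[derive(Debug, PartialEq)]
    pub struct Div;
    #[derive(Debug, PartialEq)]
    pub struct Span;
    /// Custom element names are passed as string literals.
    #[derive(Debug, PartialEq)]
    pub struct Custom(pub &'static str);
}

/// Resolves a tag written like in HTML to the matching type in `tags`.
/// Unknown identifiers fail to compile because no rule matches them.
macro_rules! tag_name {
    (div) => {
        tags::Div
    };
    (span) => {
        tags::Span
    };
    ($name:literal) => {
        tags::Custom($name)
    };
}
```

tag_name!(div) then expands to tags::Div, and tag_name!("my-custom-tag") to tags::Custom("my-custom-tag"), without the user importing anything from tags.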

@teymour-aldridge
Contributor

teymour-aldridge commented Jul 7, 2020 via email

@siku2
Member Author

siku2 commented Jul 7, 2020

@teymour-aldridge Just to clarify: I don't particularly care whether Yew supports dynamic tag names or not - I'm playing devil's advocate here. This discussion also isn't only about dynamic tag names. The same thing is possible with VTag::new but with even fewer restrictions.

Here's an example of something that dynamic tags make possible:
Let's say we have a typography component for headings which accepts the level (h1 - h6) as a prop.

This is what it looks like using dynamic tag names:

// ensure that 1 <= level <= 6
let h_tag = format!("h{}", level);
<@{h_tag}>{ contents }</@>

And here's the same without:

match level {
  1 => html! { <h1>{ contents }</h1> },
  2 => html! { <h2>{ contents }</h2> },
  3 => html! { <h3>{ contents }</h3> },
  4 => html! { <h4>{ contents }</h4> },
  5 => html! { <h5>{ contents }</h5> },
  6 => html! { <h6>{ contents }</h6> },
  _ => panic!("invalid header level"),
}

This isn't terribly bad but I'm doing it a lot of favours such as not using any attributes. Just imagine what it would look like if we wanted to add an onclick listener to this.
Dynamic tags are only really useful for higher order components but it's important that we support this because it makes it a lot easier to create component libraries.
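
For completeness, the "ensure that 1 <= level <= 6" step in the dynamic version could be sketched as a small helper. The name heading_tag is hypothetical; returning an Option instead of panicking leaves the decision to the caller.

```rust
/// Maps a heading level to its tag name, or `None` if the level is out of range.
pub fn heading_tag(level: u8) -> Option<&'static str> {
    match level {
        1 => Some("h1"),
        2 => Some("h2"),
        3 => Some("h3"),
        4 => Some("h4"),
        5 => Some("h5"),
        6 => Some("h6"),
        _ => None,
    }
}
```

The returned &'static str can then be fed into a dynamic tag, so the match is written once instead of once per piece of markup.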

Let's try to abstract this a bit more. The underlying problem here is that we're unable to propagate tag names. There's no way for our hypothetical component to accept h1 - h6 as a value without using an opaque representation like a string or a number representing the level.
In the html macro tag names work like types but outside of it these types don't exist; they're just strings. Dynamic tag names are a patch solution for the inability to pass tag names around transparently.

All of this could be solved if we adopted a solution where each tag name is assigned a type. It would give us the ability to transparently pass tag names to components with type checking.
Here's what the same component might look like with statically typed tag names:

use yew::tags::{H1, H2, H3, H4, H5, H6};

// this is just a marker trait to ensure that `Heading` is only used with heading tags.
trait HTag {}
impl HTag for H1 {}
impl HTag for H2 {}
impl HTag for H3 {}
impl HTag for H4 {}
impl HTag for H5 {}
impl HTag for H6 {}

struct Props<T> {
    tag: T,
    text: String,
}
struct Heading<H: HTag> {
    props: Props<H>,
}
impl<H: HTag> Component for Heading<H> {
    type Props = Props<H>;

    fn view(&self) -> Html {
        let Props { tag, text } = &self.props;
        html! {
            <@{tag}>{ text }</@>
        }
    }
}

Note that in this example the syntax for dynamic tag names no longer accepts strings, only members of the yew::tags module. How this is to be achieved is still up for discussion but either way, this accomplishes type safety and doesn't depend on any runtime checks.
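
One possible way to get there, sketched without any Yew types: each tag is a zero-sized type that carries its name as an associated constant. The Tag trait, the NAME constant and render_heading are all assumptions made up for this sketch, and the output is a plain string with no escaping.

```rust
/// Hypothetical trait implemented by every tag type.
pub trait Tag {
    const NAME: &'static str;
}

/// Marker trait restricting a component to heading tags.
pub trait HTag: Tag {}

pub struct H1;
pub struct H2;
pub struct Div;

impl Tag for H1 {
    const NAME: &'static str = "h1";
}
impl Tag for H2 {
    const NAME: &'static str = "h2";
}
impl Tag for Div {
    const NAME: &'static str = "div";
}

impl HTag for H1 {}
impl HTag for H2 {}
// `Div` deliberately does not implement `HTag`.

/// The tag is chosen through the type parameter, so passing `Div`
/// is rejected by the compiler rather than by a runtime check.
pub fn render_heading<H: HTag>(text: &str) -> String {
    format!("<{0}>{1}</{0}>", H::NAME, text)
}
```

render_heading::<H2>("Title") produces "<h2>Title</h2>", while render_heading::<Div>(...) fails to compile.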

@teymour-aldridge
Contributor

teymour-aldridge commented Jul 7, 2020 via email

@philip-peterson
Contributor

philip-peterson commented Jul 7, 2020

I agree with what you say @siku2, having the macro would be useful for construction using VTag::new. If we do have a tags module though, it would be best IMO to have the tag names lowercased, because then when you see a title-cased symbol you'll know it's a type / component.

For the example provided with the dynamic tag name, would that value passed into @{} not be a string currently? If in the new world, literal tag names are types, you could just construct a tag using the literal syntax but also the generic param, so it could be like <H>{ "foo" }</H> and you wouldn't need a dynamic tag name or the prop either. Dynamic tag construction could still be useful for H{1,2,...,6} construction though, as you say, or using a prop to determine what kind of element will be the root of a component's return value.

@mc1098 mc1098 added A-yew Area: The main yew crate A-yew-macro Area: The yew-macro crate labels Sep 20, 2021
@DasBlackfur

One more observation- although SVG has some elements that are defined as having capital letters in them such as <linearGradient />, the browsers I tested (Chrome, Firefox, Safari) all respond to <lineargradient /> as well.

This might be a recent regression in browsers but it seems that this no longer holds true.

@WorldSEnder
Member

I have a somewhat different view on treating browser-behavior: We shouldn't try to get in the way, except for debug_assertions and compile time checks. When running in release mode, all bets can be off. At this point, we shouldn't insert a check if a dynamic void element contains children. Let the browser handle that. A custom defined element that contains illegal characters? Also let the browser handle this.
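
Something along these lines, as a sketch with made-up function names. debug_assert! is only evaluated when debug assertions are enabled, so a release build performs no check and leaves the result to the browser.

```rust
// Void elements per the HTML standard (listed from memory; verify before use).
const VOID_ELEMENTS: &[&str] = &[
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "link", "meta", "param", "source", "track", "wbr",
];

pub fn is_void(tag: &str) -> bool {
    VOID_ELEMENTS.iter().any(|v| v.eq_ignore_ascii_case(tag))
}

/// Adds a child. In debug builds a void parent triggers a panic;
/// in release builds the child is added without any check.
pub fn add_child(tag: &str, children: &mut Vec<String>, child: &str) {
    debug_assert!(
        !is_void(tag),
        "void element <{}> cannot have children",
        tag
    );
    children.push(child.to_owned());
}
```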

I also don't think an enumeration of "allowable" tag names will do much good if it's for more than debug assertions and linting. It will eventually get out of date, and the number of SVG elements is already rather large. This list doesn't have to leak into the final binary.

For capitalization, I'm not even sure about the rules the browsers currently implement. For example, an inline <svg> without an XML namespace declaration currently seems to follow normal HTML rules for contained elements in Firefox. I don't think we can or should reproduce this rule set in Yew. We should just try our best to warn the programmer, follow an internal standard (i.e. all tag names are lowercase when coming from the literal macro) and reconcile in a straightforward way, without attempting to normalize input we have no control over, at least in production mode.

Now, the special casing for "input" and "textarea" et al should stay and work regardless of casing.

7 participants