Update handling of quoteStart to prevent sanitization bypass #201
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There is a sanitization failure scenario impacting cases where the
onTag
function is used to customise how certain tags are handled. The second parameter to this function is the HTML of the opening tag, complete with attributes, and is susceptible to malformed HTML which means it is not sanitized before being passed to the function. If theonTag
customisation function then uses this HTML as part of the output then an XSS is possible.Sanitizer logic error
The error is at lines 88-91 of the parser:
The tokenizer goes into the 'inside quoted value' state when it encounters a
"
or'
character that is immediately preceded by a=
.However if we supply this input:
The
target
attribute has a space after the=
character, so the tokenizer fails to enter the 'inside quoted value' state. It then continues parsing until it meetshref="
at which point if erroneously enters the 'inside quoted value' state and parses the subsequent<script>
tags as beloning to that attribute, until the"
following those tags.This allows smuggling the
<script>
tags or other malicious content into thehtml
variable that is passed as the second parameter to the customonTag
function. When the custom tag function uses that value (presuming it has been correctly sanitized) the malicious payload is injected back in the output:Browsers are robust to the
target= "
snippet including a space, so parse the DOM like so:Minimum reproduction
This is a fairly minimal reproduction of the failure case:
Patch
My suggested fix (bear in mind I'm not a JS person!) is to add a while loop that allows for spaces between
=
and the quote starting an attribute value. All the tests pass still. I have a PR prepared that also adds a test case for this scenario, but didn't want to submit the PR publicly until I spoke to you.An alternative approach would try to keep track of whether the last character, ignoring whitespace was an
=
. However, when I mapped this out it added a lot of complexity.Tests
All the tests pass still, but I have also added a new test. Let me know if you think more would be sensible.