Sanitizer built-ins document #244

Merged 10 commits on Jan 16, 2025
7 changes: 7 additions & 0 deletions .github/workflows/pr-push.yml
@@ -9,6 +9,13 @@ jobs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v4
        with:
          python-version: '3.10'
      - name: Generate safe-default-configuration.json
        run: python builtins/safe-default-configuration.py --input builtins/safe-default-configuration.txt --out builtins/safe-default-configuration.json
      - name: Generate safe-baseline-configuration-materialized.json
        run: python builtins/safe-baseline-configuration.py --input builtins/safe-baseline-configuration.json --event-handlers builtins/event-handler-content-attributes.txt --out builtins/safe-baseline-configuration-materialized.json
      - uses: w3c/spec-prod@v2
        with:
          GH_PAGES_BRANCH: gh-pages
2 changes: 2 additions & 0 deletions .gitignore
@@ -1,3 +1,5 @@
/.project
/out
/*.ninja*
/builtins/safe-default-configuration.json
/builtins/safe-baseline-configuration-materialized.json
89 changes: 89 additions & 0 deletions builtins/event-handler-content-attributes.txt
@@ -0,0 +1,89 @@
// https://html.spec.whatwg.org/#ix-event-handlers
onafterprint
onauxclick
onbeforeinput
onbeforematch
onbeforeprint
onbeforeunload
onbeforetoggle
onblur
oncancel
oncanplay
oncanplaythrough
onchange
onclick
onclose
oncontextlost
oncontextmenu
oncontextrestored
oncopy
oncuechange
oncut
ondblclick
ondrag
ondragend
ondragenter
ondragleave
ondragover
ondragstart
ondrop
ondurationchange
onemptied
onended
onerror
onfocus
onformdata
onhashchange
oninput
oninvalid
onkeydown
onkeypress
onkeyup
onlanguagechange
onload
onloadeddata
onloadedmetadata
onloadstart
onmessage
onmessageerror
onmousedown
onmouseenter
onmouseleave
onmousemove
onmouseout
onmouseover
onmouseup
onoffline
ononline
onpagehide
onpagereveal
onpageshow
onpageswap
onpaste
onpause
onplay
onplaying
onpopstate
onprogress
onratechange
onreset
onresize
onrejectionhandled
onscroll
onscrollend
onsecuritypolicyviolation
onseeked
onseeking
onselect
onslotchange
onstalled
onstorage
onsubmit
onsuspend
ontimeupdate
ontoggle
onunhandledrejection
onunload
onvolumechange
onwaiting
onwheel
33 changes: 33 additions & 0 deletions builtins/safe-baseline-configuration.json
@@ -0,0 +1,33 @@
{
  "removeElements": [
    {
      "namespace": "http://www.w3.org/1999/xhtml",
      "name": "script"
    },
    {
      "namespace": "http://www.w3.org/1999/xhtml",
      "name": "frame"
    },
    {
      "namespace": "http://www.w3.org/1999/xhtml",
      "name": "iframe"
    },
    {
      "namespace": "http://www.w3.org/1999/xhtml",
      "name": "object"
    },
    {
      "namespace": "http://www.w3.org/1999/xhtml",
      "name": "embed"
    },
    {
      "namespace": "http://www.w3.org/2000/svg",
      "name": "script"
    },
    {
      "namespace": "http://www.w3.org/2000/svg",
      "name": "use"
    }
  ],
  "removeAttributes": []
}
39 changes: 39 additions & 0 deletions builtins/safe-baseline-configuration.py
@@ -0,0 +1,39 @@
# Sanitizer API - Build the materialized baseline configuration by adding the
# event handler content attributes to the baseline JSON configuration.

import json
import argparse
import sys

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--input", type=argparse.FileType('r'), required=True)
    parser.add_argument("--event-handlers", type=argparse.FileType('r'),
                        required=True)
    parser.add_argument("--out", type=argparse.FileType('w'), required=True)
    args = parser.parse_args()

    # json.JSONDecodeError and UnicodeDecodeError are both ValueError subclasses.
    try:
        config = json.load(args.input)
    except (OSError, ValueError):
        parser.error("Cannot read from --input file.")

    try:
        events = args.event_handlers.read()
    except (OSError, ValueError):
        parser.error("Cannot read from --event-handlers file.")

    # Skip blank lines and "//" comments; every other line is an attribute name.
    for event in events.split("\n"):
        if not event:
            continue
        if event.startswith("//"):
            continue
        config["removeAttributes"].append(event)

    try:
        json.dump(config, args.out, indent=2)
    except OSError:
        parser.error("Cannot write to --out file.")
    return 0

if __name__ == "__main__":
    sys.exit(main())
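The merge step performed by this script can be sketched as a pure function that is easy to exercise without touching the file system. This is only an illustration under stated assumptions: `materialize` is a hypothetical helper name that does not appear in the PR, and the sample inputs are shortened excerpts of the two input files.

```python
import json

def materialize(config: dict, event_handler_text: str) -> dict:
    """Return a copy of config with event handler names added to removeAttributes.

    Hypothetical helper (not part of the PR). It mirrors the script's loop:
    blank lines and "//" comment lines are skipped.
    """
    names = [
        line for line in event_handler_text.splitlines()
        if line and not line.startswith("//")
    ]
    merged = [*config.get("removeAttributes", []), *names]
    return {**config, "removeAttributes": merged}

# Shortened excerpts of the baseline JSON and the event handler list.
baseline = {
    "removeElements": [
        {"namespace": "http://www.w3.org/1999/xhtml", "name": "script"},
    ],
    "removeAttributes": [],
}
handlers = "// https://html.spec.whatwg.org/#ix-event-handlers\nonafterprint\nonauxclick\n"

materialized = materialize(baseline, handlers)
print(json.dumps(materialized["removeAttributes"]))
```

Unlike the script, this sketch leaves the input dictionary unmodified and returns a new one, which keeps the function safe to call more than once on the same baseline.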
42 changes: 42 additions & 0 deletions builtins/safe-default-configuration.py
@@ -0,0 +1,42 @@
# Sanitizer API - Build configuration dictionary from text file.

import json
import argparse
import sys

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--input", type=argparse.FileType('r'), required=True)
    parser.add_argument("--out", type=argparse.FileType('w'), required=True)
    args = parser.parse_args()

    try:
        lines = args.input.read()
    except (OSError, ValueError):
        parser.error("Cannot read from --input file.")

    result = { "elements": [], "attributes": [] }
    # List that receives "- name" attribute lines. It points at the most
    # recent element's attributes, or at the global attribute list.
    current = []
    for line in lines.split("\n"):
        if not line:
            pass
        elif line.startswith("//"):
            pass
        elif line.startswith("- "):
            current.append({ "name": line[2:], "namespace": None })
        elif line == "[HTML Global]":
            current = result["attributes"]
        else:
            elem = { "name": line, "namespace": "http://www.w3.org/1999/xhtml",
                     "attributes": [] }
            result["elements"].append(elem)
            current = elem["attributes"]

    try:
        json.dump(result, args.out, indent=2)
    except OSError:
        parser.error("Cannot write to --out file.")
    return 0

if __name__ == "__main__":
    sys.exit(main())
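To make the text format concrete, the parsing loop can be sketched as a standalone function and run on a small excerpt. The names `parse_allow_list` and `XHTML_NS` are hypothetical and not part of the PR. One deliberate difference from the script, labeled in the code: an attribute line that appears before any element raises an error here, whereas the script appends it to a throwaway list.

```python
XHTML_NS = "http://www.w3.org/1999/xhtml"

def parse_allow_list(text: str) -> dict:
    """Parse the allow-list text format into an elements/attributes dictionary.

    Hypothetical helper (not part of the PR) mirroring the script's rules:
      - blank lines and "//" comments are ignored
      - "- name" adds an attribute to the current element (or global list)
      - "[HTML Global]" switches to the global attribute list
      - any other line starts a new HTML element
    """
    result = {"elements": [], "attributes": []}
    current = None
    for number, line in enumerate(text.splitlines(), start=1):
        if not line or line.startswith("//"):
            continue
        if line.startswith("- "):
            # Assumption: fail loudly instead of silently dropping the line.
            if current is None:
                raise ValueError(f"line {number}: attribute before any element")
            current.append({"name": line[2:], "namespace": None})
        elif line == "[HTML Global]":
            current = result["attributes"]
        else:
            elem = {"name": line, "namespace": XHTML_NS, "attributes": []}
            result["elements"].append(elem)
            current = elem["attributes"]
    return result

sample = """\
// Grouping Content
p
blockquote
- cite

[HTML Global]
- dir
"""
config = parse_allow_list(sample)
```

For this excerpt, `config["elements"]` holds `p` (no attributes) and `blockquote` (with `cite`), and `config["attributes"]` holds the global `dir` entry.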
171 changes: 171 additions & 0 deletions builtins/safe-default-configuration.txt
@@ -0,0 +1,171 @@
// Document element
// https://html.spec.whatwg.org/#the-root-element

html

// Document metadata
// https://html.spec.whatwg.org/#document-metadata

head
title

// meta and link, purposely omitted

// Sections
// https://html.spec.whatwg.org/#sections

body
article
section
nav
aside
h1
h2
h3
h4
h5
h6
hgroup
header
footer
address

// Grouping Content
// https://html.spec.whatwg.org/#grouping-content

p
hr
pre
blockquote
- cite
ol
- reversed
- start
- type
ul
menu
li
- value
dl
dt
dd
figure
figcaption
main
search
div

// Text-level Semantics
// https://html.spec.whatwg.org/#text-level-semantics

a
- href
- rel
- hreflang
- type
// Purposely omitted:
// - target
// - download
// - referrerpolicy
// - ping
em
strong
small
s
cite
q
dfn
- title
abbr
- title
ruby
rt
rp
data
- value
time
- datetime
code
var
samp
kbd
sub
sup
i
b
u
mark
bdi
- dir
bdo
- dir
span
br
wbr

// Edits
// https://html.spec.whatwg.org/#edits

ins
- cite
- datetime
del
- cite
- datetime

// Embedded content
// https://html.spec.whatwg.org/#embedded-content
//
// Purposely omitted.

// Tabular Data
// https://html.spec.whatwg.org/#tables

table
caption
colgroup
- span
col
- span
tbody
thead
tfoot
tr
td
- colspan
- rowspan
- headers
th
- colspan
- rowspan
- headers
- scope
- abbr

// Forms
// https://html.spec.whatwg.org/#forms
//
// Purposely omitted

// Interactive Elements
// https://html.spec.whatwg.org/#interactive-elements
//
// Purposely omitted.

// Scripting
// https://html.spec.whatwg.org/#scripting
//
// Purposely omitted.

// SVG: TBD
// MathML: TBD

// HTML global attributes
//
// Selection of attributes. Most are purposely omitted.

[HTML Global]
- dir
- lang
- title
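One sanity check that the two generated files make possible is confirming that nothing in the default allow-list is also on the baseline removal list. The PR does not include such a check; the sketch below is an assumption about how one could look, with `find_conflicts` as a hypothetical name and small hand-written dictionaries standing in for the generated JSON files.

```python
def find_conflicts(default_config: dict, baseline_config: dict) -> list:
    """List allow-list entries that the baseline configuration would remove.

    Hypothetical helper (not part of the PR). Elements are compared by
    (namespace, name); attributes are compared by name, because the
    materialized baseline stores event handler attributes as plain strings.
    """
    removed_elements = {
        (e["namespace"], e["name"]) for e in baseline_config["removeElements"]
    }
    removed_attrs = set(baseline_config["removeAttributes"])
    conflicts = []
    for elem in default_config["elements"]:
        if (elem["namespace"], elem["name"]) in removed_elements:
            conflicts.append(elem["name"])
        for attr in elem["attributes"]:
            if attr["name"] in removed_attrs:
                conflicts.append(f'{elem["name"]}@{attr["name"]}')
    for attr in default_config["attributes"]:
        if attr["name"] in removed_attrs:
            conflicts.append(f'*@{attr["name"]}')
    return conflicts

XHTML = "http://www.w3.org/1999/xhtml"
baseline = {
    "removeElements": [{"namespace": XHTML, "name": "script"}],
    "removeAttributes": ["onclick"],
}
good_default = {
    "elements": [{"name": "a", "namespace": XHTML,
                  "attributes": [{"name": "href", "namespace": None}]}],
    "attributes": [{"name": "dir", "namespace": None}],
}
bad_default = {
    "elements": [{"name": "script", "namespace": XHTML, "attributes": []}],
    "attributes": [{"name": "onclick", "namespace": None}],
}

print(find_conflicts(good_default, baseline))
print(find_conflicts(bad_default, baseline))
```

An empty result for the real generated files would indicate that the allow-list and the removal list do not contradict each other.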
