A fast, optimized, and correct HTML & XML parsing library
MarkupEver is a modern, high-performance library for parsing HTML and XML, written in Rust.
KEY FEATURES:
- 🚀 Fast: High performance, thanks to html5ever and selectors.
- 🔥 Easy: Designed to be easy to use and learn. Completion everywhere.
- ✨ Low-Memory: Written in Rust, with a low memory footprint. No need to worry about memory leaks; it uses Rust's memory allocator.
- 🧶 Thread-safe: Completely thread-safe.
- 🎯 Querying: Use your CSS knowledge to select elements from an HTML or XML document.
You can install MarkupEver using pip (a virtual environment is recommended):
$ pip3 install markupever
Parsing HTML content and selecting elements:
import markupever as mr

dom = mr.parse_file("file.html", mr.HtmlOptions())
# Or parse HTML content directly:
# dom = mr.parse("... content ...", mr.HtmlOptions())

# Select every <p> that is the first child of a div.section
for element in dom.select("div.section > p:nth-child(1)"):
    print(element.text())
Creating a DOM from scratch:
from markupever import dom

# Use a separate name for the tree so the `dom` module is not shadowed
tree = dom.TreeDom()
root: dom.Document = tree.root()

root.create_doctype("html")

html = root.create_element("html", {"lang": "en"})
body = html.create_element("body")
body.create_text("Hello Everyone ...")

print(root.serialize())
# <!DOCTYPE html><html lang="en"><body>Hello Everyone ...</body></html>
TODO:
- Rewrite TreeDom __repr__ and __str__
- Write benchmarks
- Measure and document memory usage
- Add PyPI version, test coverage, and Python version badges
- Complete docs
- Add a pretty-printing feature
- Provide more control over the serializer
- Add advanced examples to docs (such as socket and http streams)