A C++ header-only library for parsing HTML source.
To use this library, just include the header:
#include "DomTree.h"
You can parse an HTML string as follows:
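#include <filesystem>
#include <fstream>
#include <iterator>
#include <string>
#include <utility>
#include "DomTree.h"
// Read the whole file into a string, then hand it to the parser
std::ifstream ifs(std::filesystem::current_path().generic_string() + "/html/style_with_comments.html");
std::string html_file((std::istreambuf_iterator<char>(ifs)),
(std::istreambuf_iterator<char>()));
CDomTree dt{};
dt.Parse(std::move(html_file));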
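The example above does not check whether the file was opened. A minimal sketch of a checked read is shown below. `ReadHtmlFile` is a hypothetical helper name, not part of `DomTree.h`, and it uses only the standard library:

```cpp
#include <fstream>
#include <iterator>
#include <optional>
#include <string>

// Hypothetical helper (not part of DomTree.h): reads a whole file into a string.
// Returns std::nullopt if the file cannot be opened.
inline std::optional<std::string> ReadHtmlFile(const std::string& path)
{
    // Binary mode keeps the bytes unchanged (no newline translation on Windows).
    std::ifstream ifs(path, std::ios::binary);
    if (!ifs.is_open())
        return std::nullopt;

    return std::string(std::istreambuf_iterator<char>(ifs),
                       std::istreambuf_iterator<char>());
}
```

If the returned optional holds a value, the string could then be moved into `Parse` the same way `html_file` is in the example above.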
You can also generate HTML source like this:
#include <iostream>
#include <memory>
#include <string_view>
#include <utility>
#include "DomTree.h"
constexpr std::string_view html_style = R"(
body {
font-family: Arial;
color: #f9f9ff;
background-color:#161B1F;
}
)";
CDomTree dom{};
dom.GetTags().push_back(std::make_shared<Tag>("!DOCTYPE html"));
Tag head("head");
head.AddChild({ "meta", { {{"http-equiv"}, {"X-UA-Compatible"}}, {{"content"}, {"IE=edge"}} } });
head.AddChild({ "meta", { {{"http-equiv"}, {"content-type"}}, {{"content"}, {"text/html; charset=utf-8"}} } });
head.AddChild({ "meta", { {{"name"}, {"viewport"}}, {{"content"}, {"width=device-width, initial-scale=1"}} } });
head.AddChild({ "style", html_style.data() });
dom.GetTags().push_back(std::make_shared<Tag>(std::move(head)));
// GetData() returns the HTML source as a string
std::clog << dom.GetData() << std::endl;
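To keep the generated source rather than only log it, the string can be written to a file. The sketch below uses only the standard library. `WriteHtmlFile` is a hypothetical helper name, not part of `DomTree.h`:

```cpp
#include <fstream>
#include <string>
#include <string_view>

// Hypothetical helper (not part of DomTree.h): writes HTML text to a file,
// replacing any existing content.
// Returns false if the file cannot be opened or the write fails.
inline bool WriteHtmlFile(const std::string& path, std::string_view html)
{
    std::ofstream ofs(path, std::ios::binary | std::ios::trunc);
    if (!ofs.is_open())
        return false;

    ofs.write(html.data(), static_cast<std::streamsize>(html.size()));
    ofs.flush();
    return ofs.good();
}
```

Assuming `GetData()` returns a `std::string` (or something convertible to `std::string_view`), a call could look like `WriteHtmlFile("out.html", dom.GetData())`.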
The project's tests parse several HTML sources using Google Test. The output looks like this:
[==========] Running 16 tests from 3 test suites.
[----------] Global test environment set-up.
[----------] 5 tests from TestInvalidTable
[ RUN ] TestInvalidTable.invalidSmallTable
[ OK ] TestInvalidTable.invalidSmallTable (0 ms)
[ RUN ] TestInvalidTable.invalidHugeTable
[ OK ] TestInvalidTable.invalidHugeTable (82 ms)
[ RUN ] TestInvalidTable.imbricatedTable
[ OK ] TestInvalidTable.imbricatedTable (1 ms)
[ RUN ] TestInvalidTable.imbricatedInvalidTablesSmall
[ OK ] TestInvalidTable.imbricatedInvalidTablesSmall (0 ms)
[ RUN ] TestInvalidTable.imbricatedInvalidTables
[ OK ] TestInvalidTable.imbricatedInvalidTables (0 ms)
[----------] 5 tests from TestInvalidTable (90 ms total)
[----------] 6 tests from TestBigSite
[ RUN ] TestBigSite.modernescpp_com
[ OK ] TestBigSite.modernescpp_com (57 ms)
[ RUN ] TestBigSite.codingforums
[ OK ] TestBigSite.codingforums (4 ms)
[ RUN ] TestBigSite.myradioonline_ro
[ OK ] TestBigSite.myradioonline_ro (29 ms)
[ RUN ] TestBigSite.adevarul_ro
[ OK ] TestBigSite.adevarul_ro (115 ms)
[ RUN ] TestBigSite.dailymail
[ OK ] TestBigSite.dailymail (245 ms)
[ RUN ] TestBigSite.cppreference_com
[ OK ] TestBigSite.cppreference_com (36 ms)
[----------] 6 tests from TestBigSite (493 ms total)
[----------] 5 tests from TestSite
[ RUN ] TestSite.icomoon
[ OK ] TestSite.icomoon (163 ms)
[ RUN ] TestSite.multi_comments
[ OK ] TestSite.multi_comments (0 ms)
[ RUN ] TestSite.multi_spaces
[ OK ] TestSite.multi_spaces (0 ms)
[ RUN ] TestSite.multi_self_closing_tags
[ OK ] TestSite.multi_self_closing_tags (0 ms)
[ RUN ] TestSite.style_with_comments
[ OK ] TestSite.style_with_comments (0 ms)
[----------] 5 tests from TestSite (168 ms total)
[----------] Global test environment tear-down
[==========] 16 tests from 3 test suites ran. (756 ms total)
[ PASSED ] 16 tests.
If you have suggestions for improvement or if you've identified a bug, please don't hesitate to open an issue or contribute by creating a pull request. When reporting a bug, provide comprehensive details about your environment, including compiler version and other relevant information, to facilitate issue reproduction. Additionally, if you're introducing a new feature, ensure that you include corresponding test cases to validate its functionality.
There are no dependencies beyond a C++ compiler that supports C++17. The library has been tested on Windows only.
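Because the library has been tested on Windows only, it may help to know that MSVC reports an outdated `__cplusplus` value unless `/Zc:__cplusplus` is passed. A compile-time check for the C++17 requirement could therefore look like the sketch below. The constant name `kDetectedCppStandard` is made up for this example and is not part of the library:

```cpp
// MSVC keeps __cplusplus at 199711L unless /Zc:__cplusplus is set,
// so prefer _MSVC_LANG when it is available.
#if defined(_MSVC_LANG)
constexpr long kDetectedCppStandard = _MSVC_LANG;
#else
constexpr long kDetectedCppStandard = __cplusplus;
#endif

// Fail the build early with a readable message if C++17 is not enabled.
static_assert(kDetectedCppStandard >= 201703L,
              "C++17 or newer is required (e.g. /std:c++17 or -std=c++17)");
```

Placing a check like this in a project's own source file, before including the header, turns a missing language-standard flag into a single clear error.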