v0.4.4
New publishers for India, Switzerland, and Australia
With this release, we added three new publishers, updated several existing ones, and added some quality-of-life functionality for developers.
Publishers
New
- IND: Bhaskar (@MaxDall in #605)
- CH: TagesAnzeiger (@MaxDall in #608)
- AU: TheWestAustralian (@MaxDall in #615)
Updates
- DE: SportSchau (@addie9800 in #611)
- FR: LesEchos is now deprecated (@MaxDall in #617)
- UK: TheTelegraph (@MaxDall in #616)
What's new?
We implemented XPath queries for LinkedDataMapping to allow more fine-grained searches through the data (@MaxDall in #614). Further, we now parse crawl-delays from publisher-provided robots.txt files; these delays can be skipped through the crawler (@MaxDall in #609). Additionally, we ...
- Ignore robots.txt in coverage script by @MaxDall in #610
- Adjust generic_topic_parsing to return only unique topics by @MaxDall in #620
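
As a rough illustration of two of the changes above, the sketch below reads a crawl-delay from a robots.txt file and reduces a topic list to its unique entries. It uses only the Python standard library and is not the code from the referenced pull requests. The robots.txt content and the helper names `read_crawl_delay` and `unique_topics` are assumptions made for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; real publishers serve their own file.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 5
Disallow: /private/
"""


def read_crawl_delay(robots_txt: str, user_agent: str = "*"):
    """Return the Crawl-delay that applies to user_agent, or None if none is declared."""
    parser = RobotFileParser()
    # parse() accepts the file as an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


def unique_topics(topics):
    """Drop duplicate topics while keeping the order in which they first appear."""
    # dict keys are unique and preserve insertion order.
    return list(dict.fromkeys(topics))


delay = read_crawl_delay(ROBOTS_TXT)
topics = unique_topics(["Politics", "Economy", "Politics"])
```

A crawler that honors the delay would wait `delay` seconds between requests to the same publisher. A crawler configured to skip it would not call `read_crawl_delay` at all.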
Bug fixes
Full Changelog: v0.4.3...v0.4.4