When crawling Braunschweiger Zeitung, I encountered a rather odd bug. Take this article, for example: https://www.braunschweiger-zeitung.de/article238597927/Von-Natur-aus-Harzer-Honig-aus-Willensen-ist-Typisch-Harz.html. The author element of its ld_json is:
"author":[{"@type":"Organization","name":"FUNKE Mediengruppe","url":"https://www.harzkurier.de/autoren"}]
The previous code raised an AttributeError stating that str has no attribute get. Inspecting the ld element, I saw that the author element is of type dict rather than a list of dicts. After some more searching, I found that the HTML Fundus receives contains this instead:
"author":{"@type": "Person","name": "hn","url": "https://www.braunschweiger-zeitung.de/autoren/"}
I am not entirely sure why this is happening; my guess is that it is some kind of redirection issue. Nevertheless, this PR fixes it.
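
For illustration, here is a minimal sketch of one way to handle both shapes of the author element. It is not the actual change in this PR: the function name `extract_author_names` and the `AuthorField` alias are hypothetical and only serve to show the idea of wrapping a single dict in a list before iterating.

```python
from typing import Any, Dict, List, Union

# Hypothetical alias: the shapes an ld+json "author" value might take.
AuthorField = Union[None, str, Dict[str, Any], List[Any]]


def extract_author_names(author: AuthorField) -> List[str]:
    """Return author names whether `author` is a dict, a list, a str or None."""
    if author is None:
        return []

    # Wrap a single value in a list so one loop covers every shape.
    # Iterating a bare dict would yield its keys (str), and calling
    # .get on those is what raises the AttributeError described above.
    entries = author if isinstance(author, list) else [author]

    names: List[str] = []
    for entry in entries:
        if isinstance(entry, dict):
            name = entry.get("name")
        elif isinstance(entry, str):
            name = entry
        else:
            name = None  # skip unexpected types instead of failing

        if isinstance(name, str) and name.strip():
            names.append(name.strip())
    return names
```

With this sketch, the dict variant from the received HTML and the list variant from the article page both go through the same code path, so neither raises.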