UTF-8 Encoding not applied if content contains inline svg graphics #46

Closed
heldchen opened this issue Jan 17, 2024 · 1 comment

@heldchen

When scraping a website that contains inline SVG graphics, the loadContent() function fails to apply the correct encoding. The <?xml version="1.0" encoding="UTF-8"?> declaration of the inlined graphic makes the stripos() check below fail, so the encoding header is never added:

if (!$this->xml_mode && $encoding && stripos($content, '<?xml') === false) {
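To illustrate why that condition misfires, here is a minimal standalone sketch. The sample markup and the variable names are made up for the demonstration and are not taken from the library. It shows that stripos() finds '<?xml' anywhere in the content, even though the document itself does not start with an XML declaration.

```php
<?php
// Hypothetical page: plain HTML with an inline SVG that carries its own XML declaration.
$content = '<html><body><p>Grüße</p>'
    . '<?xml version="1.0" encoding="UTF-8"?><svg xmlns="http://www.w3.org/2000/svg"></svg>'
    . '</body></html>';

// The check from the condition above searches the whole document...
$has_pi_anywhere = stripos($content, '<?xml') !== false;

// ...whereas only a declaration at the very start describes the document itself.
$has_pi_at_start = stripos(ltrim($content), '<?xml') === 0;

var_dump($has_pi_anywhere); // bool(true): the inline SVG triggers the match
var_dump($has_pi_at_start); // bool(false): the page has no leading declaration
```

Because the first result is true, the `stripos(...) === false` part of the condition is false and the encoding header is skipped for this page.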

A quick fix would be to rely on the already-set $this->xml_print_pi property instead:

        if (!$this->xml_print_pi && $encoding) {
            $content = '<?xml encoding="'.$encoding.'">'.$content; // add pi node to make libxml use the correct encoding
            $xml_pi_node_added = true;
        }
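
The proposed condition can be tried outside the class with a small sketch like the one below. The function add_encoding_pi() and its $xml_print_pi parameter are hypothetical stand-ins for the method and property mentioned above; how the library actually sets that property is assumed here, not confirmed.

```php
<?php
// Standalone stand-in for the proposed condition. $xml_print_pi mirrors the
// property named in the issue; its exact semantics in the library are assumed.
function add_encoding_pi(string $content, ?string $encoding, bool $xml_print_pi): string
{
    if (!$xml_print_pi && $encoding) {
        // Prepend the pseudo-PI so libxml uses the intended encoding.
        return '<?xml encoding="' . $encoding . '">' . $content;
    }
    return $content;
}

// Hypothetical HTML page with an inline SVG that has its own XML declaration.
$html = '<html><body><?xml version="1.0" encoding="UTF-8"?><svg></svg></body></html>';

$result = add_encoding_pi($html, 'UTF-8', false);
var_dump(strpos($result, '<?xml encoding="UTF-8">') === 0); // bool(true)
```

With this condition the inline SVG declaration no longer matters, since the decision depends only on the flag and on whether an encoding is known.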
Rct567 closed this as completed in b3357e7 on Jan 18, 2024
@heldchen

Thanks a lot for the fix!
