Support UTF-16 little-endian strings in the `stringToPDFString` helper function (bug 1593902) #11307
/botio test

From: Bot.io (Windows) - Received
Command cmd_test from @Snuffleupagus received. Current queue size: 0
Live output at: http://54.215.176.217:8877/55be147199747ae/output.txt

From: Bot.io (Linux m4) - Received
Command cmd_test from @Snuffleupagus received. Current queue size: 0
Live output at: http://54.67.70.0:8877/3e33c69275b4164/output.txt

From: Bot.io (Linux m4) - Success
Full output at http://54.67.70.0:8877/3e33c69275b4164/output.txt
Total script time: 18.65 mins

From: Bot.io (Windows) - Success
Full output at http://54.215.176.217:8877/55be147199747ae/output.txt
Total script time: 26.61 mins
…r function (bug 1593902)

The bug report seems to suggest that we don't support UTF-16 strings with a BOM (byte order mark), which we *actually* do, as evidenced by both the code and a unit-test. The issue at play here is rather that we previously only supported the big-endian UTF-16 BOM, and the `Title` string in the PDF document is using a *little-endian* UTF-16 BOM instead.

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1593902
6c78b5d to 80342e2
Forgot to extend one of the existing unit-tests...

/botio unittest

From: Bot.io (Linux m4) - Invalid
Command not implemented:

From: Bot.io (Windows) - Invalid
Command not implemented:

From: Bot.io (Linux m4) - Received
Command cmd_unittest from @Snuffleupagus received. Current queue size: 0
Live output at: http://54.67.70.0:8877/b102711ac88ee97/output.txt

From: Bot.io (Windows) - Received
Command cmd_unittest from @Snuffleupagus received. Current queue size: 0
Live output at: http://54.215.176.217:8877/3dbfc51b236a1fd/output.txt

From: Bot.io (Linux m4) - Success
Full output at http://54.67.70.0:8877/b102711ac88ee97/output.txt
Total script time: 2.66 mins

From: Bot.io (Windows) - Success
Full output at http://54.215.176.217:8877/3dbfc51b236a1fd/output.txt
Total script time: 5.28 mins
/botio-linux preview

From: Bot.io (Linux m4) - Received
Command cmd_preview from @Snuffleupagus received. Current queue size: 0
Live output at: http://54.67.70.0:8877/ed28aaa58caba3d/output.txt

From: Bot.io (Linux m4) - Success
Full output at http://54.67.70.0:8877/ed28aaa58caba3d/output.txt
Total script time: 1.69 mins
Published

Nice find!
The bug report seems to suggest that we don't support UTF-16 strings with a BOM (byte order mark), which we actually do, as evidenced by both the code and a unit-test.
The issue at play here is rather that we previously only supported the big-endian UTF-16 BOM, and the `Title` string in the PDF document is using a little-endian UTF-16 BOM instead.

Fixes https://bugzilla.mozilla.org/show_bug.cgi?id=1593902
Edit: The PDF spec only mentions big-endian UTF-16 as supported (in addition to `PDFDocEncoding`, of course); see https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf#G6.1957385. Hence this is actually a case where a PDF generator is (yet again) creating corrupt/invalid data.
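
For readers unfamiliar with the BOM distinction, here is a minimal sketch of the idea, not the actual patch. The function name `decodeUtf16WithBom` is hypothetical, and the `PDFDocEncoding` fallback that the real `stringToPDFString` helper performs is deliberately omitted (strings without a BOM are returned unchanged here). It assumes the input is a "byte string" in which each character holds one byte.

```javascript
// Hypothetical helper for illustration only; not the pdf.js implementation.
// Input: a string whose characters each represent one byte (0x00-0xFF).
function decodeUtf16WithBom(str) {
  const n = str.length;
  let bigEndian;
  if (n >= 2 && str[0] === "\xFE" && str[1] === "\xFF") {
    bigEndian = true; // FE FF: big-endian BOM (the only form the PDF spec mentions)
  } else if (n >= 2 && str[0] === "\xFF" && str[1] === "\xFE") {
    bigEndian = false; // FF FE: little-endian BOM (emitted by some generators)
  } else {
    // Assumption: no BOM means "not UTF-16". The real helper would map the
    // bytes through PDFDocEncoding here; this sketch skips that step.
    return str;
  }
  const chars = [];
  // Skip the two BOM bytes, then combine each byte pair into one code unit.
  // A dangling odd byte at the end is ignored.
  for (let i = 2; i + 1 < n; i += 2) {
    const first = str.charCodeAt(i);
    const second = str.charCodeAt(i + 1);
    const codeUnit = bigEndian ? (first << 8) | second : (second << 8) | first;
    chars.push(String.fromCharCode(codeUnit));
  }
  return chars.join("");
}

// Usage: the same two characters encoded with each byte order.
console.log(decodeUtf16WithBom("\xFE\xFF\x00\x54\x00\x69")); // prints "Ti"
console.log(decodeUtf16WithBom("\xFF\xFE\x54\x00\x69\x00")); // prints "Ti"
```

The only difference between the two branches is which byte of each pair is treated as the high byte, which is why supporting the little-endian BOM is a small change even though the spec does not require it.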