Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hide .notdef glyphs in non-embedded Type1 fonts and don't ignore Widths #11528

Merged
merged 1 commit into from
Feb 6, 2020

Conversation

janpe2
Copy link
Contributor

@janpe2 janpe2 commented Jan 21, 2020

Fixes #11403
The PDF uses the non-embedded Type1 font Helvetica. Character codes 194 and 160 (Â and NBSP) are encoded as .notdef. We shouldn't show those glyphs because it seems that Acrobat Reader doesn't draw glyphs that are named .notdef in fonts like this.

In addition to testing glyphName === ".notdef", we must test also glyphName === "" because the name "" is used in core/encodings.js for undefined glyphs in encodings like WinAnsiEncoding.

The solution above hides the  characters but now the replacement character (space) appears to be too wide. I found out that PDF.js ignores font's Widths array if the font has no FontDescriptor entry. That happens in #11403, so the default widths of Helvetica were used as specified in core/metrics.js and .nodef got a width of 333. The correct width is 0 as specified by the Widths array in the PDF. Thus we must never ignore Widths.

In local testing the rendering of test bug1443140 has now changed. The PDF uses non-embedded fonts like Times-Roman and they have Widths but no FontDescriptor.

Even though the issue is labeled 4-printing, the problems appear also when rendered on-screen.

@Snuffleupagus
Copy link
Collaborator

I'll happily defer the actual review to @brendandahl, however one small suggestion:
Since you included a lot of really good information in #11528 (comment), mind adding that to the commit message as well such that it's available e.g. on the Git command line too (and not just at GitHub).

Fixes mozilla#11403
The PDF uses the non-embedded Type1 font Helvetica. Character codes 194 and 160 (`Â` and `NBSP`) are encoded as `.notdef`. We shouldn't show those glyphs because it seems that Acrobat Reader doesn't draw glyphs that are named `.notdef` in fonts like this.

In addition to testing `glyphName === ".notdef"`, we must test also `glyphName === ""` because the name `""` is used in `core/encodings.js` for undefined glyphs in encodings like `WinAnsiEncoding`.

The solution above hides the `Â` characters but now the replacement character (space) appears to be too wide. I found out that PDF.js ignores font's `Widths` array if the font has no `FontDescriptor` entry. That happens in mozilla#11403, so the default widths of Helvetica were used as specified in `core/metrics.js` and `.nodef` got a width of 333. The correct width is 0 as specified by the `Widths` array in the PDF. Thus we must never ignore `Widths`.
@janpe2 janpe2 force-pushed the type1-nonemb-notdef branch from 0374656 to 809b96b Compare January 21, 2020 19:45
@janpe2
Copy link
Contributor Author

janpe2 commented Jan 21, 2020

To justify my solution, see the source code of Apache PDFBox:
https://github.com/apache/pdfbox/blob/da6abfdd2f518fd9cbc1dfce330a7896fd7d2ed9/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/font/PDType1Font.java#L626-L633
(The comment for embedded fonts is not true.)

@Snuffleupagus
Copy link
Collaborator

/botio-linux preview

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Received

Command cmd_preview from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.67.70.0:8877/42d3fcfd1c1d9b0/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Success

Full output at http://54.67.70.0:8877/42d3fcfd1c1d9b0/output.txt

Total script time: 1.75 mins

Published

@brendandahl
Copy link
Contributor

/botio test
FYI: there will likely be some intermittent failures on chrome.

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Received

Command cmd_test from @brendandahl received. Current queue size: 0

Live output at: http://54.67.70.0:8877/a62bf82a06278dc/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_test from @brendandahl received. Current queue size: 0

Live output at: http://54.215.176.217:8877/a56426189b59dae/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Failed

Full output at http://54.215.176.217:8877/a56426189b59dae/output.txt

Total script time: 25.88 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: FAILED

Image differences available at: http://54.215.176.217:8877/a56426189b59dae/reftest-analyzer.html#web=eq.log

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Failed

Full output at http://54.67.70.0:8877/a62bf82a06278dc/output.txt

Total script time: 60.00 mins

@Snuffleupagus
Copy link
Collaborator

/botio-linux test

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Received

Command cmd_test from @Snuffleupagus received. Current queue size: 0

Live output at: http://54.67.70.0:8877/30280676c1b342b/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Failed

Full output at http://54.67.70.0:8877/30280676c1b342b/output.txt

Total script time: 19.45 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: FAILED

Image differences available at: http://54.67.70.0:8877/30280676c1b342b/reftest-analyzer.html#web=eq.log

@brendandahl
Copy link
Contributor

/botio makeref

@pdfjsbot
Copy link

pdfjsbot commented Feb 5, 2020

From: Bot.io (Windows)


Received

Command cmd_makeref from @brendandahl received. Current queue size: 0

Live output at: http://54.215.176.217:8877/27e92d59e1425b4/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Feb 5, 2020

From: Bot.io (Linux m4)


Received

Command cmd_makeref from @brendandahl received. Current queue size: 0

Live output at: http://54.67.70.0:8877/ee00492c6233b67/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Feb 5, 2020

From: Bot.io (Windows)


Failed

Full output at http://54.215.176.217:8877/27e92d59e1425b4/output.txt

Total script time: 2.54 mins

  • Lint: Passed
  • Make references: FAILED

@brendandahl
Copy link
Contributor

/botio-windows makeref

@pdfjsbot
Copy link

pdfjsbot commented Feb 5, 2020

From: Bot.io (Windows)


Received

Command cmd_makeref from @brendandahl received. Current queue size: 0

Live output at: http://54.215.176.217:8877/b8f3f2f91e6fc29/output.txt

@pdfjsbot
Copy link

pdfjsbot commented Feb 5, 2020

From: Bot.io (Linux m4)


Success

Full output at http://54.67.70.0:8877/ee00492c6233b67/output.txt

Total script time: 17.99 mins

  • Lint: Passed
  • Make references: Passed
  • Check references: Passed

@pdfjsbot
Copy link

pdfjsbot commented Feb 5, 2020

From: Bot.io (Windows)


Success

Full output at http://54.215.176.217:8877/b8f3f2f91e6fc29/output.txt

Total script time: 25.09 mins

  • Lint: Passed
  • Make references: Passed
  • Check references: Passed

@brendandahl brendandahl merged commit 09a6e17 into mozilla:master Feb 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Invalid characters are printing in PDF when we have &nbsp - NBSP (non-breaking space)
5 participants