Take the /CIDToGIDMap into account when getting the glyph mapping for CFF fonts (issue 15559) #15563
/botio test
From: Bot.io (Windows)
Received
Command cmd_test from @Snuffleupagus received. Current queue size: 0
Live output at: http://54.193.163.58:8877/1f6c4666e962756/output.txt
From: Bot.io (Linux m4)
Received
Command cmd_test from @Snuffleupagus received. Current queue size: 0
Live output at: http://54.241.84.105:8877/2f7cca3840a6ca4/output.txt
From: Bot.io (Linux m4)
Failed
Full output at http://54.241.84.105:8877/2f7cca3840a6ca4/output.txt
Total script time: 25.34 mins
Image differences available at: http://54.241.84.105:8877/2f7cca3840a6ca4/reftest-analyzer.html#web=eq.log
From: Bot.io (Windows)
Failed
Full output at http://54.193.163.58:8877/1f6c4666e962756/output.txt
Total script time: 29.42 mins
Image differences available at: http://54.193.163.58:8877/1f6c4666e962756/reftest-analyzer.html#web=eq.log
if (invCidToGidMap && invCidToGidMap[charCode] !== undefined) {
  charCode = invCidToGidMap[charCode];
}
I don't know much about this area, so I'm trying to understand.
The function `getGlyphMapping` builds a charCode => GID map.
The cmap is a CID <=> charCode map and `invCidToGidMap` is ultimately a GID => CID map. So, if I understand correctly, the charCode we get at l. 75 from the cmap would be a GID, and the resulting charCode at l. 78 would be a CID... Sorry, that doesn't make sense to me.
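To make the directions of these maps concrete, here is a minimal sketch of how an inverse map could be built and then consulted in the same way as the check under review. The helper names `invertCidToGidMap` and `remapCharCode` and the sample entry are hypothetical and are not taken from the pdf.js source; only the shape of the `if` check comes from the patch.

```javascript
// Hypothetical helper (not from the pdf.js source): invert a sparse
// CID => GID array into a GID => CID array.
function invertCidToGidMap(cidToGidMap) {
  const inverse = [];
  // forEach skips holes in sparse arrays, so unmapped CIDs are ignored.
  cidToGidMap.forEach((gid, cid) => {
    if (gid !== undefined) {
      // If several CIDs share a GID, the last one wins in this sketch.
      inverse[gid] = cid;
    }
  });
  return inverse;
}

// Same shape as the check under review, wrapped in a hypothetical function.
function remapCharCode(charCode, invCidToGidMap) {
  if (invCidToGidMap && invCidToGidMap[charCode] !== undefined) {
    return invCidToGidMap[charCode];
  }
  return charCode;
}

const sampleCidToGidMap = [];
sampleCidToGidMap[2] = 5; // made-up entry: CID 2 => GID 5
const sampleInverse = invertCidToGidMap(sampleCidToGidMap);
```

With this made-up entry, looking up 5 in the inverse map yields 2, while values without an entry (or a missing map) pass through unchanged.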
As mentioned in #15563 (comment), I don't really understand this well enough to know how it should work.
However, this patch is the only approach I could come up with that actually fixes the issue; trying to apply the /CIDToGIDMap elsewhere in this function doesn't work :-(
We have (25573 is the char code for the first ideograph just after FutureABC L3):
- charsets: 25573 (gid) => 40307 (cid)
- cidToGidMap: 25573 (cid) => 40307 (gid)
- charsets: 17688 (gid) => 25573 (cid)
If I remove the `/CIDToGIDMap` entry for the font in the PDF, the rendering is incorrect in both Chrome and Acrobat, so this map really matters.
One possibility is that the gid in `charsets` is an implementation detail, and that the id actually exposed is the cid from `charsets`, which could be a kind of `gid`. Hence `cidToGidMap` would map from a cid in the PDF to that kind of gid in the font.
I'm losing myself when I read the above sentence...
Anyway, your patch could make sense.
I'm not sure if we should apply `invCidToGid` to `cid` or to `charCode`, wdyt?
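The hypothesis in this comment can be traced with the quoted numbers. The sketch below is only one reading of them: it assumes the value stored in /CIDToGIDMap is really a CID of the CFF font, so the final glyph index is found by reversing `charsets`. The three numeric entries are the ones listed above; the `Map` layout and the function `resolveGlyphIndex` are hypothetical and do not reflect how pdf.js stores this data.

```javascript
// Entries quoted in the comment above; the data layout is illustrative only.
// CFF charsets: glyph index => CID inside the font.
const charsets = new Map([
  [25573, 40307],
  [17688, 25573],
]);
// /CIDToGIDMap from the PDF: CID => "GID".
const cidToGidMap = new Map([[25573, 40307]]);

// Reverse charsets so that a CID of the font leads back to a glyph index.
const cidToGlyphIndex = new Map();
for (const [glyphIndex, cid] of charsets) {
  cidToGlyphIndex.set(cid, glyphIndex);
}

// Hypothetical function: resolve a CID from the PDF to a glyph index,
// optionally going through /CIDToGIDMap first.
function resolveGlyphIndex(pdfCid, useCidToGidMap) {
  const fontCid =
    useCidToGidMap && cidToGidMap.has(pdfCid)
      ? cidToGidMap.get(pdfCid)
      : pdfCid;
  return cidToGlyphIndex.get(fontCid);
}

const withMap = resolveGlyphIndex(25573, true); // 25573
const withoutMap = resolveGlyphIndex(25573, false); // 17688
```

Under this reading, ignoring the map selects glyph index 17688 instead of 25573, which would be consistent with the incorrect rendering seen when the entry is removed.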
And it's interesting to read the pdfium code:
https://pdfium.googlesource.com/pdfium/+/refs/heads/main/core/fpdfapi/font/cpdf_cidfont.cpp#689
The value from `CIDToGIDMap` is returned here: https://pdfium.googlesource.com/pdfium/+/refs/heads/main/core/fpdfapi/font/cpdf_cidfont.cpp#824
It seems to be the `gid`, and the only way to get the correct final glyph id is to reverse the value from charsets.
And we should really add some comments to explain why we're doing that.
> I'm losing myself when I read the above sentence...

Yeah, all of this is really difficult to get your head around. Looking through both the PDF and the CFF specifications doesn't really help either.
According to the relevant section in the PDF spec, it even sounds like these fonts shouldn't be using a /CIDToGIDMap at all (although it wouldn't be the first time that the PDF spec is incomplete/incorrect), since the relevant fonts are actually CFF and not TrueType (excerpt below):

> This entry may appear only in a Type 2 CIDFont whose associated TrueType font program is embedded in the PDF file.

> I'm not sure if we should apply `invCidToGid` to `cid` or to `charCode`, wdyt ?

Based on quick testing, changing that doesn't seem to work, unfortunately.

> And we should really add some comments to explain why we're doing that.

I'm really struggling to explain what's happening here; do you have any suggestions as to its wording?
> I'm really struggling to explain what's happening here, do you have any suggestions as to its wording?

If only I were able to...
Maybe just copy/paste the example I put in #15563 (comment), together with something like "it seems that the GID in CIDToGIDMap corresponds to the CID in the CFF font".
I've tried to add a brief comment, and hopefully anyone curious about further details can reference this issue via `blame` (adding details from a specific font and glyph to the code doesn't really feel all that much clearer, at least to me).
… CFF fonts (issue 15559) *Please note:* I don't really know what I'm doing here, however the patch appears to fix the referenced issue when comparing the rendering with Adobe Reader (with the caveat that I don't speak the language in question).
0ca7105 to 858d941
/botio-linux preview
From: Bot.io (Linux m4)
Received
Command cmd_preview from @Snuffleupagus received. Current queue size: 0
Live output at: http://54.241.84.105:8877/99518b1f4a06daa/output.txt
From: Bot.io (Linux m4)
Success
Full output at http://54.241.84.105:8877/99518b1f4a06daa/output.txt
Total script time: 2.09 mins
Published
It's fixing a bug and there are no regressions, so it's likely not that bad.
/botio makeref
From: Bot.io (Linux m4)
Received
Command cmd_makeref from @Snuffleupagus received. Current queue size: 0
Live output at: http://54.241.84.105:8877/e9ead3d2ee0b9f7/output.txt
From: Bot.io (Linux m4)
Success
Full output at http://54.241.84.105:8877/e9ead3d2ee0b9f7/output.txt
Total script time: 22.07 mins
@calixteman It looks like the Windows bot has stopped responding altogether, can you please take a look?
Please note: I don't really know what I'm doing here, however the patch appears to fix the referenced issue when comparing the rendering with Adobe Reader (with the caveat that I don't speak the language in question).