dict.get is not a function #7303

Closed
danlester opened this issue May 9, 2016 · 7 comments · Fixed by #11231

@danlester

There is a problem rendering this PDF:
ws_protectyourwork_e.pdf

Using the standard web viewer example (build/generic/web/viewer.html) it runs into an exception, showing the error:

PDF.js v1.5.232 (build: a682cce)
Message: dict.get is not a function

This happens in Chrome, Firefox, Safari on the Mac - haven't tried any others. Message on Safari is: dict.get is not a function. (In 'dict.get('Filter')', 'dict.get' is undefined)

I hope this helps. Not sure if there is a problem with the PDF but I guess it would ideally catch the error anyway. PDF seems to display OK in other software.

Thanks,

Dan

@Snuffleupagus
Collaborator

Snuffleupagus commented May 9, 2016

This is unfortunately a regression from PR #5910.

It seems to me that we need a _much_ more robust way of trying to recover valid XRef data from concatenated PDF files, rather than just relying on a simple condition[1]. (I'm actually a little bit surprised that this hasn't caused more issues in practice.)


[1]

if (typeof this.entries[m[1]] === 'undefined') {
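
To illustrate why that condition is fragile, here is a minimal sketch, not the actual PDF.js code. It assumes `m[1]` is the object number captured from an `N G obj` header; the helper name `indexFirstWins` and the toy input are made up for this example. Because only empty slots are filled, the first copy of an object number always wins, even when a later part of the file is the one that matters.

```javascript
// Hypothetical helper, not PDF.js code: index "N G obj" headers, first one wins.
function indexFirstWins(text) {
  const entries = [];
  const objRegExp = /(\d+)\s+(\d+)\s+obj\b/g;
  let m;
  while ((m = objRegExp.exec(text)) !== null) {
    const num = Number(m[1]);
    // Same shape as the condition quoted above: only fill empty slots.
    if (typeof entries[num] === 'undefined') {
      entries[num] = { offset: m.index, gen: Number(m[2]) };
    }
  }
  return entries;
}

// Two toy "files" glued together; both define object 76 with generation 0.
const first = '76 0 obj\n<< /Type /Catalog >>\nendobj\n';
const second = '76 0 obj\n<< /Filter /FlateDecode >>\nendobj\n';
const entries = indexFirstWins(first + second);
console.log(entries[76].offset); // 0, i.e. the copy from the first part
```

The second `76 0 obj` is silently ignored, so any reference that was meant to resolve inside the second part ends up pointing at unrelated data.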

@yurydelendik
Contributor

> rather than just relying on a simple condition

@Snuffleupagus Is it missing a generation check? Or should we track the latest object with a given number instead of the first?

@yurydelendik
Contributor

> PDF seems to display OK in other software.

Adobe Reader asks to re-save the opened PDF, which means the PDF was corrupted and Reader recovered it.

@Snuffleupagus
Collaborator

Snuffleupagus commented May 9, 2016

> Is it missing a generation check?

I don't think so, since off the top of my head all entries have gen === 0.
The problem here is that the PDF file is actually two separate PDF files placed in just one file, i.e. a completely busted PDF file (in the eyes of the specification).

> Or should we track the latest object with a given number instead of the first?

I'm not sure how we can solve this in general, since in this case there are e.g. two distinct 76 0 obj entries, one in the "first" part of the file and one in the "second" part.
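
As a rough way to spot this situation, the sketch below lists every `N G obj` header that appears more than once in a buffer. It is a hypothetical diagnostic, not part of PDF.js, and the function name and toy input are assumptions. Note that incrementally updated PDFs can also redefine an object number, so a duplicate is only a hint, not proof of concatenated files.

```javascript
// Hypothetical diagnostic, not PDF.js code: find "N G obj" headers seen more than once.
function findDuplicateObjects(text) {
  const offsetsByKey = new Map(); // "num gen" -> list of offsets
  const objRegExp = /(\d+)\s+(\d+)\s+obj\b/g;
  let m;
  while ((m = objRegExp.exec(text)) !== null) {
    const key = `${Number(m[1])} ${Number(m[2])}`;
    if (!offsetsByKey.has(key)) {
      offsetsByKey.set(key, []);
    }
    offsetsByKey.get(key).push(m.index);
  }
  // Keep only the keys that occur at two or more offsets.
  return [...offsetsByKey].filter(([, offsets]) => offsets.length > 1);
}

// Toy input: object 76 appears before and after a "trailer" keyword.
const toyFile = '1 0 obj\nendobj\n76 0 obj\nendobj\ntrailer\n76 0 obj\nendobj\n';
const duplicates = findDuplicateObjects(toyFile);
console.log(duplicates); // one entry, for key "76 0"
```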

@yurydelendik
Contributor

The best solution would be to determine what Reader does, I guess. However, we need to understand how the file was created and the intent of the generator. @danlester, can you provide the history of the PDF?

@yurydelendik
Contributor

yurydelendik commented May 9, 2016

> I'm not sure how we can solve this in general, since in this case there are e.g. two distinct 76 0 obj entries, one from the "first" file and one from the "second" one.

We should take the one that is placed before the trailer that has a catalog object reference. This means we should not commit to the objects we find until the next trailer is found (if no trailer is found at all, we just use what we found).
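
A minimal sketch of that idea, under several assumptions that are mine rather than anything stated in this thread: the function name is hypothetical, the trailer dictionary is matched naively up to the first `>>`, a later trailer with `/Root` overrides earlier commits, and objects after the last trailer are dropped once something has been committed. It is not PDF.js code, only an illustration of "collect, then commit at the trailer".

```javascript
// Hypothetical sketch of "commit at trailer"; not PDF.js code.
function indexWithDeferredCommit(text) {
  const committed = new Map(); // object number -> { offset, gen }
  const found = new Map(); // everything seen, used only as a fallback
  let pending = new Map(); // objects seen since the previous trailer
  // Match either an object header or a (naively delimited) trailer dictionary.
  const tokenRegExp = /(\d+)\s+(\d+)\s+obj\b|\btrailer\s*<<([\s\S]*?)>>/g;
  let m;
  while ((m = tokenRegExp.exec(text)) !== null) {
    if (m[1] !== undefined) {
      const entry = { offset: m.index, gen: Number(m[2]) };
      pending.set(Number(m[1]), entry);
      found.set(Number(m[1]), entry);
    } else {
      // Assumption: a trailer that names /Root validates the objects before it.
      if (/\/Root\b/.test(m[3])) {
        for (const [num, entry] of pending) {
          committed.set(num, entry);
        }
      }
      pending = new Map();
    }
  }
  // No trailer with /Root at all: just use what was found.
  return committed.size > 0 ? committed : found;
}

// Two toy "files" glued together, each with its own trailer and object 76.
const body = (label) =>
  '1 0 obj\n<< /Type /Catalog >>\nendobj\n' +
  `76 0 obj\n(${label})\nendobj\n` +
  'trailer\n<< /Root 1 0 R >>\n';
const partA = body('first');
const partB = body('second');
const xref = indexWithDeferredCommit(partA + partB);
console.log(xref.get(76).offset === partA.length + partB.indexOf('76 0 obj'));
```

In this toy input, object 76 resolves to the copy in the second part because its trailer is the last one carrying `/Root`. Whether "last wins" matches what Reader does is exactly the open question above.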

@danlester
Author

Sorry I missed your notification. I'm afraid I don't know much about the PDF history anyway - it wasn't my file originally. Will see if I can find anything out, but probably not.
