White glyphs when selecting ocr-text in Evince #249
Comments
Found this: https://unix.stackexchange.com/questions/306051/tesseract-is-it-possible-to-change-font-output-in-ocred-pdf Do I really need to rebuild Tesseract for this? Is there no other way around this with OCRmyPDF? |
Does this only happen with scanned pages? |
At a glance it looks like that version of Evince can't display the Tesseract glyphless font correctly, although that would be surprising since it's been checked. Are you using Tesseract 4? What command line? Try changing the PDF renderer. Try building a regular PDF instead of PDF/A. Can you use Tesseract on an image to create a PDF and view that PDF? |
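The last two suggestions above (a regular PDF instead of PDF/A, and a PDF made by Tesseract alone) could be tried roughly as follows. This is only a sketch: the file names `scan.pdf`, `page.png`, `plain.pdf` and `tess-only` are placeholders, the helper `isolation_commands` is hypothetical, and it assumes `ocrmypdf` and `tesseract` are the executables on your PATH. It only builds and prints the command lines; nothing is executed.

```python
import shlex


def isolation_commands(scan_pdf="scan.pdf", page_image="page.png"):
    """Build (but do not run) two command lines that help isolate the problem.

    All file names are placeholders chosen for this example.
    """
    return {
        # OCRmyPDF producing a regular PDF rather than PDF/A.
        "plain_pdf": ["ocrmypdf", "--output-type", "pdf", scan_pdf, "plain.pdf"],
        # Tesseract on its own: the trailing "pdf" config makes it write tess-only.pdf.
        "tesseract_only": ["tesseract", page_image, "tess-only", "pdf"],
    }


if __name__ == "__main__":
    for label, argv in isolation_commands().items():
        # shlex.join quotes arguments so the line can be pasted into a shell.
        print(f"{label}: {shlex.join(argv)}")
```

If the Tesseract-only PDF shows the same selection glitch in the viewer, that would point away from OCRmyPDF and toward Tesseract's output or the viewer.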
Thanks for your help:
|
What Linux and version of evince? This might be something that the maintainers of Tesseract or evince need to take up. |
Evince Version: 3.26.0+14+g2a499547-1 |
I have the same problem. Looks like the problem lies in Ghostscript: tesseract-ocr/tesseract#712 |
@bitwave The Ghostscript issue is fixed now. An older version of Tesseract/Ghostscript will still be problematic, but I managed to replicate the problem with gs 9.23. I believe the issue is with Evince itself, so I reported it there. |
I'm still seeing this issue in poppler-based viewers. Is there any workaround available? |
@titaniumbones as mentioned above, change the renderer to |
Where do I add this switch ( Thanks for the help! |
Ah shoot of course you meant in ocrmypdf. That is really helpful. |
Thank you all, this helped me fix what I thought was a deficiency in OCRmyPDF. For me the boxes were black not white, and manifested in all viewers whether Evince, Okular, or PDF-Tools in Emacs. Adding --pdf-renderer hocr makes highlighted text visible again. Beautiful! |
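For anyone scripting this, the workaround reported above (`--pdf-renderer hocr`) might be wrapped like this. It is a minimal sketch, not an official recipe: the helper names `hocr_command` and `run_with_hocr` and the file names are made up for illustration, and it assumes the `ocrmypdf` executable is installed and on PATH.

```python
import shutil
import subprocess


def hocr_command(input_pdf, output_pdf):
    """Return the ocrmypdf argv that selects the hocr PDF renderer."""
    return ["ocrmypdf", "--pdf-renderer", "hocr", input_pdf, output_pdf]


def run_with_hocr(input_pdf, output_pdf):
    """Run OCRmyPDF with the hocr renderer; raises if the tool is missing or fails."""
    if shutil.which("ocrmypdf") is None:
        raise FileNotFoundError("ocrmypdf is not on PATH")
    # check=True turns a non-zero exit status into CalledProcessError.
    subprocess.run(hocr_command(input_pdf, output_pdf), check=True)


if __name__ == "__main__":
    # Placeholder file names; replace with your own scan and target path.
    print(" ".join(hocr_command("scan.pdf", "scan-ocr.pdf")))
```

The equivalent one-off shell call is simply the printed line; the wrapper only adds a check that the tool exists before running it.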
https://gitlab.freedesktop.org/poppler/poppler/merge_requests/280 @julian-klode, can you please do what the maintainer asked for in your PR? |
I'd love to, but I broke my wrist and the elbow on the other side, so I'm a bit incapacitated. |
Hope you will feel better soon :-) |
Problem in evince pdf reader:
It only happens when selecting text. Is this a display failure, or missing fonts? Otherwise the OCR text is correct.
Similar to #178?