Orientation detection "asymmetrical" #4116
Comments
Thank you for that test. Maybe that issue is related to #3021. Could you please try running |
https://digi.ub.uni-heidelberg.de/diglitData/v/ocr-orientation-test--logs.zip . A lot are 0 bytes ?! |
Added log file size to table. Does not correlate. |
The hocr output contains the skew angle of the text lines. You can try to use this info and manually reskew the image and then rerun Tesseract. |
#4070 allows for retrieving the skew calculated by Tesseract without running recognition. If you use this information to rotate the page, you will find this closes most of the accuracy gap between Tesseract and Abbyy. In my experience, Abbyy blows Tesseract out of the water in real-world usage; however, this is 90% attributable to the fact that Abbyy automatically corrects skew but Tesseract does not. If you rotate each image by the skew angle calculated by Tesseract prior to running Tesseract recognition, Tesseract performs (almost) comparably to Abbyy on high-quality documents. |
Image preprocessing (including deskewing) has been a suggested technique in the Tesseract docs for a year... |
Perhaps the asymmetry in recognition quality between + and - angles simply has to do with the traineddata model? |
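As a minimal sketch of the hOCR route suggested above: it assumes the per-line `baseline <slope> <offset>` property that hOCR output carries in each line's `title` attribute, and takes the median slope as the page skew. The function name `estimate_skew_degrees` is hypothetical, and this is not the API added in #4070.

```python
import math
import re
from statistics import median

# Matches the "baseline <slope> <offset>" property of an hOCR title attribute.
BASELINE_RE = re.compile(r"baseline\s+(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)")


def estimate_skew_degrees(hocr_text):
    """Return the median baseline angle in degrees, or None if no baseline is found.

    Assumption: every text line reports a linear baseline whose first number
    is the slope in image coordinates (y grows downwards).
    """
    slopes = [float(m.group(1)) for m in BASELINE_RE.finditer(hocr_text)]
    if not slopes:
        return None
    # The median is less sensitive to a few badly segmented lines than the mean.
    return math.degrees(math.atan(median(slopes)))
```

The returned angle can then be used to rotate the image before rerunning Tesseract. The sign of the correction depends on the rotation tool's convention, so it is worth checking on an image with known skew first.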
Did you try both the |
I've used only deu.traineddata md5sum f5488b7c3186e822e0e6c5c05c1aaf1f size 15437534 |
I tend to close this issue, and I think it is important to remind users that no deskew is performed by Tesseract.
Error count for tesseract 5.3.3 (-l deu) with angles from -5 to +5 degrees (positive = clockwise) on the first page of https://digi.ub.uni-heidelberg.de/diglitData/v/layout-fouche.pdf (rendered b/w at 400 dpi). It seems that primary segmentation has problems with rotated images.
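For anyone who wants to reproduce such error counts without Perl, here is a rough sketch. The issue counts errors with `sdiff()` from Algorithm::Diff; this version swaps in Python's `difflib.SequenceMatcher` and compares whitespace-separated words. Both the word-level unit and the helper name `count_errors` are assumptions, so the numbers will not necessarily match the ones reported in this thread.

```python
from difflib import SequenceMatcher


def count_errors(ground_truth, ocr_output):
    """Count differing words between ground truth and OCR text.

    Each non-matching region contributes the larger of its two sides,
    which approximates counting the non-"unchanged" rows of an sdiff.
    """
    gt_words = ground_truth.split()
    ocr_words = ocr_output.split()
    # autojunk=False keeps frequent words from being treated as junk on long texts.
    matcher = SequenceMatcher(None, gt_words, ocr_words, autojunk=False)
    errors = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            errors += max(i2 - i1, j2 - j1)
    return errors
```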
Current Behavior
I ran a two-column German text (portrait and landscape) through Tesseract at (ImageMagick) angles 0°, 90°, 180°, 270°, each ±3°, partially with ±0.1° jitter.
PDF files (converted to .tif (400dpi, group4, using ImageMagick with options -flatten + +repage)) (Text from Wikipedia CC BY-SA 4.0): https://digi.ub.uni-heidelberg.de/diglitData/v/gt-portrait.pdf https://digi.ub.uni-heidelberg.de/diglitData/v/gt-landscape.pdf
OCR'd .tifs (tesseract: --psm 1): https://digi.ub.uni-heidelberg.de/diglitData/v/ocr-orientation-test.zip
The following table contains the number of errors (according to sdiff() of perl module Algorithm::Diff):
Expected Behavior
I expected that an 87° rotation would have nearly the same number of errors as 93°, but 93° has far more errors than 87°. The same holds around 0°, 180°, and 270°.
(Abbyy is much better at this, btw)
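A small sketch of how the mirrored-angle comparison could be scripted. It only builds the angle pairs and the command lines; the helper names are hypothetical, `convert` assumes the ImageMagick 6 binary name, and the error counts have to come from your own runs.

```python
def mirrored_angle_pairs(bases=(0, 90, 180, 270), offset=3):
    """Return (base - offset, base + offset) pairs, e.g. (87, 93) for base 90."""
    return [(base - offset, base + offset) for base in bases]


def rotate_command(src, dst, angle):
    # ImageMagick rotates clockwise for positive -rotate angles.
    return ["convert", src, "-rotate", str(angle), dst]


def ocr_command(image, out_base, lang="deu"):
    # --psm 1: automatic page segmentation with orientation and script detection.
    return ["tesseract", image, out_base, "-l", lang, "--psm", "1"]


def asymmetry(errors_by_angle, pairs):
    """Map each (low, high) pair to errors(high) - errors(low).

    A symmetric recognizer would give values near zero for every pair.
    """
    return {(lo, hi): errors_by_angle[hi] - errors_by_angle[lo] for lo, hi in pairs}
```

Each command list can be passed to `subprocess.run(cmd, check=True)`. Keeping the commands as lists avoids shell quoting problems with file names.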
Suggested Fix
none
tesseract -v
tesseract 5.3.1
leptonica-1.79.0
libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 2.0.3) : libpng 1.6.37 : libtiff 4.1.0 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.1
Found AVX512BW
Found AVX512F
Found AVX2
Found AVX
Found FMA
Found SSE4.1
Found libarchive 3.4.0 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.8 liblz4/1.9.2 libzstd/1.4.4
Found libcurl/7.68.0 NSS/3.49.1 zlib/1.2.11 brotli/1.0.7 libidn2/2.2.0 libpsl/0.21.0 (+libidn2/2.2.0) libssh/0.9.3/openssl/zlib nghttp2/1.40.0 librtmp/2.3
Operating System
Ubuntu 20.04 Focal
Other Operating System
No response
uname -a
Linux XXXXXXX 5.4.0-155-generic #172-Ubuntu SMP Fri Jul 7 16:10:02 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Compiler
gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
CPU
Intel(R) Xeon(R) Gold 6150 CPU @ 2.70GHz
Virtualization / Containers
no
Other Information
No response