Compiled-in TESSDATA_PREFIX unused on Windows #3767
Comments
Initialization of |
If there is a |
You are right. This is easy to fix. Do you want to send a pull request which does that?
Yes, no problem, I will prepare a fix.
Fixes #3767. Co-authored-by: Stefan Weil <[email protected]>
Once again: it is not a bug but intentional design.
On any platform I can copy the data to any path and forward this to the library. On any platform, I can set the |
That's correct for 99 % of Windows users. But if someone wants to build a personal Tesseract which installs |
Current behavior:
Am I right? |
Before our fix, the Windows-exclusive behavior to try to load the files from the Now, the following priorities are used:
Before the fix, option 4 was excluded simply because option 3 was checked first, and that exclusion was Windows-exclusive. Therefore, for non-Windows platforms the behavior is the same as without this fix.
OK. After looking again, I now see that my analysis of the previous state was wrong.
Hello!
We tried to compile Tesseract including a compiled-in TESSDATA_PREFIX using a respective preprocessor definition.
However, compiling `ccutil.cpp` on Windows with `_WIN32` defined always causes the Windows-specific code to be executed and, if `datadir` is currently empty, it will be initialized as `<Path to binary>/tessdata`. Thus, the code that would initialize `datadir` according to the compile-time constant `TESSDATA_PREFIX` is not executed. Most likely this bug is not present on Unix systems because `_WIN32` is not defined there.
Environment
Current Behavior:
Compiling Tesseract with `TESSDATA_PREFIX=C:\Path\to\somewhere` defined and starting `tesseract.exe` with an attached debugger and `--list-langs` as the only command line argument does not find Tesseract's language files, even if they exist in a `tessdata` folder in the compiled-in directory.
Expected Behavior:
If no other data directory is explicitly given and no `tessdata` directory exists next to the executing binary, but the compiled-in `TESSDATA_PREFIX` does exist, it should be used.
Suggested Fix:
In my understanding of the code, the `tessdata` directory should be resolved according to the following rules:
1. `argv0` overrules everything.
2. The `TESSDATA_PREFIX` environment variable is used.
3. The `tessdata` directory next to the executing binary is used.
4. The compiled-in `TESSDATA_PREFIX` should be checked.
5. The current directory (`./`).
I think the easiest option is to move alternative 3 into a function that on Windows does the same as the `#if defined(_WIN32)` part, while on Unix it does a similar check (even though this is not currently implemented). If the `tessdata` directory next to the binary exists, it is used and saved in the `datadir` variable, and the function returns true. Then the second `else if` simply calls this function, and if it returns false, the additional `else` part uses the compiled-in `TESSDATA_PREFIX` if it is set.
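The resolution order suggested above can be sketched roughly as follows. This is only an illustration of the proposed priorities, not the code in `ccutil.cpp`: the function names `UseTessdataNextToBinary` and `ResolveDataDir`, their parameters, and the use of `std::filesystem` are all assumptions made for this example.

```cpp
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper for alternative 3: if a "tessdata" directory exists
// next to the binary, store it in datadir and return true.
bool UseTessdataNextToBinary(const fs::path &binary_dir, std::string &datadir) {
  std::error_code ec;  // avoid exceptions for inaccessible paths
  const fs::path candidate = binary_dir / "tessdata";
  if (fs::is_directory(candidate, ec)) {
    datadir = candidate.string();
    return true;
  }
  return false;
}

// Hypothetical resolver following the five rules in priority order.
// env_prefix and compiled_prefix may be nullptr or empty when unset.
std::string ResolveDataDir(const std::string &argv0, const char *env_prefix,
                           const fs::path &binary_dir,
                           const char *compiled_prefix) {
  std::string datadir;
  if (!argv0.empty()) {
    datadir = argv0;  // 1. explicitly given path overrules everything
  } else if (env_prefix != nullptr && *env_prefix != '\0') {
    datadir = env_prefix;  // 2. TESSDATA_PREFIX environment variable
  } else if (UseTessdataNextToBinary(binary_dir, datadir)) {
    // 3. tessdata next to the binary; datadir was set by the helper
  } else if (compiled_prefix != nullptr && *compiled_prefix != '\0') {
    datadir = compiled_prefix;  // 4. compiled-in TESSDATA_PREFIX
  } else {
    datadir = "./";  // 5. fall back to the current directory
  }
  return datadir;
}
```

With this shape, the compiled-in prefix is reached on every platform whenever the first three alternatives do not apply, which is the behavior the issue asks for.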