// Check validity of input flags.
if (FLAGS_input_unicharset.empty() || FLAGS_script_dir.empty() ||
    FLAGS_output_dir.empty() || FLAGS_lang.empty()) {
  tprintf("Usage: %s --input_unicharset filename --script_dir dirname\n",
          argv[0]);
  tprintf(" --output_dir rootdir --lang lang [--lang_is_rtl]\n");
  tprintf(" [--words file --puncs file --numbers file]\n");
  tprintf("Sets properties on the input unicharset file, and writes:\n");
  tprintf("rootdir/lang/lang.charset_size=ddd.txt\n");
  tprintf("rootdir/lang/lang.traineddata\n");
  tprintf("rootdir/lang/lang.unicharset\n");
  tprintf("If the 3 word lists are provided, the dawgs are also added to");
  tprintf(" the traineddata file.\n");
  tprintf("The output unicharset and charset_size files are just for human");
  tprintf(" readability.\n");
However, the actual info displayed is
USAGE: combine_lang_model
--lang_is_rtl True if lang being processed is written right-to-left (type:bool default:false)
--pass_through_recoder If true, the recoder is a simple pass-through of the unicharset. Otherwise, potentially a compression of it (type:bool default:false)
--input_unicharset Unicharset to complete and use in encoding (type:string default:)
--script_dir Directory name for input script unicharsets (type:string default:)
--words File listing words to use for the system dictionary (type:string default:)
--puncs File listing punctuation patterns (type:string default:)
--numbers File listing number patterns (type:string default:)
--output_dir Root directory for output files (type:string default:)
--version_str Version string to add to traineddata file (type:string default:)
--lang Name of language being processed (type:string default:)
So it looks like the program is calling a common training argument parser and exiting before it reaches its own usage message.
@Shreeshrii: if you read it carefully, you will see that it prints "almost" the same information, just in a different order. The only additional information (not relevant to running the command) is:
Sets properties on the input unicharset file, and writes:
rootdir/lang/lang.charset_size=ddd.txt
rootdir/lang/lang.traineddata
rootdir/lang/lang.unicharset
If the 3 word lists are provided, the dawgs are also added to the traineddata file.
The output unicharset and charset_size files are just for human readability.
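For readers unfamiliar with the output layout described in that text, the following sketch builds the three paths from a root directory and a language name. `OutputPaths` is a hypothetical helper, not part of Tesseract, and it assumes `ddd` in the usage text stands for the numeric charset size.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: builds the three output paths named in the usage
// text. Assumption: "ddd" is the charset size written as a decimal number.
std::vector<std::string> OutputPaths(const std::string& rootdir,
                                     const std::string& lang,
                                     int charset_size) {
  const std::string base = rootdir + "/" + lang + "/" + lang;
  return {base + ".charset_size=" + std::to_string(charset_size) + ".txt",
          base + ".traineddata",
          base + ".unicharset"};
}
```

For example, `OutputPaths("out", "eng", 111)` yields `out/eng/eng.charset_size=111.txt`, `out/eng/eng.traineddata`, and `out/eng/eng.unicharset`.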
Usage instructions are given in https://github.com/tesseract-ocr/tesseract/blob/master/training/combine_lang_model.cpp#L43-58
The call that appears to reach the common training argument parser is at https://github.com/tesseract-ocr/tesseract/blob/master/training/combine_lang_model.cpp#L40
Related: #1297