Explore the docs
View Demo
·
Report Bug
·
Request Feature
The HICMA Dataset Benchmarking Tool is a powerful utility designed to assess the performance of Optical Character Recognition (OCR) models on the HICMA Dataset. This tool is intended for researchers, developers, and data scientists working with OCR technologies, providing them with valuable insights into the accuracy and efficiency of their OCR models.
- Dataset Evaluation: The OCR Benchmarking Tool allows users to easily evaluate the performance of their OCR models on the HICMA Dataset. By providing a standardized and consistent evaluation environment, users can compare different OCR systems objectively.
- Metrics and Statistics: The tool offers a comprehensive set of evaluation metrics, including character error rate (CER), word error rate (WER), and Levenshtein ratio (see the sketch after this list). It also provides statistical summaries to understand model performance across different images and fonts.
- Configurable Parameters: Users can customize evaluation parameters to suit their specific requirements. These parameters include character recognition confidence thresholds, choice of metrics, and image pre-processing options.
- Visualizations: The HICMA OCR Benchmarking Tool generates interactive visualizations and charts to help users visualize the results more intuitively. These visualizations aid in identifying patterns and areas for improvement in the OCR models (TODO).
- Easy Integration: The tool is designed to integrate seamlessly with popular OCR frameworks and libraries (Tesseract OCR, Kraken, EasyOCR, etc.), making it convenient for users to benchmark their existing models without significant code modifications.
- Reproducibility: The tool ensures reproducibility by saving evaluation results and summaries. This allows users to review and share their experiments with others easily.
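For reference, below is a minimal Python sketch of how these metrics are commonly defined; it is an illustration only, and the tool's own implementation (tokenization, normalization, and the exact Levenshtein-ratio formula) may differ.

def levenshtein(ref, hyp):
    """Edit distance between two sequences (characters or word tokens)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def cer(ref, hyp):
    """Character error rate: character edit distance over reference length."""
    return levenshtein(list(ref), list(hyp)) / max(len(ref), 1)

def wer(ref, hyp):
    """Word error rate: the same edit distance over whitespace-split words."""
    ref_words, hyp_words = ref.split(), hyp.split()
    return levenshtein(ref_words, hyp_words) / max(len(ref_words), 1)

def levenshtein_ratio(ref, hyp):
    """Similarity in [0, 1]; exact normalization varies between libraries."""
    longest = max(len(ref), len(hyp), 1)
    return 1.0 - levenshtein(list(ref), list(hyp)) / longest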
For more information regarding the dataset and its benchmarking tools, check out the paper accepted at ArabicNLP 2023 (co-located with EMNLP 2023), found here.
To get a local copy up and running follow these simple steps:
Depending on the OCR model you are benchmarking, make sure to set up the environment according to that model's documentation:
- Tesseract OCR: Set up Tesseract OCR on your machine by following the documentation here, and provide the link to the pretrained model you would like to use. Links to pretrained Tesseract OCR models on Arabic text used in this benchmark can be found here. (A quick sanity check for the Tesseract installation follows this list.)
- Kraken: Provide the link to the pretrained model you would like to use. Links to pretrained Kraken models on Arabic text used in this benchmark can be found here.
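Before editing any configuration, it can help to confirm that the Tesseract executable is actually reachable. The snippet below is a small, hypothetical Python check (it is not part of the benchmarking tool); Kraken and EasyOCR are installed as Python packages, so importing them is usually check enough there.

import shutil
import subprocess

# Hypothetical sanity check: locate the Tesseract executable and print its version.
tesseract = shutil.which("tesseract")
if tesseract is None:
    print("Tesseract not found on PATH; install it or note its full path for TesseractOCR_path.")
else:
    result = subprocess.run([tesseract, "--version"], capture_output=True, text=True)
    # Older Tesseract builds print the version banner to stderr instead of stdout.
    print(f"Found {tesseract}: {(result.stdout or result.stderr or 'unknown version').splitlines()[0]}")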
- Clone the repo
git clone https://github.com/anisdismail/HICMA-benchmark
- Change to the project repository:
cd HICMA-benchmark
- Run the following command to install the required packages into the environment:
pip install -r requirements.txt
- To start the benchmark script, run the following command:
python benchmark.py --config config.json
You can modify the experiment parameters in the config.json file:
{
"general": {
"save_dir": "",
"metrics": [],
"model": "",
"data_dir": ""
},
"image_processing": {
"image_processing_width": null,
"image_processing_height": null,
"binarize": true,
"grayscale": true
}
}
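As a rough illustration only, the sketch below shows what the image_processing options typically correspond to, using Pillow; the binarization threshold (128) and the order of operations are assumptions, and the tool's own preprocessing pipeline may differ.

from PIL import Image  # Pillow

def preprocess(path, width=None, height=None, grayscale=True, binarize=True):
    """Approximate the config's image_processing options on a single image."""
    img = Image.open(path)
    if width and height:
        img = img.resize((width, height))  # image_processing_width / image_processing_height
    if grayscale:
        img = img.convert("L")             # grayscale: true
    if binarize:
        # Simple global threshold at 128 (assumed value); yields a 1-bit image.
        img = img.convert("L").point(lambda p: 255 if p > 128 else 0, mode="1")
    return img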
Please note that the benchmark tool works with only one model at a time, so choose exactly one of the following model blocks and add it to config.json (a sketch of a complete config follows the blocks below). If you provide more than one model in the same config file, the benchmarking script will only use the first.
"TesseractOCR": {
"tesseract_psm": 13,
"tesseract_oem": 1,
"trained_model_url": ,
"TesseractOCR_path":
},
"Kraken": {
"kraken_url": [],
"text_direction": "horizontal-lr"
},
"EasyOCR": {
"easyocr_detail": 0
}
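To make the "one model at a time" rule concrete, here is a minimal Python sketch that assembles and writes a complete config.json for a Tesseract OCR run; the directory paths, the empty URL, and the executable path are placeholders to replace with your own setup, and the model name is assumed to match the block name above.

import json

config = {
    "general": {
        "save_dir": "results/tesseract_run",  # placeholder output directory
        "metrics": ["CER", "WER", "Levenshtein_ratio"],
        "model": "TesseractOCR",              # exactly one model block below
        "data_dir": "data/HICMA",             # placeholder path to the dataset
    },
    "image_processing": {
        "image_processing_width": None,
        "image_processing_height": None,
        "binarize": True,
        "grayscale": True,
    },
    "TesseractOCR": {
        "tesseract_psm": 13,
        "tesseract_oem": 1,
        "trained_model_url": "",   # URL of the pretrained Arabic model you chose
        "TesseractOCR_path": "",   # path to the tesseract executable
    },
}

with open("config.json", "w", encoding="utf-8") as f:
    json.dump(config, f, indent=2)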
- Alternatively, you can pass the parameters directly as command-line arguments, which the script parses in the same way. For example:
python benchmark.py --save_dir save_dir --metrics "CER" "WER" "Levenshtein_ratio" --model "model" --data_dir "" --image_processing_width null --image_processing_height null --binarize true --grayscale true --tesseract_psm 13 --tesseract_oem 1 --trained_model_url "" --TesseractOCR_path "" --kraken_url "" --text_direction "horizontal-lr" --easyocr_detail 0
Please note that you should specify only one model at a time on the command line. If you provide more than one model, the benchmarking script will only use the first.
- You can also check all the possible parameters with their corresponding description using the following command:
python benchmark.py --help
It will generate the following output:
options:
--config CONFIG
Configuration file path
--save_dir SAVE_DIR
Save directory
--metrics METRICS [METRICS ...]
List of metrics for benchmarking (e.g.,
WER, CER, Levenshtein ratio)
--model MODEL
Model to be evaluated (e.g., Tesseract OCR,
KrakenOCR, EasyOCR)
--data_dir DATA_DIR
Path to the data directory containing the dataset
--tesseract_psm TESSERACT_PSM
Tesseract OCR page segmentation mode (PSM)
--tesseract_oem TESSERACT_OEM
Tesseract OCR engine mode (OEM)
--trained_model_url TRAINED_MODEL_URL
Tesseract OCR trained model url
--TesseractOCR_path TESSERACTOCR_PATH
Tesseract OCR executable path
--kraken_url KRAKEN_URL
Kraken OCR trained model url
--text_direction TEXT_DIRECTION
Text direction for Kraken OCR (e.g. horizontal-lr)
--easyocr_detail EASYOCR_DETAIL
EasyOCR detail level
--image_processing_width IMAGE_PROCESSING_WIDTH
Width for image processing
--image_processing_height IMAGE_PROCESSING_HEIGHT
Height for image processing
--binarize Binarize image
--grayscale Convert image to grayscale
Visualizations: The HICMA OCR Benchmarking Tool should generate interactive visualizations and charts to help users visualize the results more intuitively. These visualizations aid in identifying patterns and areas for improvement in the OCR models. See the open issues for an extended list of proposed features (and known issues).
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
The HICMA dataset and its benchmark are publicly available and are published for research purposes under the Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license. See LICENSE for more information.