feat(ocr): added support for PaddleOCR engine #393

Closed
wants to merge 8 commits into from

@Swaymaw Swaymaw commented Nov 20, 2024

  • Added the PaddleOCR model as an OCR engine option.
  • Added options for configuring the PaddleOCR model during document conversion via pipeline options.
  • Updated documentation, added tests, and updated dependencies (extras) to reflect the added engine support.
  • Updated examples to demonstrate the use of PaddleOcrOptions.

This change lets users work with the PaddleOCR engine, which provides higher accuracy and performance in use cases that involve complex PDF files.

Checklist:

  • Commit Message Formatting: Commit titles and messages follow the Conventional Commits guidelines.
  • Documentation has been updated, if necessary.
  • Examples have been added, if necessary.
  • Tests have been added, if necessary.

mergify bot commented Nov 20, 2024

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?:
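As a rough illustration of what this rule checks, the pattern can be tried locally. This is a minimal sketch in Python, assuming Mergify's `~=` operator behaves like a regular-expression search; the function name `is_conventional_title` is hypothetical.

```python
import re

# Pattern copied from the merge protection rule above.
CONVENTIONAL_TITLE = re.compile(
    r"^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?:"
)


def is_conventional_title(title: str) -> bool:
    """Return True if the title starts with a conventional-commit prefix."""
    return CONVENTIONAL_TITLE.search(title) is not None


if __name__ == "__main__":
    # The title of this PR, followed by the same title without its prefix.
    print(is_conventional_title("feat(ocr): added support for PaddleOCR engine"))
    print(is_conventional_title("added support for PaddleOCR engine"))
```

The scope in parentheses is optional, so a title such as `fix: typo` is accepted as well.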

pyproject.toml Outdated
@@ -95,6 +95,7 @@ torchvision = [

[tool.poetry.extras]
tesserocr = ["tesserocr"]
paddleocr = ["paddlepaddle", "paddleocr"]
Contributor

When adding extras, the packages should also be listed in the main dependencies, with the optional=true flag.

Contributor Author

[Screenshot, 2024-11-20 at 7:49:18 PM]

Hey, I am getting this numpy version issue. However, in my environment I have:

  • numpy == 1.26.4
  • deepsearch-glm == 0.26.1
  • paddlepaddle == 2.6.2
  • paddleocr == 2.9.1

The library seems to be working fine, and all the tests pass as well. Can you please help me resolve this?
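When comparing against what the dependency resolver reports, it can help to print the versions actually installed in the active environment. A minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical, and it only reports what is installed, it does not explain the resolver conflict itself.

```python
from importlib.metadata import PackageNotFoundError, version

# Packages named in the comment above.
PACKAGES = ["numpy", "deepsearch-glm", "paddlepaddle", "paddleocr"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(PACKAGES).items():
        print(f"{name}: {ver or 'not installed'}")
```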

Contributor

  • Which Python version?
  • Can you please try rebasing onto the latest main branch? We just merged something about numpy conflicts as well.

Contributor Author

@Swaymaw Swaymaw Nov 20, 2024

These are the lines I added in pyproject.toml to add the necessary libraries for paddle_ocr:
[Screenshot, 2024-11-20 at 11:06:42 PM]
This is the error I am getting:
[Screenshot, 2024-11-20 at 11:08:43 PM]

I have rebased onto the latest main branch, but I am still facing the numpy version conflict between paddleocr and deepsearch-glm. For now, I have removed paddlepaddle and paddleocr from the requirements and the extras in pyproject.toml; a simple workaround is to install these libraries manually whenever someone wants to use the PaddleOCR engine. We have instructions for this in installation.md, and if someone tries to use the engine without the libraries installed, we raise an import error that includes the installation instructions.
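The import-error workaround described here could be sketched roughly as follows. This is not the PR's actual code: the helper name `require_optional` and the hint text are hypothetical, and the pip command simply names the two packages mentioned above.

```python
import importlib


def require_optional(module_name: str, install_hint: str):
    """Import an optional dependency, or fail with installation instructions."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is not installed. {install_hint}"
        ) from exc


# Hypothetical hint; the PR's actual instructions live in installation.md.
PADDLE_HINT = "Install it manually, e.g. `pip install paddlepaddle paddleocr`."

if __name__ == "__main__":
    try:
        require_optional("paddleocr", PADDLE_HINT)
        print("paddleocr is available")
    except ImportError as err:
        print(err)
```

Deferring the import until the engine is requested keeps the base install free of the conflicting packages, at the cost of the failure surfacing at runtime rather than at install time.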

@Glider95

Hello,

Would a RapidOCR implementation be possible too? (It is a wrapper around PaddleOCR and a lot easier to install!)

@PeterStaar-IBM
Contributor

Hello,

Would a RapidOCR implementation be possible too? (It is a wrapper around PaddleOCR and a lot easier to install!)

What is the added delta with RapidOCR compared to PaddleOCR?

Signed-off-by: Swaymaw <[email protected]>
@Swaymaw
Contributor Author

Swaymaw commented Nov 22, 2024

Hello,
Would a RapidOCR implementation be possible too? (It is a wrapper around PaddleOCR and a lot easier to install!)

What is the added delta with RapidOCR compared to PaddleOCR?

It is just the poetry.lock file; nothing much has changed code-wise.

@dolfim-ibm
Contributor

@Swaymaw we will check if adding the packages as extras works for us. Meanwhile, can you please make sure to add those manual dependencies to the CI tests?

@ezscode

ezscode commented Nov 26, 2024

Hope this merges successfully!

@dolfim-ibm
Contributor

Hope this merges successfully!

@ezscode I think this PR will be superseded by #415

@Swaymaw
Contributor Author

Swaymaw commented Nov 26, 2024

@dolfim-ibm Should I close this pull request to avoid any confusion?

@dolfim-ibm
Contributor

@dolfim-ibm Should I close this pull request to avoid any confusion?

@Swaymaw yes, I'm closing as discussed in #415.

@dolfim-ibm dolfim-ibm closed this Nov 26, 2024