PPStructure OCR resizes layout result crops, causing pixel distortion and very poor recognition #10322

Closed
lycfight opened this issue Jul 7, 2023 · 1 comment

lycfight commented Jul 7, 2023

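Please provide the following information to quickly locate the problem

  • System Environment:
  • Version: Paddle: PaddleOCR: Related components:
  • Command Code:
  • Complete Error Message:
    OCR on the text blocks produced by PPStructure layout analysis performs very poorly; see #10270.
    After investigation and experiments, I traced the problem to PaddleOCR enlarging the image of each small block in the layout result. My guess is that the direct resize introduces pixel distortion, and the underlying OCR models are not robust to pixel scaling. I hope a future release can address this in one of three ways: add pixel-distortion hard cases to the training of the underlying OCR models; change the OCR preprocessing so that small images are not distorted, and instead normalize the image size by padding with blank space around the image; or rework the pipeline to run OCR on the whole page first and then place the OCR results into the blocks from layout analysis.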
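
To make the second and third suggestions concrete, here is a rough sketch in plain Python. It is not PaddleOCR code and not necessarily what any fix does; the function names `pad_to_canvas` and `assign_lines_to_blocks`, the `(x1, y1, x2, y2)` box format, and the grayscale nested-list image representation are all assumptions made for illustration. With real image arrays one would more likely use a border-padding routine from an image library rather than nested lists.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical axis-aligned box format: (x1, y1, x2, y2) in page pixels.
Box = Tuple[int, int, int, int]


def pad_to_canvas(crop: List[List[int]], min_h: int, min_w: int,
                  fill: int = 255) -> List[List[int]]:
    """Center a small grayscale crop on a blank canvas.

    The crop's pixels are copied unchanged (no interpolation), so the
    glyph shapes are not distorted; only blank margin is added.
    """
    h = len(crop)
    w = len(crop[0]) if h else 0
    out_h, out_w = max(h, min_h), max(w, min_w)
    top = (out_h - h) // 2
    left = (out_w - w) // 2
    canvas = [[fill] * out_w for _ in range(out_h)]
    for y, row in enumerate(crop):
        canvas[top + y][left:left + w] = row
    return canvas


def assign_lines_to_blocks(lines: Sequence[Tuple[Box, str]],
                           blocks: Sequence[Box]) -> Dict[int, List[str]]:
    """Attach each whole-page OCR line to the first layout block
    that contains the center of the line's box.

    Lines whose center falls outside every block are dropped here;
    a real pipeline would probably want to keep them somewhere.
    """
    result: Dict[int, List[str]] = {i: [] for i in range(len(blocks))}
    for (x1, y1, x2, y2), text in lines:
        cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
        for i, (bx1, by1, bx2, by2) in enumerate(blocks):
            if bx1 <= cx <= bx2 and by1 <= cy <= by2:
                result[i].append(text)
                break
    return result
```

The first function corresponds to "pad instead of resize"; the second corresponds to "OCR the whole page, then drop the results into the layout blocks". Center-in-box matching is the simplest possible rule; an overlap-ratio rule may behave better when text lines straddle block borders.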

GreatV (Collaborator) commented Apr 24, 2024

Hi @lycfight, this issue should already be fixed in #11916; you can try the main branch and see how it performs.
