PPStructure OCR resizes the layout result crops, causing pixel distortion and very poor recognition #10322

Closed

lycfight opened this issue Jul 7, 2023 · 1 comment

Assignees

Labels

feature request status/close

lycfight commented Jul 7, 2023 (edited)

Please provide the following information to quickly locate the problem

System Environment:
Version: Paddle: PaddleOCR: Related components:
Command Code:
Complete Error Message:
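OCR quality on the text blocks produced by PPStructure layout analysis is very poor; see #10270.
After investigation and experiments, I traced the problem to PaddleOCR enlarging the cropped image of each layout block. My guess is that the direct resize introduces pixel distortion, and none of the underlying OCR models is robust to pixel scaling. I hope a future release can address this in one of three ways: (1) add pixel-distortion hard cases to the training of the underlying OCR models; (2) avoid operations in OCR preprocessing that distort small images, and instead normalize the image size by padding blank space around them; or (3) optimize the pipeline so that OCR runs on the whole page first and the OCR results are then placed into the blocks from the layout analysis.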
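
Options (2) and (3) above can be illustrated with a minimal, library-free sketch. It is not PaddleOCR code and not necessarily how any fix was implemented. The names `pad_to_size`, `overlap_ratio`, `assign_lines` and the `min_ratio` threshold are hypothetical, images are assumed to be plain nested lists of grayscale values, and boxes are assumed to be `(x1, y1, x2, y2)` in page pixels.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in page pixels


def pad_to_size(img: List[List[int]], target_h: int, target_w: int,
                fill: int = 255) -> List[List[int]]:
    """Option (2): center a small crop on a blank canvas without rescaling it."""
    h, w = len(img), len(img[0])
    if h > target_h or w > target_w:
        raise ValueError("crop is larger than the target canvas")
    top = (target_h - h) // 2
    left = (target_w - w) // 2
    canvas = [[fill] * target_w for _ in range(target_h)]
    for y, row in enumerate(img):
        # Copy the original pixels unchanged; only the border is new.
        canvas[top + y][left:left + w] = row
    return canvas


def overlap_ratio(line: Box, region: Box) -> float:
    """Fraction of the text line's area that lies inside the region."""
    ix = min(line[2], region[2]) - max(line[0], region[0])
    iy = min(line[3], region[3]) - max(line[1], region[1])
    if ix <= 0 or iy <= 0:
        return 0.0
    area = (line[2] - line[0]) * (line[3] - line[1])
    return (ix * iy) / area if area > 0 else 0.0


def assign_lines(lines: List[Tuple[Box, str]], regions: Dict[str, Box],
                 min_ratio: float = 0.5) -> Dict[str, List[str]]:
    """Option (3): attach each full-page OCR line to the region covering it most."""
    out: Dict[str, List[str]] = {name: [] for name in regions}
    for box, text in lines:
        best = max(regions, key=lambda n: overlap_ratio(box, regions[n]),
                   default=None)
        # Lines that no region covers well enough are left unassigned.
        if best is not None and overlap_ratio(box, regions[best]) >= min_ratio:
            out[best].append(text)
    return out
```

With padding, the text pixels reach the recognizer at their original scale. With full-page OCR, the layout boxes are only used to group results, so no per-block crop is resized at all.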

paddle-bot bot assigned tink2123

shiyutang added the feature request label

RussellLuo mentioned this issue

Enhance the OCR recognition accuracy of PPStructure. #11916

Merged

Collaborator

GreatV commented Apr 24, 2024

Hi @lycfight, this issue should already be fixed in #11916. You can try it out on the main branch.

GreatV closed this as completed

paddle-bot bot added the status/close label
