Question on post-processing table structure with text bounding boxes #61

RobAcc22 · 2022-07-28T16:50:41Z

Hello,
I am working with the table structure detection model, applying it to table images. I extract both the structure and the text: CRAFT detects the text bounding boxes and the table-transformer model predicts the table structure. To post-process the predicted table structure, I pass the text bounding boxes to the postprocess functions.

I run into the following problem with this approach. In some table images where the text in a cell is a single character, CRAFT often groups those individual characters together, producing large text bounding boxes like the one in the image below (second column).

The issue is that when I use these bounding boxes, some of the predicted rows are enlarged so that they contain these large OCR bounding boxes. The image below shows the raw predicted rows, without any post-processing.

As you can see, the predicted rows are accurate. But when I combine the predicted table structure with the OCR bounding boxes, using the postprocess module and the objects_to_cells function, the rows change to this:

I hope it is visible that there is a green dotted row that spans the characters B to H, matching the text bounding box exactly. I have looked into this problem, and it seems to originate in the table_structure_to_cells function, in lines 810-844 of the postprocess module.

I was wondering if you could suggest a way to improve the post-processing so that this does not occur, perhaps by adding a further post-processing step or by modifying those lines of code. If you know of an algorithm that detects text better than CRAFT, I am also interested.

Many thanks in advance

bsmock · 2022-07-28T18:17:33Z

First of all, congrats on integrating OCR with the model code. This looks very well done and we hope it inspires others to do the same!

As far as your problem with the OCR is concerned, I don't see any easy way to overcome it using post-processing. If OCR does not give you a bounding box for B and C separately, you have no way to split that large text bounding box and know where B is and where C is within the box. So then you have no way to slot B and C into their correct cells using the model output.

One thing you could do is tell the post-processing code to ignore the word bounding boxes and keep its cell bounding boxes as-is. Then you could crop your input image at each cell bounding box and pass each individually to an OCR function to get the text of each cell. It sounds like a painful solution to me but could get the job done.
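The crop-and-OCR idea could look roughly like the sketch below. It is not code from this repository: `ocr_cells` and `ocr_fn` are hypothetical names, the image is assumed to expose a PIL-style `size` attribute and `crop((left, upper, right, lower))` method, and each cell is assumed to be a dict with a `bbox` of `[xmin, ymin, xmax, ymax]` in pixel coordinates.

```python
from typing import Any, Callable, Dict, List


def ocr_cells(
    image: Any,
    cells: List[Dict],
    ocr_fn: Callable[[Any], str],
    pad: int = 2,
) -> List[Dict]:
    """Crop the image at each cell box, OCR each crop, and return cell copies with 'text'.

    Assumptions (not taken from the repository): `image` behaves like a PIL image,
    and every cell carries a 'bbox' of [xmin, ymin, xmax, ymax].
    """
    width, height = image.size
    results = []
    for cell in cells:
        xmin, ymin, xmax, ymax = cell["bbox"]
        # Pad slightly so glyphs touching the cell border are not clipped,
        # then clamp the box to the image bounds.
        box = (
            max(0, int(xmin) - pad),
            max(0, int(ymin) - pad),
            min(width, int(xmax) + pad),
            min(height, int(ymax) + pad),
        )
        text = ocr_fn(image.crop(box)).strip()
        # Copy the cell so the caller's input is left untouched.
        results.append({**cell, "text": text})
    return results
```

With PyTesseract, `ocr_fn` could be something like `lambda crop: pytesseract.image_to_string(crop, config="--psm 10")`, where page segmentation mode 10 treats the crop as a single character. Whether that mode suits a given table is something to check on real data.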

In my view, the best solution would be to get better OCR. Your case is a tricky one: it's easy to understand why the OCR naively treats characters stacked vertically over each other as a single word.

You could try PyTesseract as an open source solution. I've also been very impressed with OCR from Azure Cognitive Services. I suggest giving these a try.
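If you try PyTesseract, its word-level output would need to be converted into word bounding boxes before post-processing. The sketch below assumes the dictionary returned by `pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)` and a token layout of `bbox`, `text`, `span_num`, `line_num`, and `block_num`; the function name `tesseract_data_to_tokens` is hypothetical, and the token fields should be checked against what your post-processing code actually expects.

```python
from typing import Dict, List


def tesseract_data_to_tokens(data: Dict[str, list], min_conf: float = 0.0) -> List[Dict]:
    """Convert a pytesseract image_to_data dict into a list of word tokens.

    The token layout is an assumption for illustration, not a confirmed
    contract of the post-processing code.
    """
    tokens = []
    for i, raw_text in enumerate(data["text"]):
        text = str(raw_text).strip()
        # Non-word entries (pages, blocks, lines) have empty text and conf -1.
        if not text or float(data["conf"][i]) < min_conf:
            continue
        left, top = data["left"][i], data["top"][i]
        width, height = data["width"][i], data["height"][i]
        tokens.append(
            {
                "bbox": [left, top, left + width, top + height],
                "text": text,
                "span_num": len(tokens),
                "line_num": data["line_num"][i],
                "block_num": data["block_num"][i],
            }
        )
    return tokens
```

Raising `min_conf` drops low-confidence words, which can help when stray marks or table borders are read as text.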

Cheers,
Brandon

RobAcc22 · 2022-08-01T12:52:13Z

Thanks for the answer.

The thing is, the post-processing is quite useful in other cases, so I'd prefer to keep this step. I will try to find a way to improve the OCR bounding boxes.
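One possible heuristic is to set aside text boxes that cover several predicted rows before running the post-processing, so that they cannot stretch a row. This is only a sketch under assumptions: the function names are hypothetical, rows and tokens are assumed to be dicts with a `bbox` of `[xmin, ymin, xmax, ymax]`, and the 0.5 threshold is an arbitrary starting point.

```python
from typing import Dict, List, Sequence, Tuple


def vertical_overlap(a: Sequence[float], b: Sequence[float]) -> float:
    """Length of the vertical overlap between two [xmin, ymin, xmax, ymax] boxes."""
    return max(0.0, min(a[3], b[3]) - max(a[1], b[1]))


def partition_tokens_by_row_span(
    tokens: List[Dict],
    rows: List[Dict],
    min_row_fraction: float = 0.5,
) -> Tuple[List[Dict], List[Dict]]:
    """Split tokens into (single_row, multi_row) lists.

    A token counts as covering a row when it overlaps at least
    `min_row_fraction` of that row's height. Tokens covering more
    than one row are returned separately.
    """
    single_row, multi_row = [], []
    for token in tokens:
        covered = 0
        for row in rows:
            row_height = row["bbox"][3] - row["bbox"][1]
            if row_height <= 0:
                continue  # skip degenerate rows
            fraction = vertical_overlap(token["bbox"], row["bbox"]) / row_height
            if fraction >= min_row_fraction:
                covered += 1
        (multi_row if covered > 1 else single_row).append(token)
    return single_row, multi_row
```

The `single_row` tokens could then be passed to the post-processing as usual, while the `multi_row` ones are handled separately, for example by running OCR again on that region. This does not recover where each character sits inside a merged box; it only keeps such boxes from enlarging the rows.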

Cheers,
Roberto

zackwylde-cmd · 2022-09-20T09:43:56Z

Hello,
@RobAcc22, could you share the inference code for TSR?
Thanks in advance.

bsmock added the question Further information is requested label Jul 28, 2022

lionely mentioned this issue Jun 26, 2023

Is this the correct way to generate tokens for a new example? #121

Open
