fix: add proper table provenance
Signed-off-by: Christoph Auer <[email protected]>
cau-git committed Feb 26, 2025
1 parent b88440a commit 483d1bf
Showing 1 changed file with 11 additions and 1 deletion.
12 changes: 11 additions & 1 deletion docling/pipeline/vlm_pipeline.py
@@ -443,7 +443,17 @@ def parse_table_content(otsl_content: str) -> TableData:

 if tag_name == DocumentToken.OTSL.value:
     table_data = parse_table_content(full_chunk)
-    doc.add_table(data=table_data)
+    bbox = extract_bounding_box(full_chunk)
+
+    if bbox:
+        prov = ProvenanceItem(
+            bbox=bbox.resize_by_scale(pg_width, pg_height),
+            charspan=(0, 0),
+            page_no=page_no,
+        )
+        doc.add_table(data=table_data, prov=prov)
+    else:
+        doc.add_table(data=table_data)

 elif tag_name == DocItemLabel.PICTURE:
     text_caption_content = extract_inner_text(full_chunk)
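
The change above attaches provenance to a table only when a bounding box can be recovered from the chunk, and scales that box to page dimensions first. A minimal, self-contained sketch of the same pattern follows. It does not import docling: `Box`, `Prov`, `extract_box`, and `table_prov` are hypothetical stand-ins, and both the `<loc_N>` token format and the 500-unit coordinate grid are assumptions, since `extract_bounding_box` is not shown in this diff.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Box:
    """Hypothetical stand-in for a bounding box (left, top, right, bottom)."""
    l: float
    t: float
    r: float
    b: float

    def resize_by_scale(self, x_scale: float, y_scale: float) -> "Box":
        # Scale horizontal edges by x_scale and vertical edges by y_scale.
        return Box(self.l * x_scale, self.t * y_scale,
                   self.r * x_scale, self.b * y_scale)


@dataclass(frozen=True)
class Prov:
    """Hypothetical stand-in for a provenance record."""
    bbox: Box
    charspan: Tuple[int, int]
    page_no: int


def extract_box(chunk: str, grid: float = 500.0) -> Optional[Box]:
    """Return a normalized box, or None unless exactly four location tokens exist.

    Assumption: locations look like <loc_N> with N on a 0..grid scale.
    """
    coords = re.findall(r"<loc_(\d+)>", chunk)
    if len(coords) != 4:
        return None
    l, t, r, b = (int(c) / grid for c in coords)
    return Box(l, t, r, b)


def table_prov(chunk: str, pg_width: float, pg_height: float,
               page_no: int) -> Optional[Prov]:
    """Build provenance for a table chunk, or None if no box was found."""
    bbox = extract_box(chunk)
    if bbox is None:
        # Mirrors the else branch: the table is added without provenance.
        return None
    return Prov(
        bbox=bbox.resize_by_scale(pg_width, pg_height),
        charspan=(0, 0),  # a table has no text span, so a zero span is used
        page_no=page_no,
    )
```

Returning `None` rather than a default box keeps the fallback explicit: the caller decides whether to add the table without provenance, as the `else` branch in the commit does.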