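I run:
# imports assumed; they were not shown in the original report
from textractcaller.t_call import call_textract, Textract_Features
from trp.trp2 import TDocumentSchema
from trp.t_pipeline import order_blocks_by_geo
import trp

j = call_textract(input_document=f"{awspath}/images/{newfile}_table.jpeg", features=[Textract_Features.TABLES])
# t_doc is not ordered yet
t_doc = TDocumentSchema().load(j)
# ordered_doc has its elements ordered by y-coordinate (top to bottom of the page)
ordered_doc = order_blocks_by_geo(t_doc)
# send to trp for further processing logic
trp_doc = trp.Document(TDocumentSchema().dump(ordered_doc))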
And get the following error:
File "/var/folders/6t/kcngxw3s50z4zg416dhcckjc0000gn/T/ipykernel_59247/1703830676.py", line 1, in <cell line: 1>
t_doc = TDocumentSchema().load(j)
File "/Users/olivergiesecke/opt/anaconda3/envs/labelstudioenv/lib/python3.9/site-packages/marshmallow/schema.py", line 719, in load
return self._do_load(
File "/Users/olivergiesecke/opt/anaconda3/envs/labelstudioenv/lib/python3.9/site-packages/marshmallow/schema.py", line 892, in _do_load
result = self._invoke_load_processors(
File "/Users/olivergiesecke/opt/anaconda3/envs/labelstudioenv/lib/python3.9/site-packages/marshmallow/schema.py", line 1090, in _invoke_load_processors
data = self._invoke_processors(
File "/Users/olivergiesecke/opt/anaconda3/envs/labelstudioenv/lib/python3.9/site-packages/marshmallow/schema.py", line 1220, in _invoke_processors
data = processor(data, many=many, **kwargs)
File "/Users/olivergiesecke/opt/anaconda3/envs/labelstudioenv/lib/python3.9/site-packages/trp/trp2.py", line 848, in make_tdocument
return TDocument(**data)
File "<string>", line 14, in __init__
File "/Users/olivergiesecke/opt/anaconda3/envs/labelstudioenv/lib/python3.9/site-packages/trp/trp2.py", line 468, in __post_init__
self._block_id_maps[blk.block_type][blk.id] = blk_i
KeyError: 'TABLE_TITLE'
For the immediate error, it appears that _block_id_maps is only initialized for the block types that are present in the document, which I believe is a bug because block_id_map(block_type) and block_map(block_type) are documented, exposed functions.
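The failure mode can be illustrated without the library. This is a minimal sketch, not trp code: `KNOWN_TYPES`, `build_index_strict`, and `build_index_lenient` are hypothetical stand-ins, and how the real maps are populated is not shown in this thread. It only demonstrates that a dict of per-type maps raises KeyError as soon as a block type has no entry, while a defaultdict creates the entry on first use.

```python
from collections import defaultdict

# Hypothetical stand-in for the block types the index has entries for.
KNOWN_TYPES = ["PAGE", "LINE", "WORD", "TABLE", "CELL"]


def build_index_strict(blocks):
    """Index blocks by type and id; fails on any type without an entry."""
    index = {block_type: {} for block_type in KNOWN_TYPES}
    for position, block in enumerate(blocks):
        # Raises KeyError when block["BlockType"] is not a key of index.
        index[block["BlockType"]][block["Id"]] = position
    return index


def build_index_lenient(blocks):
    """Same index, but per-type maps are created on first use."""
    index = defaultdict(dict)
    for position, block in enumerate(blocks):
        index[block["BlockType"]][block["Id"]] = position
    return index


# Simplified blocks; real Textract blocks carry many more fields.
blocks = [
    {"BlockType": "TABLE", "Id": "t1"},
    {"BlockType": "TABLE_TITLE", "Id": "tt1"},
]

try:
    build_index_strict(blocks)
except KeyError as err:
    print("strict index failed on", err)

lenient = build_index_lenient(blocks)
print(dict(lenient))
```

Whether the library should create the per-type maps lazily or up front for every known block type is a design choice for the maintainers; the sketch only shows why a missing entry surfaces as the KeyError in the traceback.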
However, I suspect initialising the map alone won't solve the issue, because there must be some reason the loader is searching for a TABLE_TITLE block when the TDocument state hasn't seen any.
Would you be able to share a non-confidential document/image that reproduces this issue?
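If the document itself cannot be shared, a summary of the block types in the response may still help narrow things down. A rough sketch, assuming the response is the usual Textract JSON dict with a top-level "Blocks" list (the helper name `summarize_block_types` and the sample data are made up for illustration):

```python
from collections import Counter


def summarize_block_types(response):
    """Count blocks per BlockType and relationships per Type in a response dict."""
    block_counts = Counter()
    relationship_counts = Counter()
    for block in response.get("Blocks", []):
        block_counts[block.get("BlockType", "<missing>")] += 1
        # "Relationships" may be absent or null on some blocks.
        for relationship in block.get("Relationships") or []:
            relationship_counts[relationship.get("Type", "<missing>")] += 1
    return block_counts, relationship_counts


# Hand-written sample, not real Textract output.
sample = {
    "Blocks": [
        {"BlockType": "PAGE", "Id": "p1"},
        {
            "BlockType": "TABLE",
            "Id": "t1",
            "Relationships": [
                {"Type": "CHILD", "Ids": ["c1"]},
                {"Type": "TABLE_TITLE", "Ids": ["tt1"]},
            ],
        },
        {"BlockType": "CELL", "Id": "c1"},
        {"BlockType": "TABLE_TITLE", "Id": "tt1"},
    ]
}

block_counts, relationship_counts = summarize_block_types(sample)
print(block_counts)
print(relationship_counts)
```

Running `summarize_block_types(j)` on the response from the report would show whether TABLE_TITLE blocks are actually present, without exposing any of the document text.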
_block_id_maps initialization was addressed in the linked PR, and the fix is now released on PyPI as v1.0.3.
I appreciate this issue was originally reported quite some time ago. If anybody is able to share a document that reproduces it (or, even better, to test on v1.0.3+ and confirm whether it has helped), we can dive deeper. Otherwise, we'll probably close it out.
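Before retesting, it may be worth confirming which release is installed. A small sketch, assuming the distribution name on PyPI is amazon-textract-response-parser (an assumption, adjust if your environment differs); `release_tuple` and `has_fix` are hypothetical helpers and the version parsing is deliberately naive:

```python
from importlib.metadata import PackageNotFoundError, version
from itertools import takewhile

FIXED_IN = (1, 0, 3)  # release named in this thread


def release_tuple(version_string):
    """Naively turn '1.0.3' (or '1.0.3rc1') into (1, 0, 3)."""
    parts = []
    for piece in version_string.split(".")[:3]:
        digits = "".join(takewhile(str.isdigit, piece))
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def has_fix(version_string):
    """True if the given version is at or above the release with the fix."""
    return release_tuple(version_string) >= FIXED_IN


try:
    # Assumed distribution name; not confirmed anywhere in this thread.
    installed = version("amazon-textract-response-parser")
except PackageNotFoundError:
    installed = None

if installed is None:
    print("package not installed in this environment")
else:
    print(installed, "includes the fix" if has_fix(installed) else "predates the fix")
```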