Restructure ocrs --json output to match HierText format #28

Merged
merged 1 commit into from
Jan 2, 2024

Conversation

robertknight
Owner

Restructure the JSON output produced by `ocrs --json` to match the hierarchical format used by HierText. In the process, this adds the recognized text and changes the coordinates from an axis-aligned bounding box (`[left, top, bottom, right]`) to a more flexible list of `[x, y]` vertex coordinates.

  • Change the structure from two levels ("paragraphs" => "words") to three levels ("paragraphs" => "lines" => "words")

  • Include the recognized text in the output for each line and word in the "text" property.

  • Output coords as vertices of the rotated rect instead of a bounding box.
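
To make the new shape concrete, here is a minimal Python sketch of how a consumer might walk the three-level output and recover an axis-aligned box from the vertex list. Only the "paragraphs", "lines", "words" and "text" names come from the description above; the "vertices" key, the sample values, and the helper functions `bounding_box` and `iter_words` are assumptions made for illustration and may not match the exact output of `ocrs --json`.

```python
import json

# Hypothetical sample shaped like the three-level output described above.
# The "vertices" key is an assumption (borrowed from HierText-style naming);
# the coordinate values are made up.
sample = {
    "paragraphs": [
        {
            "lines": [
                {
                    "text": "Hello world",
                    "vertices": [[10, 20], [120, 22], [119, 42], [9, 40]],
                    "words": [
                        {"text": "Hello", "vertices": [[10, 20], [60, 21], [59, 41], [9, 40]]},
                        {"text": "world", "vertices": [[70, 21], [120, 22], [119, 42], [69, 41]]},
                    ],
                }
            ]
        }
    ]
}


def bounding_box(vertices):
    """Return the axis-aligned box enclosing the vertices.

    The (left, top, right, bottom) order is a choice made for this sketch.
    """
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return min(xs), min(ys), max(xs), max(ys)


def iter_words(doc):
    """Yield (line_text, word) pairs by walking paragraphs => lines => words."""
    for paragraph in doc["paragraphs"]:
        for line in paragraph["lines"]:
            for word in line["words"]:
                yield line["text"], word


# Round-trip through JSON to mimic parsing text emitted by a CLI.
doc = json.loads(json.dumps(sample))
words = [(word["text"], bounding_box(word["vertices"])) for _, word in iter_words(doc)]
```

Because each rotated rect is a plain list of `[x, y]` points, a consumer that still needs axis-aligned boxes can derive them with a min/max over the vertices, whereas the reverse conversion would lose the rotation.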

@robertknight robertknight merged commit 845a13b into main Jan 2, 2024
1 check passed
@robertknight robertknight deleted the ocrs-json-output branch January 2, 2024 09:33