feat: Code and equation model for PDF and code blocks in markdown #752
Merge Protections
Your pull request matches the following merge protections and will not be merged until they are valid.
🟢 Enforce conventional commit
Wonderful, this rule succeeded.
Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/
🟢 Require two reviewer for test updates
Wonderful, this rule succeeded.
When test data is updated, we require two reviewers
Great!
LGTM!
…4SD#752)

* propagated changes for new CodeItem class
  Signed-off-by: Matteo Omenetti <[email protected]>
* Rebased branch on latest main. changes for CodeItem
  Signed-off-by: Matteo Omenetti <[email protected]>
* removed unused files
  Signed-off-by: Matteo Omenetti <[email protected]>
* chore: update lockfile
  Signed-off-by: Christoph Auer <[email protected]>
* pin latest docling-core
  Signed-off-by: Michele Dolfi <[email protected]>
* update docling-core pinning
  Signed-off-by: Michele Dolfi <[email protected]>
* pin docling-core
  Signed-off-by: Michele Dolfi <[email protected]>
* use new add_code in backends and update typing in MD backend
  Signed-off-by: Michele Dolfi <[email protected]>
* added if statement for backend
  Signed-off-by: Matteo Omenetti <[email protected]>
* removed unused import
  Signed-off-by: Matteo Omenetti <[email protected]>
* removed print statements
  Signed-off-by: Matteo Omenetti <[email protected]>
* gt for new pdf
  Signed-off-by: Matteo Omenetti <[email protected]>
* Update docling/pipeline/standard_pdf_pipeline.py
  Co-authored-by: Michele Dolfi <[email protected]>
  Signed-off-by: Matteo <[email protected]>
* fixed doc comment of __call__ function of code_formula_model
  Signed-off-by: Matteo Omenetti <[email protected]>
* fix artifacts_path type
  Signed-off-by: Michele Dolfi <[email protected]>
* move imports
  Signed-off-by: Michele Dolfi <[email protected]>
* move expansion_factor to base class
  Signed-off-by: Michele Dolfi <[email protected]>

---------

Signed-off-by: Matteo Omenetti <[email protected]>
Signed-off-by: Christoph Auer <[email protected]>
Signed-off-by: Michele Dolfi <[email protected]>
Signed-off-by: Matteo <[email protected]>
Co-authored-by: Christoph Auer <[email protected]>
Co-authored-by: Michele Dolfi <[email protected]>
Co-authored-by: Michele Dolfi <[email protected]>
Signed-off-by: Václav Vančura <[email protected]>
Thanks for your awesome work! Could you please provide an example/detailed documentation of how to use this feature?
Hello @ShayanTalaei,

from docling.datamodel.pipeline_options import PdfPipelineOptions

pipeline_options = PdfPipelineOptions()
pipeline_options.generate_page_images = True
pipeline_options.do_code_enrichment = True
pipeline_options.do_formula_enrichment = True
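
For anyone wiring these options into a full conversion, here is a minimal sketch rather than an official recipe. It assumes the usual `DocumentConverter` / `PdfFormatOption` entry points and that enriched items carry the `DocItemLabel.CODE` and `DocItemLabel.FORMULA` labels. The helper names `build_converter` and `list_enriched_items` are hypothetical and only exist for this illustration.

```python
def build_converter():
    """Return a DocumentConverter with code and formula enrichment enabled."""
    # Imports are kept local so the module loads even where docling is absent.
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import PdfPipelineOptions
    from docling.document_converter import DocumentConverter, PdfFormatOption

    # Same three flags as in the comment above.
    pipeline_options = PdfPipelineOptions()
    pipeline_options.generate_page_images = True
    pipeline_options.do_code_enrichment = True
    pipeline_options.do_formula_enrichment = True

    return DocumentConverter(
        format_options={
            InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)
        }
    )


def list_enriched_items(source):
    """Convert `source` and return (label, text) pairs for code and formula items."""
    from docling_core.types.doc import DocItemLabel

    result = build_converter().convert(source)
    wanted = {DocItemLabel.CODE, DocItemLabel.FORMULA}
    return [
        (item.label, getattr(item, "text", ""))
        for item, _level in result.document.iterate_items()
        if item.label in wanted
    ]
```

Calling `list_enriched_items("paper.pdf")` (the path is a placeholder) should then give you the code blocks and formulas found in the PDF. The enrichment model is run on the first conversion, so expect that call to take noticeably longer than a plain conversion.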
add_code() method in the markdown backend (with typing fixes)