Option to Link Images Rather Than Embed Them For ODT #9815
The code that generates the https://github.com/jgm/pandoc/blob/main/src/Text/Pandoc/Writers/OpenDocument.hs#L647 BUT I can't work out how the ODT writer changes this? |
There is already a
Which would trigger the images-as-links. This issue is for ODT as I think it should be a simple change, but DOCX also supports images-as-links (with more complex OpenXML changes needed)... |
I think it might be confusing if |
Do I understand correctly, from the message from @iandol above, that OpenDocument is actually written with embedding set to false, and ODT and DOCX with it set to true? Paolo |
Right, if used this would need to be clearly documented. The alternative is a new command-line option which will probably sound similar (
I don't think this is explicitly controlled. The opendocument writer uses links (technically embedding is false, but not through some sort of global switch); this is somehow changed for ODT, and DOCX always embeds. Having looked at the writers, I couldn't see an easy implementation, at least not as someone who knows no Haskell. |
I will add a reason for implementing this feature (linked/embedded images): while embedding may produce easier-to-handle ODT or DOCX files, it also prevents them from being used as an intermediate file format for going from Markdown to a page layout program. Programs like InDesign, Affinity Publisher, QuarkXPress, and Scribus can all import the RTF or DOCX file formats. They are unfortunately unable to import Markdown. Apparently, there is no way to make Markdown compatibility a priority. Hence the importance of Pandoc in the process. If image links and names were preserved in the translation, the original aim of the Markdown project would also be preserved. A page layout program as a last step before generating a PDF or ePub file is very useful, since many details can be finely adjusted in a way that isn't possible when producing the output through LaTeX or Typst. Pandoc would allow a smooth integration between the ease of authoring offered by Markdown and the fine control over details and high-quality typographic output offered by page layout programs. |
Incidentally: I'm completing a PDF project that was started in Markdown and finished in Word, because there is no reasonable way to go from Markdown to a page layout program. I hate Word, I hate the world! Nobody dare talk to me today! |
It would be easy enough to modify the ODT writer to optionally skip the step that embeds the images. ( The difficulty is figuring out what should trigger this. As I mentioned, it would be weird to make One could add another option I suppose. |
How about |
I think |
Right, ODT appears straightforward as usual, and LibreOffice can convert to DOCX for anyone who needs DOCX. While I think DOCX is low priority, out of curiosity I generated a minimal DOCX with a linked image to demonstrate the desired output. In <w:r w:rsidR="00E10F1E">
<w:rPr>
<w:noProof/>
</w:rPr>
<w:drawing>
<wp:inline distT="0" distB="0" distL="0" distR="0">
<wp:extent cx="1270000" cy="419100"/>
<wp:effectExtent l="0" t="0" r="0" b="0"/>
<wp:docPr id="744031760" name="placeholder.png"/>
<wp:cNvGraphicFramePr>
<a:graphicFrameLocks xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main" noChangeAspect="1"/>
</wp:cNvGraphicFramePr>
<a:graphic xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">
<a:graphicData uri="http://schemas.openxmlformats.org/drawingml/2006/picture">
<pic:pic xmlns:pic="http://schemas.openxmlformats.org/drawingml/2006/picture">
<pic:nvPicPr>
<pic:cNvPr id="744031760" name="placeholder.png"/>
<pic:cNvPicPr/>
</pic:nvPicPr>
<pic:blipFill>
<a:blip r:link="rId4"/>
<a:stretch>
<a:fillRect/>
</a:stretch>
</pic:blipFill>
<pic:spPr>
<a:xfrm>
<a:off x="0" y="0"/>
<a:ext cx="1270000" cy="419100"/>
</a:xfrm>
<a:prstGeom prst="rect">
<a:avLst/>
</a:prstGeom>
</pic:spPr>
</pic:pic>
</a:graphicData>
</a:graphic>
</wp:inline>
</w:drawing>
</w:r> The link to disk is stored in <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
<Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/webSettings" Target="webSettings.xml"/>
<Relationship Id="rId2" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/settings" Target="settings.xml"/>
<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles" Target="styles.xml"/>
<Relationship Id="rId6" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/theme" Target="theme/theme1.xml"/>
<Relationship Id="rId5" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/fontTable" Target="fontTable.xml"/>
<Relationship Id="rId4" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="file:////Users/ian/placeholder.png" TargetMode="External"/>
</Relationships> |
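To check whether a DOCX actually links its images rather than embedding them, one can scan the relationships part for image entries marked `TargetMode="External"`, as in the sample above. The following is only a minimal sketch: the function name `external_images` is made up for illustration, and the sample string is a shortened copy of the relationships XML shown in this comment.

```python
import xml.etree.ElementTree as ET

# Namespace and relationship type as they appear in the sample above.
RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
IMAGE_TYPE = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/image"
)


def external_images(rels_xml: str) -> dict[str, str]:
    """Map relationship Id -> Target for image relationships marked External.

    Hypothetical helper for illustration; not part of pandoc or Word.
    """
    root = ET.fromstring(rels_xml)
    found = {}
    for rel in root.findall(f"{{{RELS_NS}}}Relationship"):
        is_image = rel.get("Type") == IMAGE_TYPE
        is_external = rel.get("TargetMode") == "External"
        if is_image and is_external:
            found[rel.get("Id")] = rel.get("Target")
    return found


# Shortened copy of the relationships part quoted above.
sample = """<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
  <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles" Target="styles.xml"/>
  <Relationship Id="rId4" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="file:////Users/ian/placeholder.png" TargetMode="External"/>
</Relationships>"""

print(external_images(sample))
```

An embedded image would instead have a relationship without `TargetMode="External"`, so it would not show up in the result.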
Sorry forgot to add the docx: Note Word uses an absolute path whereas LibreOffice uses a relative path, I will test if a relative path will work if I manually edit the XML... EDIT: using |
Here's a patch that would add
However, it doesn't work (at least, LibreOffice raises an error and does not display the image). Had you actually tested ODTs with the linked images? |
Yes, ODT definitely supports links. Here is a linked doc (same Saved as an ODT and flat FODT. Flanked by "Pre." and "Post." paragraphs: The GUI shows an absolute path but it is saved slightly differently between the ODT ( ODT: <text:p text:style-name="Standard">
<draw:frame draw:style-name="fr1" draw:name="Image1" text:anchor-type="as-char" svg:width="4.759cm" style:rel-width="28%" svg:height="1.549cm" style:rel-height="scale" draw:z-index="0">
<draw:image xlink:href="../placeholder.png" xlink:type="simple" xlink:show="embed" xlink:actuate="onLoad" draw:filter-name="<All images>" draw:mime-type="image/png"/>
</draw:frame>
</text:p> FODT: <text:p text:style-name="Standard">
<draw:frame draw:style-name="fr1" draw:name="Image1" text:anchor-type="as-char" svg:width="4.759cm" style:rel-width="28%" svg:height="1.549cm" style:rel-height="scale" draw:z-index="0">
<draw:image xlink:href="placeholder.png" xlink:type="simple" xlink:show="embed" xlink:actuate="onLoad" draw:filter-name="<All images>" draw:mime-type="image/png"/>
</draw:frame>
</text:p> I wonder if there is something else in the document that is required. OK, here I take a Pandoc generated ODT: <office:body>
<office:text>
<text:p text:style-name="Text_20_body">Pre.</text:p>
<text:p text:style-name="Text_20_body">Post.</text:p>
</office:text>
</office:body> And open it, and add a linked image pandoc+link.odt: <office:body>
<office:text>
<text:sequence-decls>
<text:sequence-decl text:display-outline-level="1" text:separation-character="." text:name="Illustration"/>
<text:sequence-decl text:display-outline-level="0" text:name="Table"/>
<text:sequence-decl text:display-outline-level="0" text:name="Text"/>
<text:sequence-decl text:display-outline-level="0" text:name="Drawing"/>
<text:sequence-decl text:display-outline-level="0" text:name="Figure"/>
</text:sequence-decls>
<text:p text:style-name="P2">Pre.</text:p>
<text:p text:style-name="P2">
<draw:frame draw:style-name="fr1" draw:name="Image1" text:anchor-type="as-char" svg:width="16.51cm" style:rel-width="100%" svg:height="5.447cm" style:rel-height="scale" draw:z-index="0">
<draw:image xlink:href="../placeholder.png" xlink:type="simple" xlink:show="embed" xlink:actuate="onLoad" draw:filter-name="<All images>" draw:mime-type="image/png"/>
</draw:frame>
</text:p>
<text:p text:style-name="Text_20_body">Post.</text:p>
</office:text>
</office:body>
note: importing a linked image in LO wraps it into a caption box and floats it; I am manually removing the caption box and unfloating the image (making it inline) to try to simplify the testcase. I will need to test a captioned image later on... |
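The samples above suggest a simple rule for the `xlink:href` of a linked image: LibreOffice wrote `../placeholder.png` in the zipped ODT but plain `placeholder.png` in the flat FODT. A rough sketch of that rule follows. The function name `linked_href` and the decision to leave absolute paths and URLs untouched are my assumptions, not something taken from pandoc or from the ODF specification.

```python
from urllib.parse import urlparse


def linked_href(image_path: str, packaged: bool) -> str:
    """Return an xlink:href for a linked (non-embedded) image.

    packaged=True  -> zipped .odt (href observed above: "../placeholder.png")
    packaged=False -> flat .fodt  (href observed above: "placeholder.png")

    Hypothetical helper mirroring what LibreOffice saved in the samples.
    """
    # Assumption: URLs and absolute paths are passed through unchanged.
    if urlparse(image_path).scheme or image_path.startswith("/"):
        return image_path
    # Relative paths inside a package get a "../" prefix, as in the ODT sample.
    return "../" + image_path if packaged else image_path


print(linked_href("placeholder.png", packaged=True))
print(linked_href("placeholder.png", packaged=False))
```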
Here's another comparison. I generated an ODT with image with Pandoc: <office:body>
<office:text>
<text:p text:style-name="Text_20_body">Pre.</text:p>
<text:p text:style-name="Text_20_body">
<draw:frame draw:name="img1" svg:width="200.0pt" svg:height="66.0pt">
<draw:image xlink:href="Pictures/0.png" xlink:type="simple" xlink:show="embed" xlink:actuate="onLoad" />
</draw:frame>
</text:p>
<text:p text:style-name="Text_20_body">Post.</text:p>
</office:text>
</office:body> Duplicated it and then converted the image to a link (see the screenshot above: you can add a link filename, which turns an embedded image into a linked one): <office:body>
<office:text>
<text:sequence-decls><text:sequence-decl text:display-outline-level="0" text:name="Illustration"/><text:sequence-decl text:display-outline-level="0" text:name="Table"/><text:sequence-decl text:display-outline-level="0" text:name="Text"/><text:sequence-decl text:display-outline-level="0" text:name="Drawing"/><text:sequence-decl text:display-outline-level="0" text:name="Figure"/></text:sequence-decls>
<text:p text:style-name="Text_20_body">Pre.</text:p>
<text:p text:style-name="Text_20_body">
<draw:frame draw:style-name="fr1" draw:name="img1" text:anchor-type="as-char" svg:width="7.056cm" svg:height="2.328cm" draw:z-index="0">
<draw:image xlink:href="../placeholder.png" xlink:type="simple" xlink:show="embed" xlink:actuate="onLoad" draw:filter-name="<All images>" draw:mime-type="image/png"/></draw:frame>
</text:p>
<text:p text:style-name="Text_20_body">Post.</text:p>
</office:text>
</office:body> In the untouched Pandoc ODT there is a META-INF/manifest.xml that does point to the Pictures/0.png image inside the ODT: <?xml version="1.0" encoding="utf-8"?>
<manifest:manifest xmlns:manifest="urn:oasis:names:tc:opendocument:xmlns:manifest:1.0" manifest:version="1.3">
<manifest:file-entry manifest:media-type="application/vnd.oasis.opendocument.text" manifest:full-path="/" manifest:version="1.3" />
<manifest:file-entry manifest:media-type="application/xml" manifest:full-path="content.xml" />
<manifest:file-entry manifest:media-type="image/png" manifest:full-path="Pictures/0.png" />
<manifest:file-entry manifest:media-type="application/rdf+xml" manifest:full-path="manifest.rdf" />
<manifest:file-entry manifest:media-type="application/xml" manifest:full-path="styles.xml" />
<manifest:file-entry manifest:media-type="application/xml" manifest:full-path="meta.xml" />
</manifest:manifest> If you can upload a non-working ODT I can have a better look. |
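The comparison above can be automated: an image whose `xlink:href` is listed in the manifest lives inside the package (embedded), while any other href points outside it (linked). The sketch below assumes that reading of the samples; `classify_images` and the tiny test documents are invented for illustration, though the namespace URIs are the standard ODF and XLink ones.

```python
import xml.etree.ElementTree as ET

DRAW = "urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
XLINK = "http://www.w3.org/1999/xlink"
MANIFEST = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"


def classify_images(content_xml: str, manifest_xml: str) -> dict[str, str]:
    """Label each draw:image href as 'embedded' or 'linked'.

    Hypothetical helper: an href counts as embedded only if the manifest
    lists a file entry with exactly that full-path.
    """
    manifest = ET.fromstring(manifest_xml)
    packaged = {
        entry.get(f"{{{MANIFEST}}}full-path")
        for entry in manifest.iter(f"{{{MANIFEST}}}file-entry")
    }
    content = ET.fromstring(content_xml)
    result = {}
    for image in content.iter(f"{{{DRAW}}}image"):
        href = image.get(f"{{{XLINK}}}href")
        result[href] = "embedded" if href in packaged else "linked"
    return result


# Minimal stand-ins for content.xml and META-INF/manifest.xml.
content = f"""<body xmlns:draw="{DRAW}" xmlns:xlink="{XLINK}">
  <draw:frame><draw:image xlink:href="Pictures/0.png"/></draw:frame>
  <draw:frame><draw:image xlink:href="../placeholder.png"/></draw:frame>
</body>"""

manifest = f"""<manifest:manifest xmlns:manifest="{MANIFEST}">
  <manifest:file-entry manifest:media-type="application/xml" manifest:full-path="content.xml"/>
  <manifest:file-entry manifest:media-type="image/png" manifest:full-path="Pictures/0.png"/>
</manifest:manifest>"""

print(classify_images(content, manifest))
```

For a real file, both parts could be read with the standard library, e.g. `zipfile.ZipFile("doc.odt").read("content.xml")`, where `doc.odt` is a placeholder name.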
OK, that was what I was missing: we have to put |
New cli option: `--link-images`. New field in WriterOptions: `writerLinkImages` [API change]. New field in Opts: `optLinkImages` [API change]. See #9815.
New cli option: `--link-images`. This causes images to be linked rather than embedded in ODT. New field in WriterOptions: `writerLinkImages` [API change]. New field in Opt: `optLinkImages` [API change]. Closes #9815.
I tried Pandoc 3.2.1 on my Intel Mac, and indeed the image path and name are included in the DOCX file. I did my conversion through Quarto 1.6.1. However, Word for Mac continues to embed an image with its own RTF name. When imported into InDesign or Affinity Publisher, only the embedded image is considered. I don't know if this is still something that can be solved on the Pandoc side, or whether it is something inside Word or the page layout programs that are importing it. |
@ptram this patch only affects ODT, not DOCX. |
Also this is only testable in a nightly build: https://github.com/jgm/pandoc/actions/workflows/nightly.yml for example: https://github.com/jgm/pandoc/actions/runs/9790558535/artifacts/1667189311 -- it hasn't made it to a release yet. This is a simple test with an image called Then saved with LO as DOCX: At least saved from LO the DOCX ZIP does not embed the image and shows the image is external. <Relationship Id="rId2" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="file:///Users/ian/placeholder.png" TargetMode="External"/> Word treats it as an absolute path, but I assume it will adjust based on the loading location? But how that imports I can't test. LibreOffice is better at a bunch of stuff and often makes a better intermediate than Word itself. |
@ptram So, the issue is that you are using |
Describe your proposed improvement and the problem it solves.
For many formatting workflows, editors or publishers prefer not to embed figures. ODT allows you to easily embed or link images, and in fact the `opendocument` writer already supports linking:
BUT `odt` forces the image to be embedded, so the same markdown becomes something like:
It would be great if there were a command-line option to allow linking to images (i.e. preserve the opendocument way for odt). This way we could generate ODT files with linked figures. The same technically applies to DOCX (Word does allow linking, but of course the syntax is much more complex).
Describe alternatives you've considered.
I imagine a Lua filter could do this, and I suspect it is a viable workaround?