Releases: jgm/pandoc
pandoc 3.0
-
Split pandoc-server, pandoc-cli, and pandoc-lua-engine into separate packages (#8309). Note that installing the `pandoc` package from Hackage will no longer give you the `pandoc` executable; for that you need to install `pandoc-cli`.
-
Pandoc now behaves like a Lua interpreter when called as `pandoc-lua` or when `pandoc lua` is used (#8311, Albert Krewinkel). The Lua API that is available in filters is automatically available to the interpreter. (See the `pandoc-lua` man page.)
-
Pandoc behaves like a server when called as `pandoc-server` or when `pandoc server` is used. (See the `pandoc-server` man page.)
-
A new command-line option, `--list-tables`, causes tables to be formatted as list tables in RST (#4564, with Francesco Occhipinti).
-
New command-line option: `--epub-title-page=true|false` allows the EPUB title page to be omitted (#6097).
-
`--reference-doc` can now accept a URL argument (#8535) and load a remote reference doc.
-
`--version` output no longer contains version info for dependent packages. Instead, it contains a “Features” line that indicates whether the binary was compiled with support for acting as a server, and for using Lua filters and custom writers.
-
A new option `--split-level` replaces `--epub-chapter-level` and affects both EPUB and chunked HTML output. `--epub-chapter-level` will still work but is deprecated.
-
Multiple input files with `--file-scope`: fix case where the links are URL-encoded, e.g. with `%20` (#8467).
-
Produce error if `--csl` is used more than once (#8195, Prat).
-
Remove deprecated `--atx-headers` option.
-
Remove deprecated option `--strip-empty-paragraphs`.
-
In `--verbose` mode add message when running citeproc (as with other filters).
-
Add new `mark` extension for highlighted text in Markdown, using `==` delimiters (#7743).
-
Add new extensions `wikilinks_title_after_pipe` and `wikilinks_title_before_pipe` for `commonmark` and `markdown` (#2923, Albert Krewinkel). The former enables links of style `[[Name of page|Title]]` and the latter `[[Title|Name of page]]`. Titles are optional in both variants, so this works for both: `[[https://example.org]]`, `[[Name of page]]`. The writer is modified to render links with title `wikilink` as a wikilink if a respective extension is enabled. Pandoc will use `wikilinks_title_after_pipe` if both extensions are enabled.
-
Add prefixes to identifiers with `--file-scope` (#6384). This change only affects the case where `--file-scope` is used and more than one file is specified on the command line. In this case, identifiers will be prefixed with a string derived from the file path, to disambiguate them. For example, an identifier `foo` in `contents/file1.txt` will become `contents__file1.txt__foo`. Links will be adjusted accordingly: if `file2.txt` links to `file1.txt#foo`, then the link will be changed to point to `#file1.txt__foo`. Similarly, a link to `file1.txt` will point to `#file1.txt`. A Div with an identifier derived from the file path will be added around each file’s content, so that links to files will still work.
-
New output format: `chunkedhtml`. This creates a zip file containing multiple HTML files, one for each section, linked with “next,” “previous,” “up,” and “top” links. (If `-o` is used with an argument without an extension, it is treated as a directory and the zip file is automatically extracted there, unless it already exists.) The top page will contain a table of contents if `--toc` is used. A `sitemap.json` file is also included. The option `--split-level` determines the level at which sections are to be split.
-
Support complex figures (Albert Krewinkel, Aner Lucero). There is now a dedicated Figure block constructor for figures. The old hack of representing a figure as `Para [Image attr [..alt..] (source, "fig:title")]` has been dropped. Here is a summary of figure support in different formats:
- Markdown reader: paragraphs containing just an image are treated as figures if the `implicit_figures` extension is enabled. The identifier is used as the figure’s identifier and the image description is also used as figure caption; all other attributes are treated as belonging to the image.
- Markdown writer: figures are output as implicit figures if possible, via HTML if the `raw_html` extension is enabled, and as Div elements otherwise.
- HTML reader: `<figure>` elements are parsed as figures, with the caption taken from the respective `<figcaption>` elements.
- HTML writer: the alt text is no longer constructed from the caption, as was the case with implicit figures. This reduces duplication, but comes at the risk of images that are missing alt texts. Authors should take care to provide alt texts for all images. Some readers, most notably the Markdown reader with the `implicit_figures` extension, add a caption that’s identical to the image description. The writer checks for this and adds an `aria-hidden` attribute to the `<figcaption>` element in that case.
- JATS reader: The `<fig>` and `<caption>` elements are parsed into figure elements, even if the content is more complex.
- JATS writer: The `<fig>` and `<caption>` elements are used to write figures.
- LaTeX reader: support for figures with non-image contents and for subfigures.
- LaTeX writer: complex figures, e.g. with non-image contents and subfigures, are supported. The `subfigure` template variable is set if the document contains subfigures, triggering the conditional loading of the subcaption package. Contents of figures that contain tables become unwrapped, as longtable environments are not allowed within figures.
- DokuWiki, Haddock, Jira, Man, MediaWiki, Ms, Muse, PPTX, RTF, TEI, ZimWiki writers: Figures are rendered like Div elements.
- Asciidoc writer: The figure content is unwrapped; each image in the figure becomes a separate figure.
- Classic custom writers: Figures are passed to the global function `Figure(caption, contents, attr)`, where `caption` and `contents` are strings and `attr` is a table of key-value pairs.
- ConTeXt writer: Figures are wrapped in a “placefigure” environment with `\startplacefigure`/`\stopplacefigure`, adding the figure’s caption and listing title as properties. Subfigures are placed in a single row with the `\startfloatcombination` environment.
- DocBook writer: Uses `mediaobject` elements, unless the figure contains subfigures or tables, in which case the figure content is unwrapped.
- Docx writer: figures with multiple content blocks are rendered as tables with style `FigureTable`; like before, single-image figures are still output as paragraphs with style `Figure` or `Captioned Figure`, depending on whether a caption is attached.
- DokuWiki writer: Caption and “alt-text” are no longer combined. The alt text of a figure will now be lost in the conversion.
- FB2 writer: The figure caption is added as alt text to the images in the figure; pre-existing alt texts are kept.
- ICML writer: Only single-image figures are supported. The contents of figures with additional elements get unwrapped.
- OpenDocument writer: A separate paragraph is generated for each block element in a figure, each with style `FigureWithCaption`. Behavior for single-image figures therefore remains unchanged.
- Org writer: Only the first element in a figure is given a caption; additional block elements in the figure are appended without any caption being added.
- RST writer: Single-image figures are supported as before; the contents of more complex figures become nested in a container of type `float`.
- Texinfo writer: Figures are rendered as float with type `figure`.
- Textile writer: Figures are rendered with the help of HTML elements.
- XWiki: Figures are placed in a group.
-
Changes in custom readers/writers:
- It is now possible to have a custom reader and a custom writer for a format together in the same file. The file may also define a custom template for the writer.
- Pandoc now checks the folder `custom` in the user’s data directory for a matching script if it can’t find one in the local directory. Previously, the `readers` and `writers` data directories were searched for custom readers and writers, respectively. Scripts in those directories must be moved to the `custom` folder.
- Custom readers used to implement a fallback behavior that allowed consuming just a string value as input to the `Reader` function. This has been removed; the first argument is now always a list of sources. Use `tostring` on that argument to get a string.
-
New module Text.Pandoc.Writers.ChunkedHTML, exporting `writeChunkedHtml` [API change].
-
We now set the `pandoc-version` variable centrally rather than in the writers. One effect is that the man writer now emits a comment with the pandoc version.
-
pandoc-server:
- Add simple CORS support to pandoc-server (#8427).
- Print message to stderr when starting the server.
-
Docx reader:
-
ODT reader:
-
DocBook reader:
-
JATS reader:
- Handle uri element in references (#8270).
-
Ipynb reader:
...
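The identifier prefixing described in the `--file-scope` entry above can be pictured with a short Python sketch. It is illustrative only and not pandoc's implementation: the function names are hypothetical, and the rule that path separators turn into `__` is an assumption inferred from the single `contents/file1.txt` example in that entry.

```python
def path_prefix(file_path: str) -> str:
    """Derive an identifier prefix from a file path (assumed rule: '/' becomes '__')."""
    return file_path.replace("/", "__")


def prefix_identifier(file_path: str, identifier: str) -> str:
    """Prefix an identifier with a string derived from its file's path."""
    return f"{path_prefix(file_path)}__{identifier}"


def rewrite_link(target: str) -> str:
    """Point a cross-file link at the prefixed identifier in the combined document.

    Assumes the target names one of the input files; same-document anchors
    and absolute URLs are returned unchanged.
    """
    if target.startswith("#") or "://" in target:
        return target
    path, has_fragment, fragment = target.partition("#")
    if has_fragment:
        return "#" + prefix_identifier(path, fragment)
    # A link to the file itself targets the Div wrapped around that file's content.
    return "#" + path_prefix(path)


print(prefix_identifier("contents/file1.txt", "foo"))
print(rewrite_link("file1.txt#foo"))
print(rewrite_link("file1.txt"))
```

The three calls reproduce the examples given in the entry. Resolving a link relative to the directory of the linking file is left out of the sketch.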
pandoc 2.19.2
-
Fix regression with data uris in 2.19.1 (#8239). In 2.19.1 we used the base64URL encoding rather than base64.
-
pandoc-server: handle `citeproc` parameter as documented (#8235).
-
Org reader: treat emacs-jupyter src blocks as code cells (#8236, Albert Krewinkel). This improves support for notebook-like org files that are intended to be used with the emacs-jupyter package.
-
HTML writer and templates: revert to using `width` property for column widths (Albert Krewinkel). The default `flex` and `overflow-x` properties of a column are set to `auto`. In combination, these changes give good results when using columns with or without explicit widths.
-
Org writer (Albert Krewinkel):
- Add support for Jupyter notebook cells (#6367).
- Prefix code language of ipynb code blocks with `jupyter-`. This is the convention used by the emacs-jupyter package.
- Keep code block attributes as header args. This keeps more information in the resulting `src` blocks, making it easier to roundtrip from or through Org. Org babel ignores unknown header arguments.
- Add code block identifier as `#+name` to src blocks.
-
Fix some typos in the codebase (luz paz).
-
Require hslua-module-path 1.0.3 (#8228, Albert Krewinkel).
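The data URI regression fixed in this release comes down to two encoding alphabets. As a rough illustration using Python's standard `base64` module (not pandoc's Haskell code), standard base64 uses `+` and `/` where the URL-safe variant uses `-` and `_`, and data URIs are expected to carry the standard alphabet. The helper name `data_uri` and the sample bytes are made up for this example.

```python
import base64


def data_uri(mime_type: str, payload: bytes) -> str:
    """Build a data URI using the standard base64 alphabet."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


sample = b"\xfb\xff"  # bytes chosen so that the two alphabets differ
standard = base64.b64encode(sample).decode("ascii")
url_safe = base64.urlsafe_b64encode(sample).decode("ascii")

print(standard)  # +/8=
print(url_safe)  # -_8=
print(data_uri("application/octet-stream", sample))
```

For payloads whose encoding happens to contain neither of the two differing characters, both variants produce the same string, which is why such a regression can go unnoticed on small inputs.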
pandoc 2.19.1
-
Add server capabilities.
- New exported module Text.Pandoc.Server [API change].
- The pandoc executable now starts up a web server when renamed or symlinked as `pandoc-server`, and functions as a CGI program when renamed or symlinked as `pandoc-server.cgi`. See the man page for `pandoc-server` for full documentation.
-
Text.Pandoc.App.Opts: Redo `FromJSON` for `Opt` so that optional values can be omitted (in which case the values from `defaultOptions` are used).
-
Org reader: treat “abstract” block as metadata (Albert Krewinkel, #8204). A block of type “abstract” is assumed to define the document’s abstract. It is transferred from the main text to the metadata.
-
Org template: add abstract from metadata as block of type “abstract” (#8204).
-
HTML writer: use `flex` property for column widths (Albert Krewinkel, #8232).
-
LaTeX writer:
-
LaTeX template: fix behavior of `colorlinks` variable (Albert Krewinkel, #8226). Fixes a regression in 2.19 that required the `boxlinks` variable to be set in addition to the usual link coloring variables. Otherwise links were never colored in LaTeX PDF output.
-
Text.Pandoc.Highlighting: Export `lookupHighlightingStyle` [API change]. Previously this lived in an unexported module Text.Pandoc.App.CommandLineOptions, under the name `lookupHighlightStyle`.
-
Text.Pandoc.App:
- Remove unneeded MonadIO constraints in readSources.
- Factor out `convertWithOpts'` from `convertWithOpts`. This runs in any PandocMonad, MonadIO, MonadMask instance. So far it is not exported, but it might find a use later.
-
Support `--strip-comments` in commonmark/gfm (#8222). This change makes the commonmark reader sensitive to `readerStripComments`.
-
Lua: add function `pandoc.utils.citeproc` (Albert Krewinkel). The function runs the citeproc processor on a Pandoc document. Exposing this functionality to Lua makes it possible to run citation processing as part of a filter or writer, simplifies the creation of multiple bibliographies, and enables the use of varying citation styles in different parts of a document.
-
Refactor `linux/make_artifacts.sh`.
-
Update INSTALL.md installation from source instructions.
-
Use base64 package instead of base64-bytestring. It is supposed to be faster and more standards-compliant.
-
trypandoc improvements:
- Add dropdown with canned examples.
- Add citeproc support.
- Support csv, bibliographic and binary formats.
- Add load from file.
- Add permalink. Don’t always reload page.
- Use vanilla JS and CSS + the new `pandoc-server.cgi`.
-
Allow haddock-library-1.11.0.
-
Convert `tool/extract-changes.hs` to a Lua filter.
pandoc 2.19
-
Add `--embed-resources` flag (Elliot Bobrow, #7331). This can be used to embed resources without implying `--standalone`. Deprecate `--self-contained` in favor of `--embed-resources --standalone`.
-
Allow environment variable interpolation in `highlight-style` and `pdf-engine` fields in defaults files (#8061; Jaehwang Jung, #8073).
-
Allow placing custom readers and writers in user data directory (Albert Krewinkel, #8112) (`readers` and `writers` subdirectories).
-
Add `tsv` (tab separated values) as an input format (#7974). [API change]: Text.Pandoc.Readers.CSV now exports `readTSV`. Internal change: In Text.Pandoc.CSV, `CSVOptions` has changed so that `csvQuote` takes a Maybe value.
-
Add `tex_math_dollars` to `gfm` default extensions (reflecting gfm’s new support for math).
-
RST, Org, Markdown readers: support rowspans and colspans in grid tables (#8202, Albert Krewinkel). Note: the writers do not yet support these more complex grid table features, so these complex grid tables will not round-trip.
-
HTML, LaTeX, and MediaWiki readers: use `formatCode` (#8162, #8129, Elliot Bobrow). This moves formatting from inside inline code elements to the outside, since pandoc’s Code element only takes string content.
-
Markdown reader:
-
HTML reader:
- Allow sublists that are not marked as items (Albert Krewinkel, #8150). This is technically invalid HTML, but it can be found in the wild and browsers handle it.
-
Org reader (Albert Krewinkel):
- Recognize absolute paths on Windows (Albert Krewinkel, #8201).
- Recognize {webp,jxl} files as images (YI).
- Allow attrs for Org tables (Albert Krewinkel, #8049). Tables with attributes are no longer wrapped in Div elements; attributes are added directly to the table element.
- Support line selection in INCLUDE directives (Brian Leung, #8060).
- Fix Post / Pre mixup when setting emphasis chars (Amir Dekel, #8134).
-
LaTeX reader:
- Support `\includesvg` (#8027).
- Unescape characters in `\lstinline` inside `\passthrough` (#8179).
- Improve `mathEnvWith` (#8122). When converting e.g. an align environment to an aligned environment inside a Math element, we need to include a newline before the `\end{aligned}`, since the previous line might end in a comment.
- Fix treatment of extensions for `\input` in LaTeX reader (#8092). Previously we required a `.tex` extension, but TeX allows any extension for `\input` (as opposed to `\include`).
-
RTF reader:
- Support `\nosupersub` (#8170).
-
TikiWiki reader:
- Support underlined text
-
DocBook reader:
- Improved reading of `<xref>` elements (Frerich Raabe, #8065).
-
JATS reader:
-
RIS reader:
-
MediaWiki reader:
- Allow HTML comment after row start (#8110).
-
DokuWiki reader:
- The `tex_math_dollars` extension is now supported for `dokuwiki` (but off by default) (#8178).
- Content inside `<latex>...</latex>` is parsed as raw LaTeX inline, and inside `<LATEX>..</LATEX>` as raw LaTeX block (#8178).
- The behavior of `<php>...</php>` is changed, so that instead of producing a code block, it produces raw HTML with `<?php ... ?>`.
-
LaTeX writer:
- Improve grouping with autocites (#8088).
- Extend list of book documentclasses (Wentau Han, #8053).
- Fix width of multicolumn cells (Albert Krewinkel, #8090). Cells spanning multiple columns must be given an explicit width, calculated from the table properties.
- Beamer: allow containsverbatim as alternative to fragile (#8080).
-
HTML writer:
- Add ‘footnotes’ identifier to footnotes section (#8043).
- Fix bug with `--number-offset`. This formerly caused section divs to be produced, even when `--section-divs` was not specified (#8097).
- Use CSS flexboxes for columns (Albert Krewinkel). This allows an arbitrary number of columns, while the previous approach assumed exactly two columns.
- Allow “spanlike” classes to be combined (see #8194). Previously classes like “underline” and “marked” had to be the first class in a span in order for the span to be interpreted as a “ul” or “mark” element. This commit allows these special classes to be “stacked,” e.g. `[test]{.mark .underline}`; in addition, the special classes are no longer required to come first in the list of classes.
- Avoid doubled style attribute when height and width are added to style because of an image, but the image already has a style attribute (#8047).
- Do not include the deprecated doc-endnote role (#8030). doc-endnote was deprecated in DPUB-ARIA 1.1.
- Remove extra soft break for tasklist (black-desk, #8142). Browsers display the extra newline character between checkbox and text as a space, which prevents tasklist items from being aligned.
-
EPUB writer:
- Allow choice of math method for v3 (#8164). Previously we always used MathML for math in EPUB3, because the spec includes MathML. But this is not widely supported by readers, so it seems better to allow users to choose their math method as they can with EPUB2 or HTML. NOTE: Existing workflows that produce EPUBv3 documents including math will be affected by this change. You must add `--mathml` to your command line if you want to continue producing MathML.
-
RST writer:
-
Ms writer:
- Add comment in preamble stating generator.
- Fix roff ms syntax highlighting definitions (#8175, thanks to Branden Robinson).
-
ConTeXt writer:
-
Support complex table structures (Albert Krewinkel, #8116). The following table features are now supported in ConTeXt:
- colspans,
- rowspans,
- multiple bodies,
- row headers, and
- multi-row table head and foot.
The wrapping `placetable` environment is also given a `reference` option with the table identifier, enabling referencing of the table from within the document.
-
Unify link handling (Albert Krewinkel, #8096). Autolinks, i.e. links with content that’s the same as the linked URL, are now marked with the `\url` command. All other links, both internal and external, are created with the `\goto` command, leading to shorter, slightly more idiomatic code. As before, autolinks can still be styled via `\setupurl`, other links via `\setupinteraction`.
-
Use “sectionlevel” environment for headings (Albert Krewinkel, #5539). The document hierarchy is now conveyed using `\startsectionlevel`/`\stopsectionlevel` by default. This makes it easy to include pandoc-generated snippets in documents at arbitrary levels. The more semantic environments “chapter”, “section”, “subsection”, etc. are used if the `--top-level-division` command line parameter is set to a non-default value.
-
-
Docx writer:
- Add `w:lang` to `rPr` for Span and Div with lang attribute, so that Word can know that “Apfel” is not a spelling error (#8026).
- Prevent crashing when handling invalid tables (Albert Krewinkel, #8102). Tables with different numbers of cells per row would sometimes crash pandoc. This fix prevents this by cutting off overlong rows.
-
ICML writer:
- Support custom-style attribute on Table (#8079).
-
AsciiDoc writer:
- Fix commas in link text (#8070). Commas in link text trigger interpretation of attributes. To block this, we replace them with numeric entities.
- Fix underline. We were rendering it as `+++text+++`; this is now changed to `[.underline]#text#`. See the comment at #8070.
-
FB2 writer:
- Fix handling of non-section Divs (#8123).
-
Markdown writer:
-
Text.Pandoc.Class:
- Add new function `findFileWithDataFallback` [API Change] (Albert Krewinkel).
- `fillMediaBag`: Keep attributes of original image on Span (Albert Krewinkel, #8099). Images that cannot be fetched are replaced with a Span that contains the image’s description. The span now also retains all attributes of the original image. Furthermore, the classes `image` and `placeholder` are added, and path and title are stored in attributes `original-image-src` and `original-image-title`, respectively.
-
Text.Pandoc.Shared:
- `makeSections`: don’t make a section for a div with class “fragments” (#8098).
- Ensure that Nulls are ignored by `makeSection` and in segmenting slides (#8155).
- Add `formatCode` function to Text.Pandoc.Shared [API change] (Elliot Bobrow, #8129).
- `taskListItemToAscii`: handle asciidoctor’s characters (#8011). Asciidoctor uses different unicode characters for task lists; we should recognize them too and be able to convert them to ascii task lists in formats like gfm.
- Deprecate `deLink` and mark for later removal.
-
Text.Pandoc.Writers.Shared:
- `toTableOfContents`: Don’t replace links with empty spans in TOC (#8020).
-
Text.Pandoc.Readers.Metadata:
- Ensure that metadata values w/o trailing newlines are parsed as inlines, ...
pandoc 2.18
-
New input formats: `endnotexml` (EndNote XML bibliography), `ris` (RIS bibliography).
-
A RIS bibliography file may now be used with `--citeproc`.
-
Citeproc: Allow a formatted bibliography to be placed in metadata fields via a Div with class `refs` (#7969, #526). Thus, one can include a metadata field, say `refs`, whose content is an empty div with id `refs`, and the formatted bibliography will be put into this metadata field. It may then be interpolated into a template using the variable `refs`.
-
Ensure that you don’t get PDF output to terminal. `-t pdf` now behaves like `-t docx` and gives an error unless the output is redirected.
-
`--version` now prints hslua version (#7929) and Lua version (#7997, Albert Krewinkel).
-
Change `--metadata-file` parsing so that, when the input format is not markdown or a markdown variant, pandoc’s markdown is used (#6832, #7926). When the input format is a markdown variant, the same format is used. Reason for the change: it doesn’t make sense to run the markdown parser with a set of extensions designed for a non-markdown format, and this dramatically limits what people can do in metadata files.
-
Trim whitespace from math in `--webtex` (#7892). This fixes problems with `--webtex` and markdown output, when display math starts or ends with a newline.
-
New exported module Text.Pandoc.Readers.EndNote, exporting `readEndNoteXML`, `readEndNoteXMLCitation`, and `readEndNoteXMLReferences`. [API change]
-
`--self-contained`: issue warning rather than failing with an error if a resource can’t be found (#7904).
-
New exported module, Text.Pandoc.Readers.RIS, exporting `readRIS` (#7894).
-
LaTeX reader:
- Handle subequations as inline math environment (#7883).
- Rudimentary support for `vbox` (#7939).
- Support `\today` (#7905).
- Handle `\label` and `\ref` for footnotes (#7930).
- Allow inline groups starting with `\bgroup` (#7953).
- Use custom TokStream that keeps track of whether macros are expanded. This allows us to improve performance a bit by avoiding unnecessary runs of the macro expansion code (e.g. from 24 ms to 20 ms on our standard benchmark).
- Further optimizations for inline parsing.
- Better handling of `\usepackage`. If the package is local but causes parse errors, parse everything up to the error and skip the rest. Issue a `CouldNotParseIncludeFile` warning indicating that parsing failed at that point.
- Text.Pandoc.Readers.LaTeX.Parsing: Monoid and Semigroup instances for TokStream.
-
HTML reader:
-
DocBook reader:
- Handle complete set of entities as specified at https://www.w3.org/2003/entities/2007doc/byalpha.html (#7938).
- Handle abstract in info section (#7747).
- Improve info parsing.
- Simplify metadata parsing code (#7747). Handle abstract as block-level content. Report skipped info elements with `--verbose`.
- Handle address and copyright in metadata (#7747).
-
DokuWiki reader:
- Add DokuWiki table alignment (#5202, damon-sava-stanley).
-
RST reader:
-
JATS reader:
- Improve handling of fn-group elements (#6348, Albert Krewinkel). Footnotes in `<fn-group>` elements are collected and re-inserted into the document as proper footnotes in the place where they are referenced.
- Handle `pub-date` (#8000).
- Support PMID, DOI, issue in citations (#7995).
- Improve refs parsing. Handle `issn` and `isbn`; use simpler form for issued date.
- Strip ‘ref-’ from ref id in constructing CSL id. This allows better round-tripping, because the JATS writer adds the `ref-` prefix to the citation id to get the ref element’s id.
-
Org reader:
-
Allow “:” in property drawer keys (Lucas V. R). Any non-space character is allowed as property drawer key, including “:” itself (so it is not really a delimiter). The real delimiter is a space character, so in a drawer like

    :PROPERTIES:
    ::k:ey:: value
    :END:

“:k:ey:” is a key with value “value”.
-
Allow comments above property drawer.
-
More flexible LaTeX environments (Lucas V. R).
-
Handle `#+bibliography:` as metadata so that it can work with `--citeproc`.
-
Parse `#+print_bibliography:` as Div with id `refs`.
-
Allow multiple `#+bibliography:`.
-
-
Markdown reader:
-
Docx reader:
- Enable `citations` extension for docx reader (#7840). When enabled, Zotero, Mendeley, and EndNote citations embedded in a docx are parsed as native pandoc citations. (When disabled, the generated citation text and bibliography are passed through as regular text.) The bibliography generated by the plugin is suppressed. Instead, bibliographic data embedded in citation items is added to the `references` metadata field so that it can be used with `--citeproc`.
-
Docbook writer:
- Interpret links without contents as cross-references (#7360, Jan Tojnar). Links without text contents are converted to `<xref>` elements. DocBook processors will generate appropriate cross-reference text when presented with an xref element.
-
Docx writer:
- Single numbering ID for examples (#7895, mjfs). This change ensures that example list items all belong to a single number sequence, so that if items are added or deleted in a word processor, the other items will renumber automatically.
- Add bookmark with table id to table (#7989, Nikolai Korobeinikov, #7285). This allows tables with ids to be linked to.
-
Ipynb writer:
- Handle metadata better (#7928). Previously we used the markdown writer to render metadata. This had some undesirable consequences (e.g. en dash expanded to `--` when `smart` enabled), so now we use the plain writer.
-
LaTeX writer:
- Avoid extra space before `\CSLRightInline` (#7932).
- Add `scrreport` to `chaptersClasses` (#6168, ivardb).
- Support `page`, `trim`, `clip` attributes on images (#7181).
- Add `()` after booktabs rules (#8001). These commands take optional arguments with () and [], which can lead to problems if the content of the table cell begins with these characters.
-
RST writer:
- Support all standard metadata (“bibliographic”) fields.
-
HTML writer: performance improvements.
-
Org writer:
- Stop indenting property drawers, quote blocks (#3245, Albert Krewinkel). This follows the current default org-mode behavior.
-
Markdown writer:
- Move table-related code into submodule (Albert Krewinkel).
- Don’t produce redundant header identifier when the `gfm_auto_identifiers` extension is set (#7941).
- Update escaping rules for `\`. We now escape `\` only if `raw_tex` is enabled or it is followed by a non-alphanumeric.
-
JATS writer:
- Encode author “others” as `<etal/>` (Albert Krewinkel). Citeproc adopted the BibTeX convention to use the author name “others” when there are additional authors that are not named. JATS uses the `<etal>` element for this.
- Avoid doubled ref-list element (#7990). Previously when generating JATS with the `element_citations` extension enabled, the references were put in a doubly-nested ref-list element (`<ref-list><ref-list>...`).
- Keep edition info in element citations (#7993, Albert Krewinkel).
- Fix handling of CSL variable ‘page’ (not ‘pages’ as we had before). It should go to ‘lpage’ and ‘rpage’, not ‘page-range’.
-
EPUB writer: refactor for clarity (#7991, Jonathan Dönszelmann, Ola Wolska, Ivar de Bruin, Jaap de Jong).
-
Custom writer (Albert Krewinkel):
- Support new-style Writer function (Albert Krewinkel). See the documentation for custom writers for details.
- Produce stacktrace if Writer function fails
-
Text.Pandoc.Logging: add `CouldNotParseIncludeFile` constructor for `LogMessage` [API change].
-
Text.Pandoc.Shared:
- Put id attributes on TOC entries (#7907, damon-sava-stanley). Naming scheme of id is “toc-” + id of linked to header/section. Affects HTML, Markdown, Powerpoint, and RTF.
- Define `ordNub` as alias for `nubOrd` from containers package (#7963, Albert Krewinkel).
- Export `ensureValidXmlIdentifiers`. This function changes identifiers that don’t start with letters, and internal links to these identifiers, making them compatible with XML standards. The change is simple: we add `id_` to the front. There is potential for duplication if there are already `id_...` identifiers defined, but this seems rare enough not to worry too much about.
-
Ensure that valid XML identifiers are used in Docbook, EPUB, FB2, HTML4, S5, Slidy, Slideous, ICML, ODT, TEI writers. Thus, if you convert `[anchor]{#1} and [link to](#1)`, `id_1` will be used instead of `1` for the identifier.
-
Lua (Albert Krewinkel).
- Add module `pandoc.layout` to format and layout text.
- Move custom writer code into Lua hierarchy.
- Use pandoc-lua-marshal 0.1.5.
- Allow any type of callable object as argument to List functions `filter`, `map`, and `find_if`. These previously required the argument to be of...
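The rule behind `ensureValidXmlIdentifiers`, as described in the Text.Pandoc.Shared entry above, can be sketched in a few lines of Python. This is a simplified model and not the Haskell function itself: the names are hypothetical, "starts with a letter" is approximated with `str.isalpha`, and empty identifiers are assumed to be left alone.

```python
def ensure_valid_xml_identifier(identifier: str) -> str:
    """Prepend 'id_' when the identifier does not start with a letter."""
    if identifier and not identifier[0].isalpha():
        return "id_" + identifier
    return identifier


def fix_internal_link(target: str) -> str:
    """Apply the same change to internal links so they still resolve."""
    if target.startswith("#"):
        return "#" + ensure_valid_xml_identifier(target[1:])
    return target


print(ensure_valid_xml_identifier("1"))       # id_1
print(fix_internal_link("#1"))                # #id_1
print(ensure_valid_xml_identifier("anchor"))  # anchor
```

As the entry notes, a document that already defines an `id_1` identifier would collide with the rewritten `1`. The sketch makes no attempt to detect that case either.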
pandoc 2.17.1.1
-
Fix regression in 2.17.1 which caused problems finding default files in the default user data directory. (Reverts the item “logic bug in `fullDefaultsPath`”, which was misguided.)
-
Sample custom writer: use single quotes for strings (#7487, Albert Krewinkel).
pandoc 2.17.1
-
Support `pagedjs-cli` as pdf engine (#7838, Albert Krewinkel). PagedJS is a polyfill and supports the Paged Media standards by the W3C. https://www.pagedjs.org/
-
CommonMark reader: fix source position after YAML metadata (#7863).
-
LaTeX reader:
-
Remove retokenizing in `rawLaTeXParser`.
-
Ensure that `\raggedright` doesn’t gobble an argument (#7757).
-
Improve `descItem`. For some reason we were skipping arbitrary blocks before `\item`. This is now changed to “skip whitespace and comments.”
-
Improve handling of `\newif`. Adding a pair of braces around the second argument of `\def` prevents LaTeX from an emergency stop on input like the following (#6096).

    \newif\ifepub
    \epubtrue
    \ifepub
    hi
    \fi
-
-
Docx reader: Parse both Zotero citation and bibliography as `FieldInfo` (#7840).
-
LaTeX writer:
-
Markdown writer: handle explicit column widths with pipe tables (#7847). If a table has explicit column width information and the content extends beyond the `--columns` width, we need to adjust the widths of the pipe separators to encode this width information.
-
Docx writer: Separate tables even with RawBlocks between (#7224, Michael Hoffmann). Adjacent docx tables need to be separated by an empty paragraph. If there’s a RawBlock between tables which renders to nothing, be sure to still insert the empty paragraph so that they will not collapse together.
-
Man writer: use custom font V for inline code (#7506). The V font is defined conditionally, so that it renders like CB in output formats that support that, and like B in those that don’t (e.g. the terminal). Aliases also defined for VI, VB, VBI.
-
Asciidoc writer: Support checklists in asciidoctor writer (#7832, Nikolai Korobeinikov, ricnorr). The checklist syntax (similar to task_list in markdown) seems to be an asciidoctor-only addition.
-
HTML writer:
-
Custom writer: preserve order of element attributes (#7489, Albert Krewinkel). Attribute key-value pairs are marshaled as AttributeList, i.e., as a userdata type that behaves both like a list and a map. This makes it possible to preserve the order of key-value pairs. -
-
Switch to hslua-2.1 (Albert Krewinkel). This allows for some code simplification and improves stability.
-
Don’t read files outside of user data directory (Even Brenden). If a file path does not exist relative to the working directory, and it does exist relative to the user data directory but resolves to a location outside of the user data directory, do not read it. This applies to readDataFile and readMetadataFile in PandocMonad and, by extension, any module that uses these by passing them relative paths.
-
Text.Pandoc.Class.makeCanonical: Correctly handle consecutive “..”s at the beginning of a path (Even Brenden). Prior to this commit, ../../file would evaluate to file, when it should be unchanged.
-
Search for metadata files in $DATADIR/metadata (#7851, Even Brenden). If files specified with --metadata-file are not found in the working directory, look in $DATADIR/metadata (#5876).
-
Text.Pandoc.Class: export readMetadataFile [API change] (#5876).
-
Text.Pandoc.Error: export new PandocCouldNotFindMetadataFileError constructor for PandocError [API change] (#5876).
-
Avoid putting a frame around speaker notes in beamer (#7857). If speaker notes (a Div with class ‘notes’) occur right after a section heading, but above slide level, the resulting \note{..} command should not be wrapped in a frame, as that will cause a spurious blank slide.
-
CSS in HTML template: adjust #TOC and h1 on mobile (#7835, Mauro Bieg).
-
Text.Pandoc.Readers.LaTeX.Parsing: don’t export totoks. Make the first param of tokenize a SourcePos instead of SourceName, and use it instead of totoks.
-
Text.Pandoc.Shared: Modify stringify so it ignores [Citation] inside Cite (#7855). Otherwise we’ll sometimes get two copies of things, one from the citationPrefix or citationSuffix and another from the embedded fallback text. When there is no fallback text, we’ll get no content. However, it really isn’t an alternative to just rely on the result of running query on the embedded Citations; this will result in a jumble of text rather than anything structured.
-
Omit --enable-doc in the cabal haddock invocation in tools/build-and-upload-api-docs.sh.
-
Text.Pandoc.App.Opt: fix logic bug in fullDefaultsPath. Previously we would (also) search the default user data directory for a defaults file, even if a different user data directory was specified using --data-dir. This was a mistake; if --data-dir is used, the default user data directory should not be searched.
-
Text.Pandoc.Shared: defaultUserDataDir behavior change (#7842). If the XDG data directory is not defined (e.g. because it’s not supported in the OS or HOME isn’t defined), we return the empty string instead of raising an exception.
-
Update command tests to distinguish stderr and test exit status.
-
MANUAL: add that speaker notes can be used with beamer (#7856).
-
Update build-and-upload-api-docs.sh.
-
Document --trace option. Document no-check-certificate in defaults files. Document ‘sandbox’ option for defaults files. (#7873).
-
Fix pattern syntax in sample readability custom reader.
-
doc/custom-readers.lua: add example for “readable HTML.”
-
Fix message in man page about where code can be found.
-
manfilter.lua: remove extra indent in table cells with code blocks.
-
Fix lua-filters documentation for table column widths (#7864).
-
epub.doc: Update links to KindleGen (#7846, Benson Muite, Mauro Bieg). KindleGen has been deprecated and we need to link to archived versions.
-
Use tables in defaults files documentation, so each default option is paired with the corresponding command-line option (Carsten Allefeld).
-
Use skylighting 0.12.2.
-
Add pandoc-lua-marshal to Nix shell (#7849, Even Brenden).
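The makeCanonical fix in this release (leading “..” segments must survive canonicalization) can be illustrated with a small Python sketch. This is not pandoc's Haskell implementation; the function name make_canonical and the restriction to relative, slash-separated paths are assumptions made for illustration only.

```python
def make_canonical(path: str) -> str:
    """Collapse '.' and 'dir/..' segments of a relative path (illustrative only).

    Leading '..' segments are kept, because there is no earlier
    directory segment for them to cancel.
    """
    out = []
    for seg in path.split("/"):
        if seg in ("", "."):
            continue  # skip empty and current-directory segments
        if seg == ".." and out and out[-1] != "..":
            out.pop()  # 'dir/..' cancels the preceding directory
        else:
            out.append(seg)  # ordinary segment, or a '..' that cannot be cancelled
    return "/".join(out)


print(make_canonical("../../file"))  # ../../file
print(make_canonical("a/../b"))      # b
```

A version that dropped every “..” regardless of what preceded it would reproduce the old behavior, turning ../../file into file.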
2.17.0.1
-
Require pandoc-lua-marshal 0.1.3.1 (#7831, Albert Krewinkel). Fixes a problem with List.includes and List.find that caused a Lua stack overflow and subsequent program crash.
-
HTML template: load header-includes before math (#7833, Kolen Cheung). MathJax expects its configuration to come before the MathJax script is loaded. This change of order allows one to configure MathJax via header-includes, which now load before the MathJax script. Cf. #2750.
-
When reading a defaults file, stop at a line consisting of three dots (...). This line signals the end of a YAML document. This restores the behavior we got with HsYaml; the yaml library complains about content past this line. See #4627 (comment)
-
Text.Pandoc.Citeproc: allow notes-after-punctuation to work with numerical styles that use superscripts (e.g. american-medical-association.csl), as well as with note styles. The default setting of notes-after-punctuation is true for note styles and false otherwise. This restores a behavior of pandoc-citeproc that wasn’t properly carried over to Citeproc (#7826, cf. jgm/pandoc-citeproc#384).
-
Use commonmark-pandoc 0.2.1.2 (#7769).
-
Add FAQ on images in ipynb containers (#7749, Kolen Cheung).
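The defaults-file change in this release (stop reading at a `...` line) amounts to truncating the text at the YAML document-end marker before handing it to the YAML parser. The following Python sketch shows the idea only; the function name is hypothetical, and tolerating trailing whitespace on the marker line is an assumption rather than documented pandoc behavior.

```python
def truncate_at_document_end(text: str) -> str:
    """Keep only the lines that precede a YAML document-end marker ('...')."""
    kept = []
    for line in text.splitlines():
        if line.rstrip() == "...":  # assumption: trailing whitespace is tolerated
            break
        kept.append(line)
    return "\n".join(kept)


defaults = "from: markdown\nto: html\n...\nanything after the marker is ignored"
print(truncate_at_document_end(defaults))
```

Content after the marker never reaches the parser, so a strict YAML library has nothing to complain about.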
pandoc 2.17
-
Support markua as an output format (#1871, Tim Wisotzki and Saumel Lemmenmeier). Markua is a markdown variant used by Leanpub.
-
Add text wrapping for HTML output (#7764). Previously the HTML writer was exceptional in not being sensitive to the --wrap option. With this change --wrap now works for HTML. The default (as with other formats) is automatic wrapping. Note that the contents of script, textarea, and pre tags are always laid out with the flush combinator, so that unwanted spaces won’t be introduced if these occur in an indented context in a template.
-
Don’t read sources until in/out format are verified (#7797).
-
Issue error with --list-extensions for invalid formats (#7797).
-
Make --citeproc recognize .yml as well as .yaml extensions as YAML bibliography files (#7707, Jörn Krenzer).
-
Use latest version of KaTeX with --katex.
-
Fix parsing of footnotes in --metadata-file (#7813). Previously non-inline footnotes were not being parsed.
-
ODT reader:
- Parse list-header as a list item (Tuong Nguyen Manh).
-
Commonmark reader:
- Put sourcepos attribute on header, not enclosing div with -f commonmark+sourcepos (#7769).
-
Markdown reader:
- Don’t allow ^ at beginning of link or image label (#7723). This is reserved for footnotes. Fixes regression from 0a93acf.
- Fix parsing of “bare locators” after author-in-text citations. Previously @item [p. 12; @item2] was incorrectly parsed as three citations rather than two. This is now fixed by ensuring that prefix doesn’t gobble any semicolons.
- Revert changes to inlinesInBalancedBrackets (commit fa83246), which caused regressions.
- Improve detection of pipe table line widths (#7713). Fixed calculation of maximum column widths in pipe tables. It is now based on the length of the markdown line, rather than a “stringified” version of the parsed line. This should be more predictable for users. In addition, we take into account double-wide characters such as emojis.
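To make the pipe-table item above concrete, here is a rough Python sketch of measuring the raw markdown line while counting double-wide characters as two columns. It is an approximation for illustration, not pandoc's own width computation; the helper names are hypothetical, and the default of 72 mirrors pandoc's default --columns value.

```python
import unicodedata


def display_width(line: str) -> int:
    """Approximate column width of a string, counting wide characters as 2."""
    width = 0
    for ch in line:
        if unicodedata.combining(ch):
            continue  # combining marks occupy no column of their own
        # 'W' (wide) and 'F' (fullwidth) characters take two columns
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


def exceeds_columns(table_lines, columns=72):
    """True if any raw markdown line of the table is wider than `columns`."""
    return any(display_width(line) > columns for line in table_lines)


print(display_width("| abc |"))  # 7
print(display_width("日本"))      # 4
```

Measuring the source line rather than a stringified parse result means that markup characters such as emphasis markers or link destinations count toward the width, which is what a user sees in their editor.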
-
Custom (Lua) readers:
- First argument is now a list of sources instead of the concatenated text (Albert Krewinkel). The list structure can easily be converted to a string by applying tostring, but it is also possible to access the elements (each with a text and name). A small example is added to the custom reader documentation, showcasing its use in a reader that creates a syntax-highlighted code block for each source code file passed as input. Existing readers will still work through a fallback mechanism, issuing a deprecation notice.
-
Org reader:
- Parse official org-cite citations (#7329). We also support the older org-ref style as a fallback. We no longer support the “markdown style” or “Berkeley style” citations.
- Support alphabetical (fancy) lists (Lucas Viana). When the fancy_lists extension is enabled, alphabetical list markers are allowed, mimicking the behaviour of Org Mode when org-list-allow-alphabetical is enabled.
- Support counter cookies in lists (Lucas Viana). Such cookies are used to override the item counter in ordered lists. In org it is possible to set the counter at any list item, but since Pandoc AST does not support this, we restrict the usage to setting an offset for the entire ordered list, by using the cookie in the first list item.
- Allow trailing spaces after key/value pairs in directives (Albert Krewinkel). Ensures that spaces at the end of attribute directives like #+ATTR_HTML: :width 100% (note the trailing spaces) are accepted.
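The counter-cookie restriction described above (only the cookie on the first item sets the start number of the whole list) can be sketched as follows. This is an illustrative Python model, not the Org reader's Haskell code; the regular expression covers only numeric markers and numeric cookies such as [@5], and the function name is hypothetical.

```python
import re

# An ordered-list marker ("1." or "1)") followed by an Org counter cookie, e.g. "[@5]".
COOKIE = re.compile(r"^\s*\d+[.)]\s+\[@(\d+)\]")


def list_start(first_item: str, default: int = 1) -> int:
    """Return the start number of an ordered list, read from its first item."""
    match = COOKIE.match(first_item)
    return int(match.group(1)) if match else default


print(list_start("1. [@5] fifth item"))  # 5
print(list_start("1. first item"))       # 1
```

Cookies on later items are simply not consulted in this model, matching the stated limitation that the AST stores one start number per list.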
-
LaTeX reader:
- Omit visible content for \label{...}. Previously we included the text of the label in square brackets, but this is undesirable in many cases. See discussion in #813 (comment).
- Improve references (#813). Resolve references to theorem environments. Remove the Span caused by “label” in figure, table, and theorem environments; this had an id that duplicated the environments’ id.
- Fix semantics of \ref. We were including the ams environment type in addition to the number. This is proper behavior for \cref but not for \ref. To support \cref we need to store the environment label separately.
- Add babel mappings for Gujarati (gu) and Oriya (or) (#7815).
- Fix typo panjabi -> punjabi in babel mappings (#7814).
-
HTML reader:
- Parse attributes on links and images (#6970).
-
Docx reader:
- Handle multiple pic elements inside a drawing (#7786).
- Change elemToParPart to return [ParPart] instead of ParPart. Also remove NullParPart constructor, as it is no longer needed. This will allow us to handle elements that contain multiple ParParts, e.g. w:drawing elements with multiple pic:pic.
-
DocBook reader:
-
Markdown writer:
- Add new exported function writeMarkua from Text.Pandoc.Writers.Markdown [API change] (#1871, Tim Wisotzki and Saumel Lemmenmeier).
- Fix indentation issue in footnotes (#7801).
- Avoid extra space before citation suffix if it already starts with a space.
- Ensure semicolon between the locator and the next citation when an author-in-text citation has a locator and following citations.
- Improve escaping for # (#7726).
-
Custom (Lua) writers:
-
Allow variables to be set via second return value of Doc (#6731, Albert Krewinkel). New template variables can be added by giving variable-value pairs as a second return value of the global function Doc. Example:
    function Doc (body, meta, vars)
      vars.date = vars.date or os.date '%B %e, %Y'
      return body, vars
    end
-
Provide global PANDOC_WRITER_OPTIONS (#6731, Albert Krewinkel).
-
Assign default Pandoc object to global PANDOC_DOCUMENT (Albert Krewinkel). The default Pandoc object is now non-strict, i.e., only the parts of the document that are accessed will be marshaled to Lua. A special type is no longer necessary. This change also makes it possible to use the global variable with library functions such as pandoc.utils.references, or to inspect the document contents with walk().
-
-
LaTeX writer:
- Fix typo panjabi -> punjabi in babel mappings (#7814).
-
MediaWiki writer:
- Remove redundant display text for wiki links (Jesse Hathaway).
-
Docx writer:
- Handle bullets correctly in lists by not reusing numIds (#7689, Michael Hoffmann). This fixes a bug in which a Div in a list item would receive bullets on its contained paragraphs.
-
Org writer:
- Fix list items starting with a code block or other non-paragraph content (#7810).
- Avoid blank lines after tight sublists (#7810).
- Fix extra blank line inserted after empty list item (#7810).
- Don’t add blank line before lists (#7810).
- Support starting number cookies (Lucas Viana). This is necessary for lists that start at a number other than 1.
- Support the new org-cite syntax (#7329).
-
Haddock writer:
- Avoid blank lines after tight sublists (#7810).
-
Ipynb writer:
- Ensure deterministic order of keys.
- Handle cell output with raw block of markdown (#7563, Kolen Cheung). Write RawBlock of markdown in code-cell output. This is designed to fit the behavior of #7561, which makes the ipynb reader parse code-cell output with mime “text/markdown” to a RawBlock of markdown. This commit makes the ipynb writer write this RawBlock of markdown back inside a code-cell output with the same mime, preserving this information in round-trip.
- In choosing between multiple output options, always favor those marked with the output format over images (Kolen Cheung). Previously, both the fmt == f case and Image had a rank of 1.
-
Ipynb reader & writer: properly handle cell “id” (#7728). This is passed through if it exists (in Nb4); otherwise the writer will add a random one so that all cells have an “id”.
-
Ms writer:
- Properly encode strings for PDF contents (#7731).
-
JATS writer:
- Keep quotes in element-citations (Albert Krewinkel). Fixed a bug that led to quote characters being lost in element-citations.
-
RTF writer:
- Properly handle images in data URIs (#7771).
-
Commonmark writer:
- Allow ‘)’ delimiters on ordered lists.
-
RST writer:
- Avoid extra blank line after empty list item (#7810).
-
HTML writer:
- Make line breaks more consistent. With --wrap=none, we now output line breaks between block-level elements. Previously they were omitted entirely, so the whole document was on one line, unless there were literal line breaks in pre sections. This makes the HTML writer’s behavior more consistent with that of other writers. Also, regardless of wrap settings, put newline after <dd> and after block-level elements in the footnotes section. And add a line break between an img tag and the associated figcaption.
- reveal.js: Make sure images with r-stretch are not in p tags. They must be direct children of the section. There was previously code to make this work with the older class name stretch, but the name has changed in reveal.js.
- reveal.js: don’t add r-fit-text class to section. It must go on the header only.
-
AsciiDoc writer:
- Improve detection of intraword emphasis (#7803).
-
OpenDocument writer:
- Fix vertical alignment bug with...
pandoc 2.16.2
-
Add interface for custom readers written in Lua (#7669). Users can now do -f myreader.lua and pandoc will treat the script myreader.lua as a custom reader, which parses an input string to a pandoc AST, using the pandoc module defined for Lua filters. A sample custom reader can be found in data/creole.lua. Also see documentation in doc/custom-readers.md.
-
New module Text.Pandoc.Readers.Custom, exporting readCustom [API change].
-
Allow plain to be used in raw attribute syntax.
-
Accept empty --metadata-file (#7675). This was a regression from 2.15 behavior.
-
Markdown reader: Improve inlinesInBalancedBrackets. This is just a small improvement in terms of performance, but it’s simpler and more direct code. Also, we avoid parsing interparagraph spaces in balanced brackets, as the original did.
-
BibTeX reader: Properly handle commented lines in BibTeX/BibLaTeX (#7668).
-
RST reader: handle class attribute for custom roles (#7699, willj-dev). Previously the class attribute was ignored, and the name of the role used as the class.
-
DocBook reader:
- Add <titleabbr> support (Rowan Rodrik van der Molen).
- Support for <indexterm> (#7607, Rowan Rodrik van der Molen).
-
LaTeX reader:
-
JATS reader: Capture alt-text in figures (#7703, Aner Lucero).
-
MediaWiki writer: use HTML spans for anchors when header has id (#7697). We need to generate a span when the header’s ID doesn’t match the one MediaWiki would generate automatically. Note that MediaWiki’s generation scheme is different from pandoc’s (it uses uppercase letters, and _ instead of -, for example). This means that in going from markdown to mediawiki, we’ll now get spans before almost every heading, unless explicit identifiers are used that correspond to the ones MediaWiki auto-generates. This is uglier output but it’s necessary for internal links to work properly.
-
Markdown writer: don’t create autolinks when this loses information (#7692). Previously we sometimes lost attributes when rendering links as autolinks.
-
Text.Pandoc.Readers.Metadata: allow multiple YAML documents when parsing YAML for yamlBsToRefs. Some people use --- as the end delimiter in YAML bibliography files, which causes the yaml library to emit an error unless we explicitly allow multiple YAML documents (and just consider the first).
-
JATS writer:
- Ensure figures are wrapped with <p> in list items (Albert Krewinkel). This prevents the generation of invalid output.
- Add URL to element citation entries (Albert Krewinkel). The URL of a reference, if present, is added in tag <uri> to element-citation entries.
-
HTML writer: Don’t create invalid data- attribute for empty attribute key (#7546).
-
LaTeX writer:
- Babel mappings: use ancientgreek for grc.
- With -t latex-smart, don’t generate \ldots from ellipsis (#7674). Instead just use unicode ellipsis.
-
JATS template: fix equal-contrib attribute (Albert Krewinkel). The standard requires the value to be either yes or no, but it was set to true for authors who contributed equally.
-
reveal.js template: Add disableLayout variable (Christophe Dervieux).
-
Text.Pandoc.Error: sort errors in handleError by exit code (Albert Krewinkel).
-
Text.Pandoc.Writers.Shared: Improve toLegacyTable (#7683, Christian Despres).
-
Lua subsystem:
-
Include lpeg module (#7649, Albert Krewinkel). Compiles the lpeg library (Parsing Expression Grammars For Lua) into the program. Package maintainers may choose to rely on package dependencies to make lpeg available, in which case they can compile with the constraint lpeg +rely-on-shared-lpeg-library. lpeg and re are always made available in global variables, without the need for a require.
-
Set lpeg and re as globals; allow shared lib access via require. The lpeg and re modules are loaded into globals of the respective name, but they are not necessarily registered as loaded packages. This ensures that
- the built-in library versions are preferred when setting the globals,
- a shared library is used if pandoc has been compiled without lpeg, and
- the require mechanism can be used to load the shared library if available, falling back to the internal version if possible and necessary.
-
Fix argument order in constructor pandoc.Cite (Albert Krewinkel). This restores the old behavior; argument order had been switched accidentally in pandoc 2.15.
-
Add Pushable instance for ReaderOptions (Albert Krewinkel).
-
Allow passing custom reader options to pandoc.read as an optional third argument (#7656, Albert Krewinkel). The object can either be a table or a ReaderOptions value like PANDOC_READER_OPTIONS. Creating new ReaderOptions objects is possible through the new constructor pandoc.ReaderOptions.
-
Display Pandoc values using their native Haskell representation (Albert Krewinkel).
-
Require latest hslua (2.0.1) (#7661, #7657, Albert Krewinkel). This fixes issues with
- misleading error messages when a required function parameter is omitted;
- absent properties still being listed in the output of pairs; and
- alias accessing leading to errors instead of returning nil, e.g. with (pandoc.Str '').identifier.
-
Add missing space in “package not found” message (#7658, Albert Krewinkel).
-
-
Update build files (#7696, Fabián Heredia Montiel). Drop old windows 32-bit constraints. Update cabal tested-with field to correspond to ci.yml matrix.
-
Remove unneeded package dependencies from benchmark target.
-
Require ghc >= 8.6, base >= 4.12. This allows us to get rid of the old custom prelude and some crufty cpp. But the primary reason for this is that conduit has bumped its base lower bound to 4.12, making it impossible for us to support lower base versions.
-
Require Cabal 2.4. Use wildcards to ensure that all pptx tests are included (#7677).
-
Update bash_completion.tpl (S.P.H.).
-
Add data/creole.lua as sample custom reader.
-
Add doc/custom-readers.md and doc/custom-writers.md.
-
doc/lua-filters.md: add section on global modules, including lpeg (Albert Krewinkel).
-
MANUAL.txt: update table of exit codes and corresponding errors (Albert Krewinkel).
-
Use latest texmath.