Releases: jgm/pandoc
pandoc 1.13.2
This is mainly a spit-and-polish release, though there is one new reader and some minor new features. Note that, for the first time, we are providing a Linux binary (64-bit Debian/Ubuntu).
- TWiki Reader: add new twiki reader (API change, Alexander Sulfrian).
- Markdown reader:
- Better handling of paragraph in div (#1591). Previously text that ended a div would be parsed as Plain unless there was a blank line before the closing div tag.
- Don't treat a citation as a reference link label (#1763).
- Fix autolinks with following punctuation (#1811). The price of this is that autolinked bare URIs can no longer contain `>` characters, but this is not a big issue.
- Fix `Ext_lists_without_preceding_blankline` bug (#1636, Artyom).
- Allow `startnum` to work without `fancy_lists`. Formerly `pandoc -f markdown-fancy_lists+startnum` did not work properly.
- RST reader (all Daniel Bergey):
- Parse quoted literal blocks (#65). RST quoted literal blocks are the same as indented literal blocks (which pandoc already supports) except that the quote character is preserved in each line.
- Parse RST class directives. The class directive accepts one or more class names, and creates a Div value with those classes. If the directive has an indented body, the body is parsed as the children of the Div. If not, the first block following the directive is made a child of the Div. This differs from the behavior of rst2xml, which does not create a Div element. Instead, the specified classes are applied to each child of the directive. However, most Pandoc Block constructors do not take an Attr argument, so we can't duplicate this behavior.
- Warn about skipped directives.
- Literal role now produces Code. Code role should have "code" class.
- Improved support for custom roles:
- Added `sourceCode` to classes for `:code:` role, and anything inheriting from it.
- Add the name of the custom role to classes if the Inline constructor supports Attr.
- If the custom role directive does not specify a parent role, inherit from the `:span:` role.
This differs somewhat from the `rst2xml.py` behavior. If a custom role inherits from another custom role, Pandoc will attach both roles' names as classes. `rst2xml.py` will only use the class of the directly invoked role (though in the case of inheriting from a `:code:` role with a `:language:` defined, it will also provide the inherited language as a class).
- Warn about ignored fields in role directives.
- LaTeX reader:
- Parse label after caption into a span instead of inserting an additional paragraph of bracketed text (#1747).
- Parse math environments as inline when possible (#1821).
- Better handling of `\noindent` and `\greektext` (#1783).
- Handle `\texorpdfstring` more gracefully.
- Handle `\cref` and `\sep` (Wikiwide).
- Support `\smartcite` and `\Smartcite` from biblatex.
- HTML reader:
- Retain display type of MathML output (#1719, Matthew Pickering).
- Recognise `<br>` tags inside `<pre>` blocks (#1620, Matthew Pickering).
- Make `embed` tag either block or inline (#1756).
- DocBook reader:
- Handle `keycombo`, `keycap` (#1815).
- Get string content in inner tags for literal elements (#1816).
- Handle `menuchoice` elements better, with a `>` between (#1817).
- Include `id` on section headers (#1818).
- Document/test "type" as implemented (Brian O'Sullivan).
- Add support for calloutlist and callout (Brian O'Sullivan). We treat a calloutlist as a bulleted list. This works well in practice.
- Add support for `classname` (Bryan O'Sullivan).
- Docx reader:
- Fix Windows path for image lookup (Jesse Rosenthal). Don't use the OS-sensitive "combine", since we always want the paths in our zip-archive to use forward slashes.
- Single-item headers in ordered lists are headers (Jesse Rosenthal). When users number their headers, Word understands that as a single item enumerated list. We make the assumption that such a list is, in fact, a header.
- Rewrite rewriteLink to work with new headers (Jesse Rosenthal). There could be new top-level headers after making lists, so we have to rewrite links after that.
- Use polyglot header list (Jesse Rosenthal). We're just keeping a list of header formats that different languages use as their default styles. At the moment, we have English, German, Danish, and French. We can continue to add to this. This is simpler than parsing the styles file, and perhaps less error-prone, since there seems to be some variations, even within a language, of how a style file will define headers.
- Remove header class properly in other langs (Jesse Rosenthal). When we encounter one of the polyglot header styles, we want to remove that from the par styles after we convert to a header. To do that, we have to keep track of the style name, and remove it appropriately.
- Account for external link URLs with anchors. Previously, if a URL had an anchor, the reader would incorrectly identify it as an internal link and return only the anchor as URL. (Caleb McDaniel)
- Fix for Issue #1692 (i18n styles) (Nikolay Yakimov).
- Org reader:
- Added state changing blanklines (Jesse Rosenthal). This allows us to emphasize at the beginning of a new paragraph (or, in general, after blank lines).
- Fixed bug with bulleted lists:

      - a
      - b
      * c

  was being parsed as a list, even though an unindented `*` should make a heading. See http://orgmode.org/manual/Plain-lists.html#fn-1.
- Org reader: absolute, relative paths in link (#1741, Albert Krewinkel). The org reader was too restrictive when parsing links; some relative links and links to files given as absolute paths were not recognized correctly.
- Org reader: allow empty links (jgm/gitit#471, Albert Krewinkel). This is important for use in gitit, which uses empty links for wikilinks.
- Respect indent when parsing Org bullet lists (#1650, Timothy Humphries). Fixes issue with top-level bullet list parsing.
- Fix indent issue for definition lists (Timothy Humphries, see #1650, #1698, #1680).
- Parse multi-inline terms correctly in definition list (#1649, Matthew Pickering).
- Fix rules for emphasis recognition (Albert Krewinkel). Things like `/hello,/` or `/hi'/` were falsely recognized as emphasised strings. This is wrong, as `,` and `'` are forbidden border chars and may not occur on the inner border of emphasized text.
- Drop COMMENT document trees (Albert Krewinkel). Document trees under a header starting with the word `COMMENT` are comment trees and should not be exported. Those trees are dropped silently (#1678).
- Properly handle links to `file:target` (Albert Krewinkel). Org links like `[[file:target][title]]` were not handled correctly, parsing the link target verbatim. The org reader is changed such that the leading `file:` is dropped from the link target (see #756, #1812).
- Parse LaTeX-style MathML entities (#1657, Albert Krewinkel). Org supports special symbols which can be included using LaTeX syntax, but are actually MathML entities. Examples for this are `\nbsp` (non-breaking space), `\Aacute` (the letter A with accent acute) or `\copy` (the copyright sign ©).
- EPUB reader:
- URI handling improvements. Now we outsource most of the work to `fetchItem'`. Also, do not include queries in file extensions (#1671).
- LaTeX writer:
- Use `\texorpdfstring` for section captions when needed (Vaclav Zeman).
- Handle consecutive linebreaks (#1733).
- Protect graphics in headers (Jesse Rosenthal). Graphics in `\section`/`\subsection` etc titles need to be `\protect`ed.
- Put `~` before header in list item text (Jesse Rosenthal). Because of the built-in line skip, LaTeX can't handle a section header as the first element in a list item.
- Avoid using reserved characters as `\lstinline` delimiters (#1595).
- Better handling of display math in simple tables (#1754). We convert display math to inline math in simple tables, since LaTeX can't deal with display math in simple tables.
- Escape spaces in code (#1694, Bjorn Buckwalter).
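As a rough illustration of the `\lstinline` delimiter item above, the sketch below picks a delimiter that does not occur in the code being wrapped and is not a TeX-reserved character. The candidate list and the function name `lstinline` are assumptions made for this example; they are not taken from pandoc's source.

```python
# Hypothetical candidate delimiters: none of them is a TeX-reserved
# character such as %, #, {, } or \. Pandoc's real list may differ.
CANDIDATES = "!\"&'()*,-./:;?@_"


def lstinline(code: str) -> str:
    """Wrap code in \\lstinline using a delimiter absent from the code."""
    for delim in CANDIDATES:
        if delim not in code:
            return "\\lstinline" + delim + code + delim
    # Every candidate occurs in the code; a real writer would need a fallback.
    raise ValueError("no usable delimiter for this code")
```

With these assumptions, code containing `!` simply moves on to the next free candidate.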
- MediaWiki writer:
- Fixed links with URL = text. Previously these were rendered as bare words, even if the URL was not an absolute URL (#1825).
- ICML writer:
- Don't force all citations into footnotes.
- RTF writer:
- Add blankline at end of output (#1732, Matthew Pickering).
- RST writer:
- Ensure blank line after figure.
- Avoid excess whitespace after last list item (#1777).
- Wrap line blocks with spaces before continuations (#1656).
- Fixed double-rendering of footnotes in RST tables (#1769).
- DokuWiki writer:
- Better handling of block quotes. This change ensures that multiple paragraph blockquotes are rendered using native `>` rather than as HTML (#1738).
- Fix external images (#1739). Preface relative links with ":", absolute URIs without. (Timothy Humphries)
- HTML writer:
- Use protocol-relative URL for mathjax.
- Put newline between img and caption paragraph.
- MathML is now output with tex annotation (#1635, Matthew Pickering).
- Add support for KaTeX HTML math (#1626, Matthew Pickering). This adds `KaTeX` to `HTMLMathMethod` (API change).
- Don't double render when `email-obfuscation=none` (#1625, Matthew Pickering).
- Make header attributes work outside top level (#1711). Previously they only appeared on top level header elements. Now they work e.g. in blockquotes.
- ODT writer:
- Correctly handle images without extensions (#1729).
- Strip querystring in ODT writer (#1682, Todd Sifleet).
- FB2 writer:
- Add newline to output.
- EPUB writer:
- Don't add `sourceURL` to absolute URIs (#1669).
- Don't use unsupported `opf:title-type` for epub2.
- Include "landmarks" section in nav document for epub3 (#1757).
- Removed playOrder from navpoint elements in ncx ...
pandoc 1.13.1
- Fixed `--self-contained` with Windows paths (#1558). Previously `C:\foo.js` was being wrongly interpreted as a URI.
- HTML reader: improved handling of tags that can be block or inline. Previously a section like this would be enclosed in a paragraph, with RawInline for the video tags (since video is a tag that can be either block or inline):
<video controls="controls">
<source src="../videos/test.mp4" type="video/mp4" />
<source src="../videos/test.webm" type="video/webm" />
<p>
The videos can not be played back on your system.<br/>
Try viewing on Youtube (requires Internet connection):
<a href="http://youtu.be/etE5urBps_w">Relative Velocity on
Youtube</a>.
</p>
</video>
This change will cause the video and source tags to be parsed as RawBlock instead, giving better output. The general change is this: when we're parsing a "plain" sequence of inlines, we don't parse anything that COULD be a block-level tag.
- Docx reader:
- Be sensitive to user styles. Note that "Hyperlink" is "blacklisted," as we don't want the default underline styling to be inherited by all links by default (Jesse Rosenthal).
- Read single paragraph in table cell as `Plain` (Jesse Rosenthal). This makes the docx reader's native output fit with the way the markdown reader understands its markdown output.
- Textile writer: Extended the range of cases where native textile tables will be used (as opposed to raw HTML): we now handle any alignment type, but only for simple tables with no captions.
- Txt2Tags reader:
- Header is now parsed only if standalone flag is set (Matthew Pickering).
- The header is now parsed as meta information. The first line is the `title`, the second is the `author` and third line is the `date` (Matthew Pickering).
- Corrected formatting of `%%mtime` macro (Matthew Pickering).
- Fixed crash when reading from stdin.
- EPUB writer: Don't use page-progression-direction in EPUB2, which doesn't support it. Also, if page-progression-direction is not specified in metadata, don't include the attribute even in EPUB3; not including it is the same as including it with the value "default", as we did before. (#1550)
- Org writer: Accept example lines with indentation at the beginning (Calvin Beck).
- DokuWiki writer:
- Refactor to use Reader monad (Matthew Pickering).
- Avoid using raw HTML in table cells; use `\\` in place of newlines (Jesse Rosenthal).
- Properly handle HTML table cell alignments, and use spacing to make the tables look prettier (#1566).
- Docx writer:
- Bibliography entries get `Bibliography` style (#1559).
- Implement change tracking (Jesse Rosenthal).
- LaTeX writer:
- Fixed a bug that caused a table caption to repeat across all pages (Jose Luis Duran).
- Improved vertical spacing in tables and made it customizable using standard lengths set by booktabs. See https://groups.google.com/forum/#!msg/pandoc-discuss/qMu6_5lYy0o/ZAU7lzAIKw0J (Jose Luis Duran).
- Added `\strut` to fix spacing in multiline tables (Jose Luis Duran).
- Use `\tabularnewline` instead of `\\` in table cells (Jose Luis Duran).
- Made horizontal rules more flexible (Jose Luis Duran).
- Text.Pandoc.MIME:
- Added `MimeType` (type synonym for `String`) and `getMimeTypeDef`. Code cleanups (Artyom Kazak).
- Templates:
- LaTeX template: disable microtype protrusion for typewriter font (#1549, thanks lemzwerg).
- Improved OSX build procedure.
- Added `network-uri` flag, to deal with split of `network-uri` from `network`.
- Fix build dependencies for the `trypandoc` flag, so that they are ignored if `trypandoc` flag is set to False (Gabor Pali).
- Updated README to remove outdated claim that `--self-contained` looks in the user data directory for missing files.
pandoc 1.13.0.1
This release fixes a couple of serious regressions in 1.13.
- Docx writer:
- Fixed regression which bungled list numbering (#1544), causing all lists to appear as basic ordered lists.
- Include row width in table rows (Christoffer Ackelman, Viktor Kronvall). Added a property to all table rows where the sum of column widths is specified in pct (fraction of 5000). This helps persuade Word to lay out the table with the widths we specify.
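The `pct` arithmetic in the item above can be illustrated with a small sketch. It assumes that column widths are available as fractions of the full text width and that 5000 units correspond to 100%, as the item states; the function name `row_width_pct` is hypothetical.

```python
def row_width_pct(col_fractions):
    """Sum column widths (fractions of the full width) and express the
    total in units where 5000 means 100% of the available width."""
    total = sum(col_fractions)
    return round(total * 5000)
```

For example, three columns taking 25%, 25% and 50% of the width add up to the full 5000 units, while two columns taking 30% and 20% give half of that.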
- Fixed a bug in Windows 8 which caused pandoc not to find the `pandoc-citeproc` filter (#1542).
- Docx reader: miscellaneous under-the-hood improvements (Jesse Rosenthal). Most significantly, the reader now uses Builder, leading to some performance improvements.
- HTML reader: Parse appropriately styled span as SmallCaps.
- Markdown writer: don't escape `$`, `^`, `~` when `tex_math_dollars`, `superscript`, and `subscript` extensions, respectively, are deactivated (#1127).
- Added `trypandoc` flag to build CGI executable used in the online demo.
- Makefile: Added 'quick', 'osxpkg' targets.
- Updated README in templates to indicate templates license. The templates are dual-licensed, BSD3 and GPL2+.
pandoc 1.13
New features
- Added `docx` as an input format (Jesse Rosenthal). The docx reader includes conversion of native Word equations to pandoc LaTeX `Math` elements. Metadata is taken from paragraphs at the beginning of the document with styles `Author`, `Title`, `Subtitle`, `Date`, and `Abstract`.
- Added `epub` as an input format (Matthew Pickering). The epub reader includes conversion of MathML to pandoc LaTeX `Math` elements.
- Added `t2t` (Txt2Tags) as an input format (Matthew Pickering). Txt2tags is a lightweight markup format described at http://txt2tags.org/.
- Added `dokuwiki` as an output format (Clare Macrae).
- Added `haddock` as an output format.
- Added `--extract-media` option to extract media contained in a zip container (docx or epub) while adjusting image paths to point to the extracted images.
- Added a new markdown extension, `compact_definition_lists`, that restores the syntax for definition lists of pandoc 1.12.x, allowing tight definition lists with no blank space between items, and disallowing lazy wrapping. (See below under behavior changes.)
- Added an extension `epub_html_exts` for parsing HTML in EPUBs.
- Added extensions `native_spans` and `native_divs` to activate parsing of material in HTML span or div tags as Pandoc Span inlines or Div blocks.
- `--trace` now works with the Markdown, HTML, Haddock, EPUB, Textile, and MediaWiki readers. This is an option intended for debugging parsing problems; ordinary users should not need to use it.
Behavior changes
- Changed behavior of the `markdown_attribute` extension, to bring it in line with PHP markdown extra and multimarkdown. Setting `markdown="1"` on an outer tag affects all contained tags, recursively, until it is reversed with `markdown="0"` (#1378).
- Revised markdown definition list syntax (#1429). Both the reader and writer are affected. This change brings pandoc's definition list syntax into alignment with that used in PHP markdown extra and multimarkdown (with the exception that pandoc is more flexible about the definition markers, allowing tildes as well as colons). Lazily wrapped definitions are now allowed. Blank space is required between list items. The space before a definition is used to determine whether it is a paragraph or a "plain" element. WARNING: This change may break existing documents! Either check your documents for definition lists without blank space between items, or use `markdown+compact_definition_lists` for the old behavior.
- `.numberLines` now works in fenced code blocks even if no language is given (#1287, jgm/highlighting-kate#40).
- Improvements to `--filter`:
- Don't search PATH for a filter with an explicit path. This fixed a bug wherein `--filter ./caps.py` would run `caps.py` from the system path, even if there was a `caps.py` in the working directory.
- Respect shebang if filter is executable (#1389).
- Don't print misleading error message. Previously pandoc would say that a filter was not found, even in a case where the filter had a syntax error.
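The explicit-path rule for `--filter` described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `resolve_filter`, the `search_dirs` argument standing in for PATH, and the injected `is_executable` predicate are all hypothetical, and pandoc's own lookup is written in Haskell and may differ in detail.

```python
import posixpath


def resolve_filter(name, search_dirs, is_executable):
    """Return the filter path that would be run, or None if nothing is found.

    name          -- the argument given to --filter
    search_dirs   -- directories standing in for PATH (assumption)
    is_executable -- predicate telling whether a path can be executed
    """
    # A name with a directory component (e.g. ./caps.py) is an explicit
    # path: use it as given and never consult the search directories.
    if posixpath.dirname(name):
        return name if is_executable(name) else None
    # A bare name is looked up in each search directory in order.
    for directory in search_dirs:
        candidate = posixpath.join(directory, name)
        if is_executable(candidate):
            return candidate
    return None
```

With both `./caps.py` and `/usr/bin/caps.py` present, passing `./caps.py` selects the local file, while the bare name `caps.py` selects the one found in the search directories.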
- HTML reader:
- Parse `div` and `span` elements even without `--parse-raw`, provided `native_divs` and `native_spans` extensions are set. Motivation: these now generate native pandoc Div and Span elements, not raw HTML.
- Parse EPUB-specific elements if the `epub_html_exts` extension is enabled. These include `switch`, `footnote`, `rearnote`, `noteref`.
- Org reader:
- Support for inline LaTeX. Inline LaTeX is now accepted and parsed by the org-mode reader. Both math symbols (like `\tau`) and LaTeX commands (like `\cite{Coffee}`) can be used without any further escaping (Albert Krewinkel).
- Textile reader and writer:
- The `raw_tex` extension is no longer set by default. You can enable it with `textile+raw_tex`.
- DocBook reader:
- Support `equation`, `informalequation`, `inlineequation` elements with `mml:math` content. This is converted into LaTeX and put into a Pandoc Math inline.
- Revised `plain` output, largely following the style of Project Gutenberg:
- Emphasis is rendered with `_underscores_`, strong emphasis with ALL CAPS.
- Headings are rendered differently, with space to set them off, not with setext style underlines. Level 1 headers are ALL CAPS.
- Math is rendered using unicode when possible, but without the distracting emphasis markers around variables.
- Footnotes use a regular `[n]` style.
- Markdown writer:
- Horizontal rules are now a line across the whole page.
- Prettier pipe tables. Columns are now aligned (#1323).
- Respect the `raw_html` extension. `pandoc -t markdown-raw_html` no longer emits any raw HTML, including span and div tags generated by Span and Div elements.
- Use span with style for `SmallCaps` (#1360).
- HTML writer:
- Autolinks now have class `uri`, and email autolinks have class `email`, so they can be styled.
- Docx writer:
- Document formatting is carried over from `reference.docx`. This includes margins, page size, page orientation, header, and footer, including images in headers and footers.
- Include abstract (if present) with `Abstract` style (#1451).
- Include subtitle (if present) with `Subtitle` style, rather than tacking it on to the title (#1451).
- Org writer:
- Write empty span elements with an id attribute as org anchors. For example `Span ("uid",[],[]) []` becomes `<<uid>>`.
- LaTeX writer:
- Put table captions above tables, to match the conventional standard. (Previously they appeared below tables.)
- Use `\(..\)` instead of `$..$` for inline math (#1464).
- Use `\nolinkurl` in email autolinks. This allows them to be styled using `\urlstyle{tt}`. Thanks to Ulrike Fischer for the solution.
- Use `\textquotesingle` for `'` in inline code. Otherwise we get curly quotes in the PDF output (#1364).
- Use `\footnote<.>{..}` for notes in beamer, so that footnotes do not appear before the overlays in which their markers appear (#1525).
- Don't produce a `\label{..}` for a Div or Span element. Do produce a `\hyperdef{..}` (#1519).
- EPUB writer:
- If the metadata includes `page-progression-direction` (which can be `ltr` or `rtl`), the `page-progression-direction` attribute will be set in the EPUB spine (#1455).
- Custom lua writers:
- Custom writers now work with `--template`.
- Removed HTML header scaffolding from `sample.lua`.
- Made citation information available in lua writers.
- `--normalize` and `Text.Pandoc.Shared.normalize` now consolidate adjacent `RawBlock`s when possible.
API changes
- Added `Text.Pandoc.Readers.Docx`, exporting `readDocx` (Jesse Rosenthal).
- Added `Text.Pandoc.Readers.EPUB`, exporting `readEPUB` (Matthew Pickering).
- Added `Text.Pandoc.Readers.Txt2Tags`, exporting `readTxt2Tags` (Matthew Pickering).
- Added `Text.Pandoc.Writers.DokuWiki`, exporting `writeDokuWiki` (Clare Macrae).
- Added `Text.Pandoc.Writers.Haddock`, exporting `writeHaddock`.
- Added `Text.Pandoc.MediaBag`, exporting `MediaBag`, `lookupMedia`, `insertMedia`, `mediaDirectory`, `extractMediaBag`. The docx and epub readers return a pair of a `Pandoc` document and a `MediaBag` with the media resources they contain. This can be extracted using `--extract-media`. Writers that incorporate media (PDF, Docx, ODT, EPUB, RTF, or HTML formats with `--self-contained`) will look for resources in the `MediaBag` generated by the reader, in addition to the file system or web.
- `Text.Pandoc.Readers.TexMath`: Removed deprecated `readTeXMath`. Renamed `readTeXMath'` to `texMathToInlines`.
- `Text.Pandoc`: Added `Reader` data type (Matthew Pickering). `readers` now associates names of readers with `Reader` structures. This allows inclusion of readers, like the docx reader, that take binary rather than textual input.
- `Text.Pandoc.Shared`:
- Added `capitalize` (Artyom Kazak), and replaced uses of `map toUpper` (which give bad results for many languages).
- Added `collapseFilePath`, which removes intermediate `.` and `..` from a path (Matthew Pickering).
- Added `fetchItem'`, which works like `fetchItem` but searches a `MediaBag` before looking on the net or file system.
- Added `withTempDir`.
- Added `removeFormatting`.
- Added `extractSpaces` (from HTML reader) and generalized its type so that it can be used by the docx reader (Matthew Pickering).
- Added `ordNub`.
- Added `normalizeInlines`, `normalizeBlocks`.
- `normalize` is now `Pandoc -> Pandoc` instead of `Data a => a -> a`. Some users may need to change their uses of `normalize` to the newly exported `normalizeInlines` or `normalizeBlocks`.
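To make the `collapseFilePath` entry above concrete, here is a minimal Python sketch of what "removing intermediate `.` and `..`" can mean. It is written from the one-line description only: the name `collapse_file_path`, the use of `/` as the only separator, and the handling of leading `..` segments are assumptions, not a port of the Haskell function.

```python
def collapse_file_path(path: str) -> str:
    """Drop '.' segments and cancel each '..' against the directory
    before it, leaving leading '..' segments untouched."""
    parts = []
    for segment in path.split("/"):
        if segment == ".":
            continue  # a '.' segment never changes the target
        if segment == ".." and parts and parts[-1] not in ("..", ""):
            parts.pop()  # '..' cancels the preceding real directory
        else:
            parts.append(segment)
    return "/".join(parts)
```

Under these assumptions `a/./b/../c` collapses to `a/c`, while `../x` is returned unchanged because there is nothing before the `..` to cancel.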
- `Text.Pandoc.Options`:
- Added `writerMediaBag` to `WriterOptions`.
- Removed deprecated and no longer used `readerStrict` in `ReaderOptions`. This is handled by `readerExtensions` now.
- Added `Ext_compact_definition_lists`.
- Added `Ext_epub_html_exts`.
- Added `Ext_native_divs` and `Ext_native_spans`. This allows users to turn off the default pandoc behavior of parsing contents of div and span tags in markdown and HTML as native pandoc Div blocks and Span inlines.
- `Text.Pandoc.Parsing`:
- Generalized `readWith` to `readWithM` (Matthew Pickering).
- Export `runParserT` and `Stream` (Matthew Pickering).
- Added `HasQuoteContext` type class (Matthew Pickering).
- Generalized types of `mathInline`, `smartPunctuation`, `quoted`, `singleQuoted`, `doubleQuoted`, `failIfInQuoteContext`, `applyMacros` (Matthew Pickering).
- Added custom `token` (Matthew Pickering).
- Added `stateInHtmlBlock` to `ParserState`. This is used to keep track of the ending tag we're waiting for when we're parsing inside HTML block tags.
- Added `stateMarkdownAttribute` to `ParserState`. This is used to keep track of whether the markdown attribute has been set in an enclosing tag.
- Generalized type of `registerHeader`, using new type classes `HasReaderOptions`, `...
pandoc 1.12.4.2
- Require highlighting-kate >= 0.5.8. Fixes a performance regression.
- Shared: `addMetaValue` now behaves slightly differently: if both the new and old values are lists, it concatenates their contents to form a new list.
- LaTeX reader:
- Set `bibliography` in metadata from `\bibliography` or `\addbibresource` command.
- Don't error on `%foo` with no trailing newline.
- Org reader:
- Support code block headers (`#+BEGIN_SRC ...`) (Albert Krewinkel).
- Fix parsing of blank lines within blocks (Albert Krewinkel).
- Support pandoc citation extension (Albert Krewinkel). This can be turned off by specifying `org-citation` as the input format.
- Markdown reader: `citeKey` moved to `Text.Pandoc.Parsing` so it can be used by other readers (Albert Krewinkel).
- `Text.Pandoc.Parsing`:
- Added `citeKey` (see above).
- Added `HasLastStrPosition` type class and `updateLastStrPos` and `notAfterString` functions.
functions. - Updated copyright notices (Albert Krewinkel).
- Added default.icml to data files so it installs with the package.
- OSX package:
- The binary is now built with options to ensure that it can be used with OSX 10.6+.
- Moved OSX package materials to osx directory.
- Added OSX package uninstall script, included in the zip container (thanks to Daniel T. Staal).
pandoc 1.12.4
- Made it possible to run filters that aren't executable (#1096). Pandoc first tries to find the executable (searching the path if path isn't given). If it fails, but the file exists and has a `.py`, `.pl`, `.rb`, `.hs`, or `.php` extension, pandoc runs the filter using the appropriate interpreter. This should make it easier to use filters on Windows, and make it more convenient for everyone.
- Added Emacs org-mode reader (Albert Krewinkel).
- Added InDesign ICML Writer (mb21).
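The fallback for non-executable filters described in the first item of this release can be pictured roughly as below. The release note lists the extensions but not the interpreter commands, so the table of command names is an assumption, as are the function name `filter_command` and its two boolean arguments.

```python
import posixpath

# Assumed mapping from extension to interpreter command; only the
# extensions themselves come from the release note.
INTERPRETERS = {
    ".py": "python",
    ".pl": "perl",
    ".rb": "ruby",
    ".hs": "runhaskell",
    ".php": "php",
}


def filter_command(path, executable_found, file_exists):
    """Return the argv list used to start the filter."""
    if executable_found:
        return [path]  # run the executable directly
    extension = posixpath.splitext(path)[1]
    if file_exists and extension in INTERPRETERS:
        # Not executable, but a known script type: use its interpreter.
        return [INTERPRETERS[extension], path]
    raise FileNotFoundError("could not run filter " + path)
```

A non-executable `caps.py` would then be started through the Python interpreter, while a file with an unlisted extension still fails.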
- MediaWiki reader:
- LaTeX reader:
- Give better location information on errors, pointing to line numbers within included files (#1274).
- Better handling of `table` environment (#1204). Positioning options no longer rendered verbatim.
- Better handling of figure and table with caption (#1204).
- Handle `@{}` and `p{length}` in tabular. The length is not actually recorded, but at least we get a table (#1180).
- Properly handle `\nocite`. It now adds a `nocite` metadata field. Citations there will appear in the bibliography but not in the text (unless you explicitly put a `$nocite$` variable in your template).
- Markdown reader:
- Ensure that whole numbers in YAML metadata are rendered without decimal points. (This became necessary with changes to aeson and yaml libraries. aeson >= 0.7 and yaml >= 0.8.8.2 are now required.)
- Fixed regression on line breaks in strict mode (#1203).
- Small efficiency improvements.
- Improved parsing of nested `div`s. Formerly a closing `div` tag would be missed if it came right after other block-level tags.
- Avoid backtracking when closing `</div>` not found.
- Fixed bug in reference link parsing in `markdown_mmd`.
- Fixed a bug in list parsing (#1154). When reading a raw list item, we now strip off up to 4 spaces.
- Fixed parsing of empty reference link definitions (#1186).
- Made one-column pipe tables work (#1218).
- Textile reader:
- Better support for attributes. Instead of being ignored, attributes are now parsed and included in Span inlines. The output will be a bit different from stock textile: e.g. for `*(foo)hi*`, we'll get `<em><span class="foo">hi</span></em>` instead of `<em class="foo">hi</em>`. But at least the data is not lost.
- Improved treatment of HTML spans (%) (#1115).
- Improved link parsing. In particular we now pick up on attributes. Since pandoc links can't have attributes, we enclose the whole link in a span if there are attributes (#1008).
- Implemented correct parsing rules for inline markup (#1175, Matthew Pickering).
- Use Builder (Matthew Pickering).
- DocBook reader:
- Better treatment of `formalpara`. We now emit the title (if present) as a separate paragraph with boldface text (#1215).
- Set metadata `author` not `authors`.
- Added recognition of `authorgroup` and `releaseinfo` elements (#1214, Matthew Pickering).
- Converted current meta information parsing in DocBook to a more extensible version which is aware of the more recent meta representation (Matthew Pickering).
- HTML reader:
- Require tagsoup 0.13.1, to fix a bug with parsing of script tags (#1248).
- Treat processing instructions & declarations as block. Previously these were treated as inline, and included in paragraph tags in HTML or DocBook output, which is generally not what is wanted (#1233).
- Updated `closes` with rules from HTML5 spec.
- Use Builder (Matthew Pickering, #1162).
- RST reader:
- Remove duplicate `http` in PEP links (Albert Krewinkel).
- Make rst figures true figures (#1168, CasperVector)
- Enhanced Pandoc's support for rST roles (Merijn Verstraaten). rST parser now supports: all built-in rST roles, new role definition, role inheritance, though with some limitations.
- Use `author` rather than `authors` in metadata.
- Better handling of directives. We now correctly handle field lists that are indented more than three spaces. We treat an `aafig` directive as a code block with attributes, so it can be processed in a filter (#1212).
- LaTeX writer:
- Mark span contents with label if span has an ID (Albert Krewinkel).
- Made `--toc-depth` work well with books in latex/pdf output (#1210).
- Handle line breaks in simple table cells (#1217).
- Workaround for level 4-5 headers in quotes. These previously produced invalid LaTeX: `\paragraph` or `\subparagraph` in a `quote` environment. This adds an `\mbox{}` in these contexts to work around the problem. See http://tex.stackexchange.com/a/169833/22451 (#1221).
- Use `\/` to avoid en-dash ligature instead of `-{}-` (Vaclav Zeman). This is to fix LuaLaTeX output. The `-{}-` sequence does not avoid the ligature with LuaLaTeX but `\/` does.
- Fixed string escaping in `hyperref` and `hyperdef` (#1130).
- ConTeXt writer: Improved autolinks (#1270).
- DocBook writer:
- Improve handling of hard line breaks in DocBook writer (Neil Mayhew). Use a `<literallayout>` for the entire paragraph, not just for the newline character.
- Don't let line breaks inside footnotes influence the enclosing paragraph (Neil Mayhew).
- Distinguish tight and loose lists in DocBook output, using `spacing="compact"` (Neil Mayhew, #1250).
- Docx writer: When needed files are not present in the user's `reference.docx`, fall back on the versions in the `reference.docx` in pandoc's data files. This fixes a bug that occurs when a `reference.docx` saved by LibreOffice is used. (#1185)
- EPUB writer:
- Include extension in epub ids. This fixes a problem with duplicate ids for fonts and images with the same base name but different extensions (#1254).
- Handle files linked in raw `img` tags (#1170).
- Handle media in `audio` source tags (#1170). Note that we now use a `media` directory rather than `images`.
- Incorporate files linked in `video` tags (#1170). `src` and `poster` will both be incorporated into `content.opf` and the epub container.
- HTML writer:
- Add colgroup around col tags (#877). Also affects EPUB writer.
- Fixed bug with unnumbered section headings. Unnumbered section headings (with class `unnumbered`) were getting numbers.
- Improved detection of image links. Previously image links with queries were not recognized, causing `<embed>` to be used instead of `<img>`.
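The image-link detection item above comes down to ignoring the query string when looking at the file extension. A minimal sketch of that idea follows; the set of extensions and the name `is_image_link` are assumptions for illustration, not pandoc's actual list or API.

```python
import posixpath
from urllib.parse import urlsplit

# Assumed set of image extensions, for illustration only.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".svg"}


def is_image_link(url: str) -> bool:
    """Decide from the path alone, so a trailing ?query cannot hide
    (or fake) the file extension."""
    path = urlsplit(url).path
    extension = posixpath.splitext(path)[1].lower()
    return extension in IMAGE_EXTENSIONS
```

With this approach `pic.png?size=2` is still recognized as an image, and a query that merely ends in `.png` does not turn a video link into one.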
- Man writer: Ensure that terms in definition lists aren't line wrapped (#1195).
- Markdown writer:
- Use proper escapes to avoid unwanted lists (#980). Previously we used 0-width spaces, an ugly hack.
- Use longer backtick fences if needed (#1206). If the content contains a backtick fence and there are attributes, make sure longer fences are used to delimit the code. Note: This works well in pandoc, but github markdown is more limited, and will interpret the first string of three or more backticks as ending the code block.
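The "longer backtick fences" rule above can be sketched as: find the longest run of backticks in the code and use a fence one backtick longer, with a minimum of three. The sketch simplifies the release note in one respect, which is an assumption of this example: it lengthens the fence for any code block, not only for blocks with attributes. The function names are hypothetical.

```python
import re


def fence_for(code: str) -> str:
    """Return a backtick fence longer than any backtick run in code."""
    runs = re.findall(r"`+", code)
    longest = max((len(run) for run in runs), default=0)
    return "`" * max(3, longest + 1)


def fenced_block(code: str, attrs: str = "") -> str:
    """Wrap code in a fence that cannot be closed by its own content."""
    fence = fence_for(code)
    return fence + attrs + "\n" + code + "\n" + fence
```

Code without backticks keeps the usual three-backtick fence; code that itself contains a three-backtick line gets a four-backtick fence.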
- RST writer: Avoid stack overflow with certain tables (#1197).
- RTF writer: Fixed table cells containing paragraphs.
- Custom writer:
- AsciiDoc writer: Handle multiblock and empty table cells (#1245, #1246). Added tests.
- `Text.Pandoc.Options`: Added `readerTrace` to `ReaderOptions`.
- `Text.Pandoc.Shared`:
- Added `compactify'DL` (formerly in markdown reader) (Albert Krewinkel).
- Fixed bug in `toRomanNumeral`: numbers ending with '9' would be rendered as Roman numerals ending with 'IXIV' (#1249). Thanks to Jesse Rosenthal.
- `openURL`: set proxy with value of `http_proxy` env variable (#1211). Note: proxies with non-root paths are not supported, due to limitations in `http-conduit`.
- `Text.Pandoc.PDF`:
- Ensure that temp directories are deleted on Windows (#1192). The PDF is now read as a strict bytestring, ensuring that process ownership will be terminated, so the temp directory can be deleted.
- Use `/` as path separators in a few places, even on Windows. This seems to be necessary for texlive (#1151, thanks to Tim Lin).
- Use `;` for `TEXINPUTS` separator on Windows (#1151).
- Changes to error reporting, to handle non-UTF8 error output.
- `Text.Pandoc.Templates`:
- Removed unneeded datatype context (Merijn Verstraaten).
- YAML objects resolve to "true" in conditionals (#1133). Note: If `address` is a YAML object and you just have `$address$` in your template, the word `true` will appear, which may be unexpected. (Previously nothing would appear.)
- `Text.Pandoc.SelfContained`: Handle `poster` attribute in `video` tags (#1188).
- `Text.Pandoc.Parsing`:
- Made `F` an instance of Applicative (#1138).
- Added `stateCaption`.
- Added `HasMacros`, simplified other typeclasses. Removed `updateHeaderMap`, `setHeaderMap`, `getHeaderMap`, `updateIdentifierList`, `setIdentifierList`, `getIdentifierList`.
- Changed the smart punctuation parser to return `Inlines` rather than `Inline` (Matthew Pickering).
- Changed `HasReaderOptions`, `HasHeaderMap`, `HasIdentifierList` from typeclasses of monads to typeclasses of states. This simplifies the instance definitions and provides more flexibility. Generalized type of `getOption` and added a default definition. Removed `askReaderOption`. Added `extractReaderOption`. Added `extractHeaderMap` and `updateHeaderMap` in `HasHeaderMap`.
...
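The Markdown writer note above about longer backtick fences can be illustrated with a small Python sketch. This is not pandoc's implementation (pandoc is written in Haskell). The function names `fence_for` and `fenced_block` are hypothetical, and the rule used here (one backtick more than the longest run found anywhere in the content, with a minimum of three) is a simplifying assumption.

```python
import re


def fence_for(code: str, minimum: int = 3) -> str:
    """Return a backtick fence longer than any backtick run inside `code`.

    Assumption: any run of backticks counts, not only runs that start a line.
    """
    runs = re.findall(r"`+", code)
    longest = max((len(run) for run in runs), default=0)
    return "`" * max(minimum, longest + 1)


def fenced_block(code: str, attrs: str = "") -> str:
    """Wrap `code` in a fence that its own content cannot close early."""
    fence = fence_for(code)
    opener = f"{fence} {attrs}".rstrip()
    return f"{opener}\n{code}\n{fence}"


if __name__ == "__main__":
    inner = "`" * 3 + "\nnested\n" + "`" * 3
    # The outer fence is four backticks because the content contains a run of three.
    print(fenced_block(inner, "{.haskell}"))
```

As the note says, a renderer that treats the first run of three or more backticks as the end of the block will still cut such output short.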
pandoc 1.12.3
- The `--bibliography` option now sets the `biblio-files` variable. So, if you're using `--natbib` or `--biblatex`, you can just use `--bibliography=foo.bib` instead of `-V bibliofiles=foo`.
- Don't run pandoc-citeproc filter if `--bibliography` is used together with `--natbib` or `--biblatex` (Florian Eitel).
- Template changes:
- Updated beamer template to include booktabs.
- Added `abstract` variable to LaTeX template.
- Put `header-includes` after `title` in LaTeX template (#908).
- Allow use of `\includegraphics[size]` in beamer. This just required porting a macro definition from the default LaTeX template to the default beamer template.
- `reference.docx`: Include `FootnoteText` style. Otherwise Word ignores the style, even when specified in the `pPr`. (#901)
- `reference.odt`: Tidied `styles.xml`.
- Relaxed version bounds for dependencies.
- Added `withSocketsDo` around http conduit code in `openURL`, so it works on Windows (#1080).
- Added `Cite` function to `sample.lua`.
- Markdown reader:
- Fixed regression in title blocks (#1089). If author field was empty, date was being ignored.
- Allow backslash-newline hard line breaks in grid and multiline table cells.
- Citation keys may now start with underscores, and may contain underscores adjacent to internal punctuation.
- LaTeX reader:
- Add support for `Verb` macro (jrnold) (#1090).
- Support babel-style quoting: `` "`..."' ``.
- Properly handle script blocks in strict mode. (That is, `markdown-markdown_in_html_blocks`.) Previously a spurious `<p>` tag was being added (#1093).
- Docbook reader: Avoid failure if `tbody` contains no `tr` or `row` elements.
- LaTeX writer:
- Factored out function for table cell creation.
- Better treatment of footnotes in tables. Notes now appear in the regular sequence, rather than in the table cell. (This was a regression in 1.10.)
- HTML reader: Parse name/content pairs from meta tags as metadata. Closes #1106.
- Moved `fixDisplayMath` from Docx writer to `Writer.Shared`.
- OpenDocument writer: Fixed `RawInline`, `RawBlock` so they don't escape.
- ODT writer: Use mathml for proper rendering of formulas. Note: LibreOffice's support for this seems a bit buggy. But it should be better than what we had before.
- RST writer: Ensure no blank line after def in definition list (#992).
- Markdown writer: Don't use tilde code blocks with braced attributes in `markdown_github` output. A consequence of this change is that the backtick form will be preferred in general if both are enabled. That is good, as it is much more widespread than the tilde form. (#1084)
- Docx writer: Fixed problem with some modified reference docx files. Include `word/_rels/settings.xml.rels` if it exists, as well as other `rels` files besides the ones pandoc generates explicitly.
- HTML writer:
- With `--toc`, headers no longer link to themselves (#1081).
- Omit footnotes from TOC entries. Otherwise we get doubled footnotes when headers have notes!
- EPUB writer:
- Avoid duplicate notes when headings contain notes. This arose because the headings are copied into the metadata "title" field, and the note gets rendered twice. We strip the note now before putting the heading in "title".
- Strip out footnotes from toc entries.
- Fixed bug with `--epub-stylesheet`. Now the contents of `writerEpubStylesheet` (set by `--epub-stylesheet`) should again work, and take precedence over a stylesheet specified in the metadata.
- `Text.Pandoc.Pretty`: Added `nestle`. API change.
- `Text.Pandoc.MIME`: Added `wmf`, `emf`.
- `Text.Pandoc.Shared`: `fetchItem` now handles image URLs beginning with `//`.
- `Text.Pandoc.ImageSize`: Parse EXIF format JPEGs. Previously we could only get size information for JFIF format, which led to squished images in Word documents. Closes #976.
- Removed old `MarkdownTest_1.0.3` directory (#1104).
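The `fetchItem` note above concerns image URLs that begin with `//` (scheme-relative URLs). One way such a URL could be made fetchable is sketched below in Python. The helper name `absolutize` and the choice of `http` as the default scheme are assumptions made for illustration, not a description of pandoc's Haskell code.

```python
def absolutize(url: str, default_scheme: str = "http") -> str:
    """Give a scheme-relative URL (one starting with '//') an explicit scheme.

    Assumption: `default_scheme` is chosen by the caller; other URLs and
    relative paths are returned unchanged.
    """
    if url.startswith("//"):
        return f"{default_scheme}:{url}"
    return url


if __name__ == "__main__":
    print(absolutize("//example.com/img/logo.png"))
    print(absolutize("img/logo.png"))
```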
pandoc 1.12.2.1
- Markdown reader: Fixed regression in list parser, involving continuation lines containing raw HTML (or even verbatim raw HTML).
pandoc 1.12.2
- Much improved citation support.
- Metadata may now be included in YAML blocks in a markdown document. For example,

    ---
    title:
    - type: main
      text: My Book
    - type: subtitle
      text: An investigation of metadata
    creator:
    - role: author
      text: John Smith
    - role: editor
      text: Sarah Jones
    identifier:
    - scheme: DOI
      text: doi:10.234234.234/33
    publisher: My Press
    rights: (c) 2007 John Smith, CC BY-NC
    cover-image: img/mypic.jpg
    ...

Metadata may still be provided using `--epub-metadata`; it will be merged with the metadata in YAML blocks.
- EPUB writer:
- `meta` tags are now used instead of `opf` attributes for EPUB3.
- Insert "svg" property as needed in opf (EPUB 3).
- Simplify `imageTypeOf` using `getMimeType`.
- Add properties attribute to `cover-image` item for EPUB 3.
- Don't include node for `cover.xhtml` if no cover!
- Ensure that same identifier is used throughout (#1044). If an identifier is given in metadata, we use that; otherwise we generate a random uuid.
- Add cover reference to guide element (EPUB 2) (Shaun Attfield). Fixes an issue with Calibre putting the cover at the end of the book if the spine has `linear="no"`. Apparently this is best practice for other converters as well: http://www.idpf.org/epub/20/spec/OPF_2.0.1_draft.htm#Section2.6.
- Allow `stylesheet` in metadata. The value is a path to the stylesheet.
- Allow partial dates: `YYYY`, `YYYY-MM`.
- Markdown writer: Fix rendering of tight sublists (#1050). Previously a spurious blank line was included after a tight sublist.
- ODT writer: Add `draw:name` attribute to `draw:frame` elements (#1069). This is reported to be necessary to avoid an error from recent versions of LibreOffice when files contain more than one image. Thanks to wmanley for reporting and diagnosing the problem.
- ConTeXt writer: Don't hardcode figure/table placement and numbering. Instead, let this be set in the template, using `\setupfloat`. Thanks to on4aa and Aditya Mahajan for the suggestion (#1067).
- Implemented CSL flipflopping spans in DOCX, LaTeX, and HTML writers.
- Fixed bug with markdown intraword emphasis. Closes #1066.
- Docbook writer: Hierarchicalize block content in metadata. Previously headers just disappeared from block-level metadata when it was used in templates. Now we apply the 'hierarchicalize' transformation. Note that a block headed by a level-2 header will turn into a `<sect1>` element.
- OpenDocument writer: Skip raw HTML (#1035). Previously it was erroneously included as verbatim text.
- HTML/EPUB writer, footnotes: Put `<sup>` tag inside `<a>` tags. This allows better control of formatting, since the `<a>` tags have a distinguishing class (#1049).
- Docx writer:
- Use mime type info returned by fetchItem.
- Fixed core metadata (#1046). Don't create empty date nodes if no date given. Don't create multiple `dc:creator` nodes; instead separate by semicolons.
- Fix URL for core-properties in `_rels/.rels` (#1046).
- Plain writer: don't print `<span>` tags.
- LaTeX writer:
- Fix definition lists with internal links in terms (#1032). This fix puts braces around a term that contains an internal link, to avoid problems with square brackets.
- Properly escape pdftitle, pdfauthor (#1059).
- Use booktabs package for tables (thanks to Jose Luis Duran).
- Updated beamer template. Now references should work properly (in a slide) when `--biblatex` or `--natbib` is used.
- LaTeX reader:
- Parse contents of curly quotes or matched `"` as quotes.
- Support `\textnormal` as span with class `nodecor`. This is needed for pandoc-citeproc.
- Improved citation parsing. This fixes a run-time error that occurred with `\citet{}` (empty list of keys). It also ensures that empty keys don't get produced.
- MediaWiki reader: Add automatic header identifiers.
- HTML reader:
- Use pandoc `Div` and `Span` for raw `<div>`, `<span>` when `--parse-raw`.
- Recognize `svg` tags as block level content (thanks to MinRK).
tags as block level content (thanks to MinRK). - Parse LaTeX math if appropriate options are set.
- Markdown reader:
- YAML block must start immediately after `---`. If there's a blank line after `---`, we interpret it as a horizontal rule.
- Correctly handle empty bullet list items.
- Stop parsing "list lines" when we hit a block tag. This fixes exponential slowdown in certain input, e.g. a series of lists followed by `</div>`.
- Slides: Preserve `<div class="references">` in references slide.
- `Text.Pandoc.Writer.Shared`:
- Fixed bug in `tagWithAttrs`. A space was omitted before key-value attributes, leading to invalid HTML.
- `normalizeDate`: Allow dates with year only (thanks to Shaun Attfield).
- Fixed bug in `openURL` with `data:` URIs. Previously the base-64 encoded bytestring was returned. We now decode it so it's a proper image!
- DocBook reader: Handle numerical attributes starting with decimal. Also use `safeRead` instead of `read`.
- `Text.Pandoc.Parsing`:
- Generalized type of `registerHeader`, using new type classes `HasReaderOptions`, `HasIdentifierList`, `HasHeaderMap`. These allow certain common functions to be reused even in parsers that use custom state (instead of `ParserState`), such as the MediaWiki reader.
), such as the MediaWiki reader. - Moved inlineMath, displayMath from Markdown reader to Parsing. Generalize their types and export them from Parsing. (API change.)
- `Text.Pandoc.Readers.TexMath`: Export `readTeXMath'`, which attends to display/inline. Deprecate `readTeXMath`, and use `readTeXMath'` in all the writers. Require `texmath >= 0.6.5.2`.
- `Text.Pandoc.MIME`:
- Add entry for `jfif`.
- In looking up extensions, drop the encoding info. E.g. for 'image/jpg;base64' we should look up 'image/jpg'.
- Templates: Changed how array variables are resolved. Previously if `foo` was an array (which might be because multiple values were set on the command line), `$foo$` would resolve to the concatenation of the elements of `foo`. This is rarely useful behavior. It has been changed so that the first value is rendered. Of course, you can still iterate over the values using `$for(foo)$`. This has the result that you can override earlier settings using `-V` by putting new values later on the command line, which is useful for many purposes.
- `Text.Pandoc`: Don't default to `pandocExtensions` for all writers.
- Allow "epub2" as synonym for "epub", "html4" for "html".
- Don't look for slidy files in data files with `--self-contained`.
- Allow `https:` command line arguments to be downloaded.
- Fixed `make_osx_package.sh` so data files are embedded in `pandoc-citeproc`.
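Two notes in this release are related: `openURL` now decodes base-64 `data:` URIs, and MIME lookups drop encoding info such as `;base64`. The Python sketch below illustrates both ideas. It is not pandoc's code: the names `parse_data_uri`, `extension_for`, and the small `MIME_EXTENSIONS` table are hypothetical and exist only for this example.

```python
import base64
from urllib.parse import unquote_to_bytes

# Illustrative subset only; a real MIME table is much larger.
MIME_EXTENSIONS = {"image/jpeg": "jpg", "image/jpg": "jpg", "image/png": "png"}


def parse_data_uri(uri: str):
    """Split a data: URI into (mime_type, decoded_bytes)."""
    if not uri.startswith("data:"):
        raise ValueError("not a data: URI")
    header, sep, payload = uri[len("data:"):].partition(",")
    if not sep:
        raise ValueError("data: URI has no comma separator")
    # Everything after the first ';' is parameter or encoding info.
    mime_type, *params = header.split(";")
    if "base64" in params:
        data = base64.b64decode(payload)  # return real bytes, not base-64 text
    else:
        data = unquote_to_bytes(payload)  # percent-encoded payload
    return mime_type or "text/plain", data


def extension_for(mime: str):
    """Look up an extension after dropping encoding info such as ';base64'."""
    bare = mime.split(";", 1)[0].strip().lower()
    return MIME_EXTENSIONS.get(bare)


if __name__ == "__main__":
    mime, data = parse_data_uri("data:image/jpg;base64,aGVsbG8=")
    print(mime, data, extension_for("image/jpg;base64"))
```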
Pandoc 1.12.1
This release fixes a few bugs (details in the release notes) and considerably improves citation handling. The new pandoc-citeproc (0.1.2.1) can read bibtex and biblatex files directly (not using bibutils to translate them, as before). Though this is still a work in progress, the results are already significantly better. LaTeX constructs in bibtex fields are translated properly, and simple math gets converted.
Three changes of note:
- The default JSON serialization format has changed. Instead of `{"Str": "foo"}`, for example, we now have `{"t": "Str", "c": "foo"}` ("t" for tag, "c" for contents). This new format is easier to work with outside of Haskell. This change should only affect people who are interacting with pandoc's JSON using languages other than Haskell, since in Haskell the JSON conversions can be handled automatically by the aeson library. Those who use the Python library pandocfilters for filters should upgrade to version 1.2, which has already been updated to use the new format.
- Pandoc's data files no longer include the JavaScript, CSS, and images for S5, slidy, and slideous slide formats. If you wish to produce S5 or slideous slides with the `--self-contained` option, you'll need to download the appropriate code into the `s5` or `slideous` directories, respectively, as with `revealjs`. (This is what the User's Guide has said to do for some time.) The default for `slidy` is to embed a link to the code on the slidy website, so nothing should change for slidy users who are using the default template.
- You can now create "speaker notes" in slide formats, by putting them inside `<div class="notes">` tags. (Note that you may need to leave a blank line before the closing `</div>` tag in some contexts.) Currently these are supported in beamer (where the notes go to `\note{...}`) and revealjs (where they turn into `<aside class="notes">`); in other formats the speaker notes are just ignored.
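The JSON change described in the first item above can be illustrated with a short Python sketch that rewrites the old single-key shape into the new tagged shape. The function name `to_tagged` is hypothetical. The sketch handles only the one-key object form quoted in the notes and leaves every other value untouched, so it should not be read as a complete converter between the two formats.

```python
def to_tagged(node):
    """Rewrite {"Str": "foo"} style nodes as {"t": "Str", "c": "foo"}.

    Assumption: every one-key object is a tagged constructor. Values of any
    other shape (strings, numbers, multi-key objects) are returned as they are.
    """
    if isinstance(node, list):
        return [to_tagged(item) for item in node]
    if isinstance(node, dict) and len(node) == 1:
        ((tag, contents),) = node.items()
        return {"t": tag, "c": to_tagged(contents)}
    return node


if __name__ == "__main__":
    print(to_tagged({"Str": "foo"}))
```

With the tag stored under a fixed key, a filter written in another language can dispatch on `node["t"]` without first inspecting which key an object happens to have.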
Added 2013-10-21: The OSX package installer that was originally attached to this release was defective. Please download the new installer below, `pandoc-1.12.1-1.dmg`.