Releases: jgm/pandoc

pandoc 1.13.2

20 Dec 08:43
@jgm jgm

This is mainly a spit-and-polish release, though there is one new reader and some minor new features. Note that, for the first time, we are providing a Linux binary (64-bit Debian/Ubuntu).

  • TWiki Reader: add new twiki reader (API change, Alexander Sulfrian).
  • Markdown reader:
  • Better handling of paragraph in div (#1591). Previously text that ended a div would be parsed as Plain unless there was a blank line before the closing div tag.
  • Don't treat a citation as a reference link label (#1763).
  • Fix autolinks with following punctuation (#1811). The price of this is that autolinked bare URIs can no longer contain > characters, but this is not a big issue.
  • Fix Ext_lists_without_preceding_blankline bug (#1636, Artyom).
  • Allow startnum to work without fancy_lists. Formerly pandoc -f markdown-fancy_lists+startnum did not work properly.
  • RST reader (all Daniel Bergey):
  • Parse quoted literal blocks (#65). RST quoted literal blocks are the same as indented literal blocks (which pandoc already supports) except that the quote character is preserved in each line.
  • Parse RST class directives. The class directive accepts one or more class names, and creates a Div value with those classes. If the directive has an indented body, the body is parsed as the children of the Div. If not, the first block following the directive is made a child of the Div. This differs from the behavior of rst2xml, which does not create a Div element. Instead, the specified classes are applied to each child of the directive. However, most Pandoc Block constructors do not take an Attr argument, so we can't duplicate this behavior.
  • Warn about skipped directives.
  • Literal role now produces Code. Code role should have "code" class.
  • Improved support for custom roles
    - Added sourceCode to classes for :code: role, and anything inheriting from it.
    - Add the name of the custom role to classes if the Inline constructor supports Attr.
    - If the custom role directive does not specify a parent role, inherit from the :span: role.

This differs somewhat from the rst2xml.py behavior. If a custom role inherits from another custom role, Pandoc will attach both roles' names as classes. rst2xml.py will only use the class of the directly invoked role (though in the case of inheriting from a :code: role with a :language: defined, it will also provide the inherited language as a class).

  • Warn about ignored fields in role directives.
  • LaTeX reader:
  • Parse label after caption into a span instead of inserting an additional paragraph of bracketed text (#1747).
  • Parse math environments as inline when possible (#1821).
  • Better handling of \noindent and \greektext (#1783).
  • Handle \texorpdfstring more gracefully.
  • Handle \cref and \sep (Wikiwide).
  • Support \smartcite and \Smartcite from biblatex.
  • HTML reader:
  • Retain display type of MathML output (#1719, Matthew Pickering).
  • Recognise <br> tags inside <pre> blocks (#1620, Matthew Pickering).
  • Make embed tag either block or inline (#1756).
  • DocBook reader:
  • Handle keycombo, keycap (#1815).
  • Get string content in inner tags for literal elements (#1816).
  • Handle menuchoice elements better, with a > between (#1817).
  • Include id on section headers (#1818).
  • Document/test "type" as implemented (Brian O'Sullivan).
  • Add support for calloutlist and callout (Brian O'Sullivan). We treat a calloutlist as a bulleted list. This works well in practice.
  • Add support for classname (Bryan O'Sullivan).
  • Docx reader:
  • Fix Windows path for image lookup (Jesse Rosenthal). Don't use os-sensitive "combine", since we always want the paths in our zip-archive to use forward-slashes.
  • Single-item headers in ordered lists are headers (Jesse Rosenthal). When users number their headers, Word understands that as a single item enumerated list. We make the assumption that such a list is, in fact, a header.
  • Rewrite rewriteLink to work with new headers (Jesse Rosenthal). There could be new top-level headers after making lists, so we have to rewrite links after that.
  • Use polyglot header list (Jesse Rosenthal). We're just keeping a list of header formats that different languages use as their default styles. At the moment, we have English, German, Danish, and French. We can continue to add to this. This is simpler than parsing the styles file, and perhaps less error-prone, since there seems to be some variations, even within a language, of how a style file will define headers.
  • Remove header class properly in other langs (Jesse Rosenthal). When we encounter one of the polyglot header styles, we want to remove that from the par styles after we convert to a header. To do that, we have to keep track of the style name, and remove it appropriately.
  • Account for external link URLs with anchors. Previously, if a URL had an anchor, the reader would incorrectly identify it as an internal link and return only the anchor as URL. (Caleb McDaniel)
  • Fix for Issue #1692 (i18n styles) (Nikolay Yakimov).
  • Org reader:
  • Added state changing blanklines (Jesse Rosenthal). This allows us to emphasize at the beginning of a new paragraph (or, in general, after blank lines).
  • Fixed bug with bulleted lists:
    - a
    - b
    * c

was being parsed as a list, even though an unindented * should make a heading. See http://orgmode.org/manual/Plain-lists.html#fn-1.

  • Org reader: absolute, relative paths in link (#1741, Albert Krewinkel). The org reader was too restrictive when parsing links; some relative links and links to files given as absolute paths were not recognized correctly.
  • Org reader: allow empty links (jgm/gitit#471, Albert Krewinkel). This is important for use in gitit, which uses empty links for wikilinks.
  • Respect indent when parsing Org bullet lists (#1650, Timothy Humphries). Fixes issue with top-level bullet list parsing.
  • Fix indent issue for definition lists (Timothy Humphries, see #1650, #1698, #1680).
  • Parse multi-inline terms correctly in definition list (#1649, Matthew Pickering).
  • Fix rules for emphasis recognition (Albert Krewinkel). Things like /hello,/ or /hi'/ were falsely recognized as emphasised strings. This is wrong, as , and ' are forbidden border chars and may not occur on the inner border of emphasized text.
  • Drop COMMENT document trees (Albert Krewinkel). Document trees under a header starting with the word COMMENT are comment trees and should not be exported. Those trees are dropped silently (#1678).
  • Properly handle links to file:target (Albert Krewinkel). Org links like [[file:target][title]] were not handled correctly, parsing the link target verbatim. The org reader is changed such that the leading file: is dropped from the link target (see #756, #1812).
  • Parse LaTeX-style MathML entities (#1657, Albert Krewinkel). Org supports special symbols which can be included using LaTeX syntax, but are actually MathML entities. Examples for this are \nbsp (non-breaking space), \Aacute (the letter A with accent acute) or \copy (the copyright sign ©).
  • EPUB reader:
  • URI handling improvements. Now we outsource most of the work to fetchItem'. Also, do not include queries in file extensions (#1671).
  • LaTeX writer:
  • Use \texorpdfstring for section captions when needed (Vaclav Zeman).
  • Handle consecutive linebreaks (#1733).
  • Protect graphics in headers (Jesse Rosenthal). Graphics in \section/\subsection etc titles need to be \protected.
  • Put ~ before header in list item text (Jesse Rosenthal). Because of the built-in line skip, LaTeX can't handle a section header as the first element in a list item.
  • Avoid using reserved characters as \lstinline delimiters (#1595).
  • Better handling of display math in simple tables (#1754). We convert display math to inline math in simple tables, since LaTeX can't deal with display math in simple tables.
  • Escape spaces in code (#1694, Bjorn Buckwalter).
  • MediaWiki writer:
  • Fixed links with URL = text. Previously these were rendered as bare words, even if the URL was not an absolute URL (#1825).
  • ICML writer:
  • Don't force all citations into footnotes.
  • RTF writer:
  • Add blankline at end of output (#1732, Matthew Pickering).
  • RST writer:
  • Ensure blank line after figure.
  • Avoid excess whitespace after last list item (#1777).
  • Wrap line blocks with spaces before continuations (#1656).
  • Fixed double-rendering of footnotes in RST tables (#1769).
  • DokuWiki writer:
  • Better handling of block quotes. This change ensures that multiple paragraph blockquotes are rendered using native > rather than as HTML (#1738).
  • Fix external images (#1739). Preface relative links with ":", absolute URIs without. (Timothy Humphries)
  • HTML writer:
  • Use protocol-relative URL for mathjax.
  • Put newline between img and caption paragraph.
  • MathML is now output with a TeX annotation (#1635, Matthew Pickering).
  • Add support for KaTeX HTML math (#1626, Matthew Pickering). This adds KaTeX to HTMLMathMethod (API change).
  • Don't double render when email-obfuscation=none (#1625, Matthew Pickering).
  • Make header attributes work outside top level (#1711). Previously they only appeared on top level header elements. Now they work e.g. in blockquotes.
  • ODT writer:
  • Correctly handle images without extensions (#1729).
  • Strip querystring in ODT write (#1682, Todd Sifleet).
  • FB2 writer:
  • Add newline to output.
  • EPUB writer:
  • Don't add sourceURL to absolute URIs (#1669).
  • Don't use unsupported opf:title-type for epub2.
  • Include "landmarks" section in nav document for epub3 (#1757).
  • Removed playOrder from navpoint elements in ncx ...
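
Several entries above (EPUB reader, ODT writer) concern file extensions taken from URLs that carry a query string. The idea can be sketched in Python as follows. This is an illustration only, not pandoc's Haskell code, and `extension_of` is a hypothetical helper name.

```python
import posixpath
from urllib.parse import urlsplit

def extension_of(url):
    """Return the extension of a URL's path, ignoring ?query and #fragment."""
    path = urlsplit(url).path            # the query string and fragment are split off
    return posixpath.splitext(path)[1]   # "" when the path has no extension

# Splitting the raw string on its last "." would instead give ".png?size=large".
print(extension_of("images/cover.png?size=large"))
print(extension_of("http://example.com/fig.svg#layer1"))
```

Working on the parsed path rather than the raw string is what keeps the query out of the extension.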

pandoc 1.13.1

31 Aug 06:23
@jgm jgm
  • Fixed --self-contained with Windows paths (#1558). Previously C:\foo.js was being wrongly interpreted as a URI.
  • HTML reader: improved handling of tags that can be block or inline. Previously a section like this would be enclosed in a paragraph, with RawInline for the video tags (since video is a tag that can be either block or inline):
<video controls="controls">
   <source src="../videos/test.mp4" type="video/mp4" />
   <source src="../videos/test.webm" type="video/webm" />
   <p>
      The videos can not be played back on your system.<br/>
      Try viewing on Youtube (requires Internet connection):
      <a href="http://youtu.be/etE5urBps_w">Relative Velocity on
Youtube</a>.
   </p>
</video>

This change will cause the video and source tags to be parsed as RawBlock instead, giving better output. The general change is this: when we're parsing a "plain" sequence of inlines, we don't parse anything that COULD be a block-level tag.

  • Docx reader:
  • Be sensitive to user styles. Note that "Hyperlink" is "blacklisted," as we don't want the default underline styling to be inherited by all links by default (Jesse Rosenthal).
  • Read single paragraph in table cell as Plain (Jesse Rosenthal). This makes the docx reader's native output fit with the way the markdown reader understands its markdown output.
  • Textile writer: Extended the range of cases where native textile tables will be used (as opposed to raw HTML): we now handle any alignment type, but only for simple tables with no captions.
  • Txt2Tags reader:
  • Header is now parsed only if standalone flag is set (Matthew Pickering).
  • The header is now parsed as meta information. The first line is the title, the second is the author and third line is the date (Matthew Pickering).
  • Corrected formatting of %%mtime macro (Matthew Pickering).
  • Fixed crash when reading from stdin.
  • EPUB writer: Don't use page-progression-direction in EPUB2, which doesn't support it. Also, if page-progression-direction is not specified in metadata, don't include the attribute even in EPUB3; not including it is the same as including it with the value "default", as we did before. (#1550)
  • Org writer: Accept example lines with indentation at the beginning (Calvin Beck).
  • DokuWiki writer:
  • Refactor to use Reader monad (Matthew Pickering).
  • Avoid using raw HTML in table cells; use \\ instead of newlines (Jesse Rosenthal).
  • Properly handle HTML table cell alignments, and use spacing to make the tables look prettier (#1566).
  • Docx writer:
  • Bibliography entries get Bibliography style (#1559).
  • Implement change tracking (Jesse Rosenthal).
  • LaTeX writer:
  • Fixed a bug that caused a table caption to repeat across all pages (Jose Luis Duran).
  • Improved vertical spacing in tables and made it customizable using standard lengths set by booktabs. See https://groups.google.com/forum/#!msg/pandoc-discuss/qMu6_5lYy0o/ZAU7lzAIKw0J (Jose Luis Duran).
  • Added \strut to fix spacing in multiline tables (Jose Luis Duran).
  • Use \tabularnewline instead of \\ in table cells (Jose Luis Duran).
  • Made horizontal rules more flexible (Jose Luis Duran).
  • Text.Pandoc.MIME:
  • Added MimeType (type synonym for String) and getMimeTypeDef. Code cleanups (Artyom Kazak).
  • Templates:
  • LaTeX template: disable microtype protrusion for typewriter font (#1549, thanks lemzwerg).
  • Improved OSX build procedure.
  • Added network-uri flag, to deal with split of network-uri from network.
  • Fix build dependencies for the trypandoc flag, so that they are ignored if trypandoc flag is set to False (Gabor Pali).
  • Updated README to remove outdated claim that --self-contained looks in the user data directory for missing files.
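
The --self-contained fix at the top of this release's notes comes from an ambiguity: a generic URI parser reads the drive letter of a Windows path as a scheme. The Python sketch below shows the ambiguity and one possible heuristic for avoiding it. The one-letter rule and the name `looks_like_uri` are assumptions made for illustration, not a description of pandoc's actual fix.

```python
from urllib.parse import urlsplit

def looks_like_uri(s):
    """Heuristic: treat s as a URI only if its scheme is longer than one letter."""
    scheme = urlsplit(s).scheme
    # A one-letter "scheme" such as "c" is almost certainly a drive letter.
    return len(scheme) > 1

print(urlsplit("C:\\foo.js").scheme)   # the drive letter is parsed as a scheme
print(looks_like_uri("C:\\foo.js"))
print(looks_like_uri("http://example.com/foo.js"))
```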

pandoc 1.13.0.1

18 Aug 01:11
@jgm jgm

This release fixes a couple of serious regressions in 1.13.

  • Docx writer:
    • Fixed regression which bungled list numbering (#1544), causing all lists to appear as basic ordered lists.
    • Include row width in table rows (Christoffer Ackelman, Viktor Kronvall). Added a property to all table rows where the sum of column widths is specified in pct (fraction of 5000). This helps persuade Word to lay out the table with the widths we specify.
  • Fixed a bug in Windows 8 which caused pandoc not to find the pandoc-citeproc filter (#1542).
  • Docx reader: miscellaneous under-the-hood improvements (Jesse Rosenthal). Most significantly, the reader now uses Builder, leading to some performance improvements.
  • HTML reader: Parse appropriately styled span as SmallCaps.
  • Markdown writer: don't escape $, ^, ~ when tex_math_dollars, superscript, and subscript extensions, respectively, are deactivated (#1127).
  • Added trypandoc flag to build CGI executable used in the online demo.
  • Makefile: Added 'quick', 'osxpkg' targets.
  • Updated README in templates to indicate templates license. The templates are dual-licensed, BSD3 and GPL2+.
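
The row-width entry above expresses widths in pct units, where 5000 stands for the full width. The arithmetic can be sketched in Python as follows, assuming column widths are given as fractions of the page width; `row_width_pct` is a hypothetical name.

```python
def row_width_pct(column_fractions):
    """Sum of column widths in pct units, where 5000 means 100%."""
    return round(sum(column_fractions) * 5000)

# Two columns taking 25% and 50% of the width.
print(row_width_pct([0.25, 0.5]))
```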

pandoc 1.13

16 Aug 04:06
@jgm jgm

New features

  • Added docx as an input format (Jesse Rosenthal). The docx reader includes conversion of native Word equations to pandoc LaTeX Math elements. Metadata is taken from paragraphs at the beginning of the document with styles Author, Title, Subtitle, Date, and Abstract.
  • Added epub as an input format (Matthew Pickering). The epub reader includes conversion of MathML to pandoc LaTeX Math elements.
  • Added t2t (Txt2Tags) as an input format (Matthew Pickering). Txt2tags is a lightweight markup format described at http://txt2tags.org/.
  • Added dokuwiki as an output format (Clare Macrae).
  • Added haddock as an output format.
  • Added --extract-media option to extract media contained in a zip container (docx or epub) while adjusting image paths to point to the extracted images.
  • Added a new markdown extension, compact_definition_lists, that restores the syntax for definition lists of pandoc 1.12.x, allowing tight definition lists with no blank space between items, and disallowing lazy wrapping. (See below under behavior changes.)
  • Added an extension epub_html_exts for parsing HTML in EPUBs.
  • Added extensions native_spans and native_divs to activate parsing of material in HTML span or div tags as Pandoc Span inlines or Div blocks.
  • --trace now works with the Markdown, HTML, Haddock, EPUB, Textile, and MediaWiki readers. This is an option intended for debugging parsing problems; ordinary users should not need to use it.

Behavior changes

  • Changed behavior of the markdown_attribute extension, to bring it in line with PHP markdown extra and multimarkdown. Setting markdown="1" on an outer tag affects all contained tags, recursively, until it is reversed with markdown="0" (#1378).
  • Revised markdown definition list syntax (#1429). Both the reader and writer are affected. This change brings pandoc's definition list syntax into alignment with that used in PHP markdown extra and multimarkdown (with the exception that pandoc is more flexible about the definition markers, allowing tildes as well as colons). Lazily wrapped definitions are now allowed. Blank space is required between list items. The space before a definition is used to determine whether it is a paragraph or a "plain" element. WARNING: This change may break existing documents! Either check your documents for definition lists without blank space between items, or use markdown+compact_definition_lists for the old behavior.
  • .numberLines now works in fenced code blocks even if no language is given (#1287, jgm/highlighting-kate#40).
  • Improvements to --filter:
  • Don't search PATH for a filter with an explicit path. This fixed a bug wherein --filter ./caps.py would run caps.py from the system path, even if there was a caps.py in the working directory.
  • Respect shebang if filter is executable (#1389).
  • Don't print misleading error message. Previously pandoc would say that a filter was not found, even in a case where the filter had a syntax error.
  • HTML reader:
  • Parse div and span elements even without --parse-raw, provided native_divs and native_spans extensions are set. Motivation: these now generate native pandoc Div and Span elements, not raw HTML.
  • Parse EPUB-specific elements if the epub_html_exts extension is enabled. These include switch, footnote, rearnote, noteref.
  • Org reader:
  • Support for inline LaTeX. Inline LaTeX is now accepted and parsed by the org-mode reader. Both math symbols (like \tau) and LaTeX commands (like \cite{Coffee}), can be used without any further escaping (Albert Krewinkel).
  • Textile reader and writer:
  • The raw_tex extension is no longer set by default. You can enable it with textile+raw_tex.
  • DocBook reader:
  • Support equation, informalequation, inlineequation elements with mml:math content. This is converted into LaTeX and put into a Pandoc Math inline.
  • Revised plain output, largely following the style of Project Gutenberg:
  • Emphasis is rendered with _underscores_, strong emphasis with ALL CAPS.
  • Headings are rendered differently, with space to set them off, not with setext style underlines. Level 1 headers are ALL CAPS.
  • Math is rendered using unicode when possible, but without the distracting emphasis markers around variables.
  • Footnotes use a regular [n] style.
  • Markdown writer:
  • Horizontal rules are now a line across the whole page.
  • Prettier pipe tables. Columns are now aligned (#1323).
  • Respect the raw_html extension. pandoc -t markdown-raw_html no longer emits any raw HTML, including span and div tags generated by Span and Div elements.
  • Use span with style for SmallCaps (#1360).
  • HTML writer:
  • Autolinks now have class uri, and email autolinks have class email, so they can be styled.
  • Docx writer:
  • Document formatting is carried over from reference.docx. This includes margins, page size, page orientation, header, and footer, including images in headers and footers.
  • Include abstract (if present) with Abstract style (#1451).
  • Include subtitle (if present) with Subtitle style, rather than tacking it on to the title (#1451).
  • Org writer:
  • Write empty span elements with an id attribute as org anchors. For example Span ("uid",[],[]) [] becomes <<uid>>.
  • LaTeX writer:
  • Put table captions above tables, to match the conventional standard. (Previously they appeared below tables.)
  • Use \(..\) instead of $..$ for inline math (#1464).
  • Use \nolinkurl in email autolinks. This allows them to be styled using \urlstyle{tt}. Thanks to Ulrike Fischer for the solution.
  • Use \textquotesingle for ' in inline code. Otherwise we get curly quotes in the PDF output (#1364).
  • Use \footnote<.>{..} for notes in beamer, so that footnotes do not appear before the overlays in which their markers appear (#1525).
  • Don't produce a \label{..} for a Div or Span element. Do produce a \hyperdef{..} (#1519).
  • EPUB writer:
  • If the metadata includes page-progression-direction (which can be ltr or rtl), the page-progression-direction attribute will be set in the EPUB spine (#1455).
  • Custom lua writers:
  • Custom writers now work with --template.
  • Removed HTML header scaffolding from sample.lua.
  • Made citation information available in lua writers.
  • --normalize and Text.Pandoc.Shared.normalize now consolidate adjacent RawBlocks when possible.
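
The --filter entries above mention a filter named caps.py. A rough sketch of such a filter's core in Python is shown here. It assumes the JSON AST represents text as objects of the form {"t": "Str", "c": "..."}, and the names `upcase` and `run_filter` are hypothetical. Because it walks the whole structure, it would also upper-case text inside metadata.

```python
import json

def upcase(node):
    """Recursively upper-case the text of every Str element in a JSON AST."""
    if isinstance(node, list):
        return [upcase(child) for child in node]
    if isinstance(node, dict):
        if node.get("t") == "Str":          # assumed shape of a text element
            return {"t": "Str", "c": node["c"].upper()}
        return {key: upcase(value) for key, value in node.items()}
    return node                              # strings, numbers, null: unchanged

def run_filter(json_text):
    """Map the JSON a filter receives on stdin to the JSON it should print."""
    return json.dumps(upcase(json.loads(json_text)))

sample = '[{"unMeta": {}}, [{"t": "Para", "c": [{"t": "Str", "c": "hello"}]}]]'
print(run_filter(sample))
```

To use this as a real filter, one would feed `run_filter` the contents of standard input and print the result. Per the notes above, an executable script with a shebang line can then be invoked as --filter ./caps.py.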

API changes

  • Added Text.Pandoc.Readers.Docx, exporting readDocx (Jesse Rosenthal).
  • Added Text.Pandoc.Readers.EPUB, exporting readEPUB (Matthew Pickering).
  • Added Text.Pandoc.Readers.Txt2Tags, exporting readTxt2Tags (Matthew Pickering).
  • Added Text.Pandoc.Writers.DokuWiki, exporting writeDokuWiki (Clare Macrae).
  • Added Text.Pandoc.Writers.Haddock, exporting writeHaddock.
  • Added Text.Pandoc.MediaBag, exporting MediaBag, lookupMedia, insertMedia, mediaDirectory, extractMediaBag. The docx and epub readers return a pair of a Pandoc document and a MediaBag with the media resources they contain. This can be extracted using --extract-media. Writers that incorporate media (PDF, Docx, ODT, EPUB, RTF, or HTML formats with --self-contained) will look for resources in the MediaBag generated by the reader, in addition to the file system or web.
  • Text.Pandoc.Readers.TexMath: Removed deprecated readTeXMath. Renamed readTeXMath' to texMathToInlines.
  • Text.Pandoc: Added Reader data type (Matthew Pickering). readers now associates names of readers with Reader structures. This allows inclusion of readers, like the docx reader, that take binary rather than textual input.
  • Text.Pandoc.Shared:
  • Added capitalize (Artyom Kazak), and replaced uses of map toUpper (which give bad results for many languages).
  • Added collapseFilePath, which removes intermediate . and .. from a path (Matthew Pickering).
  • Added fetchItem', which works like fetchItem but searches a MediaBag before looking on the net or file system.
  • Added withTempDir.
  • Added removeFormatting.
  • Added extractSpaces (from HTML reader) and generalized its type so that it can be used by the docx reader (Matthew Pickering).
  • Added ordNub.
  • Added normalizeInlines, normalizeBlocks.
  • normalize is now Pandoc -> Pandoc instead of Data a => a -> a. Some users may need to change their uses of normalize to the newly exported normalizeInlines or normalizeBlocks.
  • Text.Pandoc.Options:
  • Added writerMediaBag to WriterOptions.
  • Removed deprecated and no longer used readerStrict in ReaderOptions. This is handled by readerExtensions now.
  • Added Ext_compact_definition_lists.
  • Added Ext_epub_html_exts.
  • Added Ext_native_divs and Ext_native_spans. This allows users to turn off the default pandoc behavior of parsing contents of div and span tags in markdown and HTML as native pandoc Div blocks and Span inlines.
  • Text.Pandoc.Parsing:
  • Generalized readWith to readWithM (Matthew Pickering).
  • Export runParserT and Stream (Matthew Pickering).
  • Added HasQuoteContext type class (Matthew Pickering).
  • Generalized types of mathInline, smartPunctuation, quoted, singleQuoted, doubleQuoted, failIfInQuoteContext, applyMacros (Matthew Pickering).
  • Added custom token (Matthew Pickering).
  • Added stateInHtmlBlock to ParserState. This is used to keep track of the ending tag we're waiting for when we're parsing inside HTML block tags.
  • Added stateMarkdownAttribute to ParserState. This is used to keep track of whether the markdown attribute has been set in an enclosing tag.
  • Generalized type of registerHeader, using new type classes HasReaderOptions, `...

pandoc 1.12.4.2

14 May 22:06
@jgm jgm
  • Require highlighting-kate >= 0.5.8. Fixes a performance regression.
  • Shared: addMetaValue now behaves slightly differently: if both the new and old values are lists, it concatenates their contents to form a new list.
  • LaTeX reader:
  • Set bibliography in metadata from \bibliography or \addbibresource command.
  • Don't error on %foo with no trailing newline.
  • Org reader:
  • Support code block headers (#+BEGIN_SRC ...) (Albert Krewinkel).
  • Fix parsing of blank lines within blocks (Albert Krewinkel).
  • Support pandoc citation extension (Albert Krewinkel). This can be turned off by specifying org-citations as the input format.
  • Markdown reader:
  • citeKey moved to Text.Pandoc.Parsing so it can be used by other readers (Albert Krewinkel).
  • Text.Pandoc.Parsing:
  • Added citeKey (see above).
  • Added HasLastStrPosition type class and updateLastStrPos and notAfterString functions.
  • Updated copyright notices (Albert Krewinkel).
  • Added default.icml to data files so it installs with the package.
  • OSX package:
  • The binary is now built with options to ensure that it can be used with OSX 10.6+.
  • Moved OSX package materials to osx directory.
  • Added OSX package uninstall script, included in the zip container (thanks to Daniel T. Staal).
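
The addMetaValue change noted at the top of this release can be pictured with a small Python analogue that uses a plain dict for metadata. Only the list-plus-list case comes from the note. The other branches are assumptions added so the sketch is complete, and `add_meta_value` is a hypothetical name.

```python
def add_meta_value(meta, key, new):
    """Return a copy of meta with `new` merged into meta[key]."""
    result = dict(meta)
    if key not in result:
        result[key] = new
        return result
    old = result[key]
    if isinstance(old, list) and isinstance(new, list):
        result[key] = old + new      # documented case: concatenate the contents
    elif isinstance(old, list):
        result[key] = old + [new]    # assumption: append a single value
    else:
        result[key] = [old, new]     # assumption: promote to a two-item list
    return result

print(add_meta_value({"author": ["A"]}, "author", ["B", "C"]))
```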

pandoc 1.12.4

08 May 07:26
@jgm jgm
  • Made it possible to run filters that aren't executable (#1096).
    Pandoc first tries to find the executable (searching the path
    if path isn't given). If it fails, but the file exists and has
    a .py, .pl, .rb, .hs, or .php extension, pandoc runs the filter
    using the appropriate interpreter. This should make it easier to
    use filters on Windows, and make it more convenient for everyone.
  • Added Emacs org-mode reader (Albert Krewinkel).
  • Added InDesign ICML Writer (mb21).
  • MediaWiki reader:
    • Accept image links in more languages (Jaime Marquínez Ferrándiz).
    • Fixed bug in certain nested lists (#1213). If a level 2 list was
      followed by a level 1 list, the first item of the level 1 list
      would be lost.
    • Handle table rows containing just an HTML comment (#1230).
  • LaTeX reader:
    • Give better location information on errors, pointing to line
      numbers within included files (#1274).
    • Better handling of table environment (#1204).
      Positioning options no longer rendered verbatim.
    • Better handling of figure and table with caption (#1204).
    • Handle @{} and p{length} in tabular. The length is not actually
      recorded, but at least we get a table (#1180).
    • Properly handle \nocite. It now adds a nocite metadata
      field. Citations there will appear in the bibliography but not
      in the text (unless you explicitly put a $nocite$ variable
      in your template).
  • Markdown reader:
    • Ensure that whole numbers in YAML metadata are rendered without
      decimal points. (This became necessary with changes to aeson
      and yaml libraries. aeson >= 0.7 and yaml >= 0.8.8.2 are now required.)
    • Fixed regression on line breaks in strict mode (#1203).
    • Small efficiency improvements.
    • Improved parsing of nested divs. Formerly a closing div tag
      would be missed if it came right after other block-level tags.
    • Avoid backtracking when closing </div> not found.
    • Fixed bug in reference link parsing in markdown_mmd.
    • Fixed a bug in list parsing (#1154). When reading a raw list
      item, we now strip off up to 4 spaces.
    • Fixed parsing of empty reference link definitions (#1186).
    • Made one-column pipe tables work (#1218).
  • Textile reader:
    • Better support for attributes. Instead of being ignored, attributes
      are now parsed and included in Span inlines. The output will be a bit
      different from stock textile: e.g. for *(foo)hi*, we'll get
      <em><span class="foo">hi</span></em> instead of
      <em class="foo">hi</em>. But at least the data is not lost.
    • Improved treatment of HTML spans (%) (#1115).
    • Improved link parsing. In particular we now pick up on attributes.
      Since pandoc links can't have attributes, we enclose the whole link in
      a span if there are attributes (#1008).
    • Implemented correct parsing rules for inline markup (#1175, Matthew
      Pickering).
    • Use Builder (Matthew Pickering).
  • DocBook reader:
    • Better treatment of formalpara. We now emit the title (if present)
      as a separate paragraph with boldface text (#1215).
    • Set metadata author not authors.
    • Added recognition of authorgroup and releaseinfo elements (#1214,
      Matthew Pickering).
    • Converted current meta information parsing in DocBook to a more
      extensible version which is aware of the more recent meta
      representation (Matthew Pickering).
  • HTML reader:
    • Require tagsoup 0.13.1, to fix a bug with parsing of script tags
      (#1248).
    • Treat processing instructions & declarations as block. Previously
      these were treated as inline, and included in paragraph tags in HTML
      or DocBook output, which is generally not what is wanted (#1233).
    • Updated closes with rules from HTML5 spec.
    • Use Builder (Matthew Pickering, #1162).
  • RST reader:
    • Remove duplicate http in PEP links (Albert Krewinkel).
    • Make rst figures true figures (#1168, CasperVector)
    • Enhanced Pandoc's support for rST roles (Merijn Verstaaten).
      rST parser now supports: all built-in rST roles, new role definition,
      role inheritance, though with some limitations.
    • Use author rather than authors in metadata.
    • Better handling of directives. We now correctly handle field
      lists that are indented more than three spaces. We treat an
      aafig directive as a code block with attributes, so it can be
      processed in a filter (#1212).
  • LaTeX writer:
    • Mark span contents with label if span has an ID (Albert Krewinkel).
    • Made --toc-depth work well with books in latex/pdf output (#1210).
    • Handle line breaks in simple table cells (#1217).
    • Workaround for level 4-5 headers in quotes. These previously produced
      invalid LaTeX: \paragraph or \subparagraph in a quote environment.
      This adds an \mbox{} in these contexts to work around the problem.
      See http://tex.stackexchange.com/a/169833/22451 (#1221).
    • Use \/ to avoid en-dash ligature instead of -{}- (Vaclav Zeman).
      This is to fix LuaLaTeX output. The -{}- sequence does not avoid the
      ligature with LuaLaTeX but \/ does.
    • Fixed string escaping in hyperref and hyperdef (#1130).
  • ConTeXt writer: Improved autolinks (#1270).
  • DocBook writer:
    • Improve handling of hard line breaks in Docbook writer
      (Neil Mayhew). Use a <literallayout> for the entire paragraph, not
      just for the newline character.
    • Don't let line breaks inside footnotes influence the enclosing
      paragraph (Neil Mayhew).
    • Distinguish tight and loose lists in DocBook output, using
      spacing="compact" (Neil Mayhew, #1250).
  • Docx writer: When needed files are not present in the user's
    reference.docx, fall back on the versions in the reference.docx
    in pandoc's data files. This fixes a bug that occurs when a
    reference.docx saved by LibreOffice is used. (#1185)
  • EPUB writer:
    • Include extension in epub ids. This fixes a problem with duplicate
      ids for fonts and images with the same base name but different
      extensions (#1254).
    • Handle files linked in raw img tags (#1170).
    • Handle media in audio source tags (#1170).
      Note that we now use a media directory rather than images.
    • Incorporate files linked in video tags (#1170). src and poster
      will both be incorporated into content.opf and the epub container.
  • HTML writer:
    • Add colgroup around col tags (#877). Also affects EPUB writer.
    • Fixed bug with unnumbered section headings. Unnumbered section
      headings (with class unnumbered) were getting numbers.
    • Improved detection of image links. Previously image links with
      queries were not recognized, causing <embed> to be used instead
      of <img>.
  • Man writer: Ensure that terms in definition lists aren't line wrapped
    (#1195).
  • Markdown writer:
    • Use proper escapes to avoid unwanted lists (#980). Previously we used
      0-width spaces, an ugly hack.
    • Use longer backtick fences if needed (#1206). If the content contains a
      backtick fence and there are attributes, make sure longer fences are
      used to delimit the code. Note: This works well in pandoc, but github
      markdown is more limited, and will interpret the first string of three
      or more backticks as ending the code block.
  • RST writer: Avoid stack overflow with certain tables (#1197).
  • RTF writer: Fixed table cells containing paragraphs.
  • Custom writer:
    • Correctly handle UTF-8 in custom lua scripts (#1189).
    • Fix bugs with lua scripts with mixed-case filenames and
      paths containing + or - (#1267). Note that getWriter
      in Text.Pandoc no longer returns a custom writer on input
      foo.lua.
  • AsciiDoc writer: Handle multiblock and empty table cells
    (#1245, #1246). Added tests.
  • Text.Pandoc.Options: Added readerTrace to ReaderOptions
  • Text.Pandoc.Shared:
    • Added compactify'DL (formerly in markdown reader) (Albert Krewinkel).
    • Fixed bug in toRomanNumeral: numbers ending with '9' would
      be rendered as Roman numerals ending with 'IXIV' (#1249). Thanks to
      Jesse Rosenthal.
    • openURL: set proxy with value of http_proxy env variable (#1211).
      Note: proxies with non-root paths are not supported, due to
      limitations in http-conduit.
  • Text.Pandoc.PDF:
    • Ensure that temp directories are deleted on Windows (#1192). The PDF is
      now read as a strict bytestring, ensuring that process ownership will
      be terminated, so the temp directory can be deleted.
    • Use / as path separators in a few places, even on Windows.
      This seems to be necessary for texlive (#1151, thanks to Tim Lin).
    • Use ; for TEXINPUTS separator on Windows (#1151).
    • Changes to error reporting, to handle non-UTF8 error output.
  • Text.Pandoc.Templates:
    • Removed unneeded datatype context (Merijn Verstraaten).
    • YAML objects resolve to "true" in conditionals (#1133).
      Note: If address is a YAML object and you just have $address$
      in your template, the word true will appear, which may be
      unexpected. (Previously nothing would appear.)
  • Text.Pandoc.SelfContained: Handle poster attribute in video
    tags (#1188).
  • Text.Pandoc.Parsing:
    • Made F an instance of Applicative (#1138).
    • Added stateCaption.
    • Added HasMacros, simplified other typeclasses.
      Removed updateHeaderMap, setHeaderMap, getHeaderMap,
      updateIdentifierList, setIdentifierList, getIdentifierList.
    • Changed the smart punctuation parser to return Inlines
      rather than Inline (Matthew Pickering).
    • Changed HasReaderOptions, HasHeaderMap, HasIdentifierList
      from typeclasses of monads to typeclasses of states. This simplifies
      the instance definitions and provides more flexibility. Generalized
      type of getOption and added a default definition. Removed
      askReaderOption. Added extractReaderOption. Added
      extractHeaderMap and updateHeaderMap in HasHeaderMap.
      ...
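
The toRomanNumeral fix listed above under Text.Pandoc.Shared (#1249) concerns numbers ending in 9. As an illustration only, here is a minimal Python sketch of a greedy conversion that handles those cases. It is not pandoc's Haskell code; the function name and the range check are assumptions made for the example.

```python
# Illustrative sketch only; pandoc's toRomanNumeral is written in Haskell.
# Value/symbol pairs in descending order, including the subtractive forms.
ROMAN_PAIRS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman_numeral(n: int) -> str:
    """Convert 1..3999 to a Roman numeral (hypothetical helper)."""
    if not 0 < n < 4000:
        raise ValueError("expected an integer between 1 and 3999")
    out = []
    for value, symbol in ROMAN_PAIRS:
        # Subtract exactly the value just emitted, so that 9 becomes "IX"
        # with nothing left over for a later pair to pick up.
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


print(to_roman_numeral(9))     # IX
print(to_roman_numeral(19))    # XIX
print(to_roman_numeral(1999))  # MCMXCIX
```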

pandoc 1.12.3

10 Jan 19:35
@jgm jgm
  • The --bibliography option now sets the biblio-files variable. So, if you're using --natbib or --biblatex, you can just use --bibliography=foo.bib instead of -V biblio-files=foo.
  • Don't run pandoc-citeproc filter if --bibliography is used together with --natbib or --biblatex (Florian Eitel).
  • Template changes:
    • Updated beamer template to include booktabs.
    • Added abstract variable to LaTeX template.
    • Put header-includes after title in LaTeX template (#908).
    • Allow use of \includegraphics[size] in beamer. This just required porting a macro definition from the default LaTeX template to the default beamer template.
  • reference.docx: Include FootnoteText style. Otherwise Word ignores the style, even when specified in the pPr. (#901)
  • reference.odt: Tidied styles.xml.
  • Relaxed version bounds for dependencies.
  • Added withSocketsDo around http conduit code in openURL, so it works on Windows (#1080).
  • Added Cite function to sample.lua.
  • Markdown reader:
    • Fixed regression in title blocks (#1089). If author field was empty, date was being ignored.
    • Allow backslash-newline hard line breaks in grid and multiline table cells.
    • Citation keys may now start with underscores, and may contain underscores adjacent to internal punctuation.
  • LaTeX reader:
    • Add support for Verb macro (jrnold) (#1090).
    • Support babel-style quoting: "`..."'.
  • Properly handle script blocks in strict mode. (That is, markdown-markdown_in_html_blocks.) Previously a spurious <p> tag was being added (#1093).
  • Docbook reader: Avoid failure if tbody contains no tr or row elements.
  • LaTeX writer:
    • Factored out function for table cell creation.
    • Better treatment of footnotes in tables. Notes now appear in the regular sequence, rather than in the table cell. (This was a regression in 1.10.)
  • HTML reader: Parse name/content pairs from meta tags as metadata. Closes #1106.
  • Moved fixDisplayMath from Docx writer to Writer.Shared.
  • OpenDocument writer: Fixed RawInline, RawBlock so they don't escape.
  • ODT writer: Use mathml for proper rendering of formulas. Note: LibreOffice's support for this seems a bit buggy. But it should be better than what we had before.
  • RST writer: Ensure no blank line after def in definition list (#992).
  • Markdown writer: Don't use tilde code blocks with braced attributes in markdown_github output. A consequence of this change is that the backtick form will be preferred in general if both are enabled. That is good, as it is much more widespread than the tilde form. (#1084)
  • Docx writer: Fixed problem with some modified reference docx files. Include word/_rels/settings.xml.rels if it exists, as well as other rels files besides the ones pandoc generates explicitly.
  • HTML writer:
    • With --toc, headers no longer link to themselves (#1081).
    • Omit footnotes from TOC entries. Otherwise we get doubled footnotes when headers have notes!
  • EPUB writer:
    • Avoid duplicate notes when headings contain notes. This arose because the headings are copied into the metadata "title" field, and the note gets rendered twice. We strip the note now before putting the heading in "title".
    • Strip out footnotes from toc entries.
    • Fixed bug with --epub-stylesheet. Now the contents of writerEpubStylesheet (set by --epub-stylesheet) should again work, and take precedence over a stylesheet specified in the metadata.
  • Text.Pandoc.Pretty: Added nestle. API change.
  • Text.Pandoc.MIME: Added wmf, emf.
  • Text.Pandoc.Shared: fetchItem now handles image URLs beginning with //.
  • Text.Pandoc.ImageSize: Parse EXIF format JPEGs. Previously we could only get size information for JFIF format, which led to squished images in Word documents. Closes #976.
  • Removed old MarkdownTest_1.0.3 directory (#1104).
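
The fetchItem change above concerns image URLs that begin with // (scheme-relative URLs). Below is a rough Python sketch of one way to give such a URL an explicit scheme before fetching it. The helper name and the choice of http as the default scheme are assumptions for illustration, not a description of pandoc's implementation.

```python
from urllib.parse import urlsplit


def normalize_image_url(url: str, default_scheme: str = "http") -> str:
    """Give a scheme-relative URL ("//host/path") an explicit scheme.

    Hypothetical helper: default_scheme is an assumption, not pandoc's rule.
    """
    if url.startswith("//"):
        return f"{default_scheme}:{url}"
    # Absolute URLs and plain relative paths are returned unchanged.
    return url


fixed = normalize_image_url("//example.com/img/logo.png")
print(fixed)                   # http://example.com/img/logo.png
print(urlsplit(fixed).netloc)  # example.com
```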

pandoc 1.12.2.1

09 Dec 03:32
@jgm jgm
  • Markdown reader: Fixed regression in list parser, involving continuation lines containing raw HTML (or even verbatim raw HTML).

pandoc 1.12.2

07 Dec 20:55
@jgm jgm
  • Much improved citation support.
  • Metadata may now be included in YAML blocks in a markdown document. For example,
---
title:
- type: main
  text: My Book
- type: subtitle
  text: An investigation of metadata
creator:
- role: author
  text: John Smith
- role: editor
  text: Sarah Jones
identifier:
- scheme: DOI
  text: doi:10.234234.234/33
publisher:  My Press
rights:  (c) 2007 John Smith, CC BY-NC
cover-image: img/mypic.jpg
...

Metadata may still be provided using --epub-metadata; it will be merged with the metadata in YAML blocks.

  • EPUB writer:
    • meta tags are now used instead of opf attributes for EPUB3.
    • Insert "svg" property as needed in opf (EPUB 3).
    • Simplify imageTypeOf using getMimeType.
    • Add properties attribute to cover-image item for EPUB 3.
    • Don't include node for cover.xhtml if no cover!
    • Ensure that same identifier is used throughout (#1044). If an identifier is given in metadata, we use that; otherwise we generate a random uuid.
    • Add cover reference to guide element (EPUB 2) (Shaun Attfield). Fixes an issue with Calibre putting the cover at the end of the book if the spine has linear="no". Apparently this is best practice for other converters as well: http://www.idpf.org/epub/20/spec/OPF_2.0.1_draft.htm#Section2.6.
    • Allow stylesheet in metadata. The value is a path to the stylesheet.
    • Allow partial dates: YYYY, YYYY-MM.
  • Markdown writer: Fix rendering of tight sublists (#1050). Previously a spurious blank line was included after a tight sublist.
  • ODT writer: Add draw:name attribute to draw:frame elements (#1069). This is reported to be necessary to avoid an error from recent versions of LibreOffice when files contain more than one image. Thanks to wmanley for reporting and diagnosing the problem.
  • ConTeXt writer: Don't hardcode figure/table placement and numbering. Instead, let this be set in the template, using \setupfloat. Thanks to on4aa and Aditya Mahajan for the suggestion (#1067).
  • Implemented CSL flipflopping spans in DOCX, LaTeX, and HTML writers.
  • Fixed bug with markdown intraword emphasis. Closes #1066.
  • Docbook writer: Hierarchicalize block content in metadata. Previously headers just disappeared from block-level metadata when it was used in templates. Now we apply the 'hierarchicalize' transformation. Note that a block headed by a level-2 header will turn into a <sect1> element.
  • OpenDocument writer: Skip raw HTML (#1035). Previously it was erroneously included as verbatim text.
  • HTML/EPUB writer, footnotes: Put <sup> tag inside <a> tags. This allows better control of formatting, since the <a> tags have a distinguishing class (#1049).
  • Docx writer:
    • Use mime type info returned by fetchItem.
    • Fixed core metadata (#1046). Don't create empty date nodes if no date given. Don't create multiple dc:creator nodes; instead separate by semicolons.
    • Fix URL for core-properties in _rels/.rels (#1046).
  • Plain writer: don't print <span> tags.
  • LaTeX writer:
    • Fix definition lists with internal links in terms (#1032). This fix puts braces around a term that contains an internal link, to avoid problems with square brackets.
    • Properly escape pdftitle, pdfauthor (#1059).
    • Use booktabs package for tables (thanks to Jose Luis Duran).
  • Updated beamer template. Now references should work properly (in a slide) when --biblatex or --natbib is used.
  • LaTeX reader:
    • Parse contents of curly quotes or matched " as quotes.
    • Support \textnormal as span with class nodecor. This is needed for pandoc-citeproc.
    • Improved citation parsing. This fixes a run-time error that occurred with \citet{} (empty list of keys). It also ensures that empty keys don't get produced.
  • MediaWiki reader: Add automatic header identifiers.
  • HTML reader:
    • Use pandoc Div and Span for raw <div>, <span> when --parse-raw.
    • Recognize svg tags as block level content (thanks to MinRK).
    • Parse LaTeX math if appropriate options are set.
  • Markdown reader:
    • YAML block must start immediately after ---. If there is a blank line after ---, the --- is interpreted as a horizontal rule.
    • Correctly handle empty bullet list items.
    • Stop parsing "list lines" when we hit a block tag. This fixes exponential slowdown in certain input, e.g. a series of lists followed by </div>.
  • Slides: Preserve <div class="references"> in references slide.
  • Text.Pandoc.Writer.Shared:
  • Fixed bug in tagWithAttrs. A space was omitted before key-value attributes, leading to invalid HTML.
  • normalizeDate: Allow dates with year only (thanks to Shaun Attfield).
  • Fixed bug in openURL with data: URIs. Previously the base-64 encoded bytestring was returned. We now decode it so it's a proper image!
  • DocBook reader: Handle numerical attributes starting with decimal. Also use safeRead instead of read.
  • Text.Pandoc.Parsing:
    • Generalized type of registerHeader, using new type classes HasReaderOptions, HasIdentifierList, HasHeaderMap. These allow certain common functions to be reused even in parsers that use custom state (instead of ParserState), such as the MediaWiki reader.
    • Moved inlineMath, displayMath from Markdown reader to Parsing. Generalize their types and export them from Parsing. (API change.)
  • Text.Pandoc.Readers.TexMath: Export readTeXMath', which attends to display/inline. Deprecate readTeXMath, and use readTeXMath' in all the writers. Require texmath >= 0.6.5.2.
  • Text.Pandoc.MIME:
    • Add entry for jfif.
    • In looking up extensions, drop the encoding info. E.g. for 'image/jpg;base64' we should lookup 'image/jpg'.
  • Templates: Changed how array variables are resolved. Previously if foo is an array (which might be because multiple values were set on the command line), $foo$ would resolve to the concatenation of the elements of foo. This is rarely useful behavior. It has been changed so that the first value is rendered. Of course, you can still iterate over the values using $for(foo)$. This has the result that you can override earlier settings using -V by putting new values later on the command line, which is useful for many purposes.
  • Text.Pandoc: Don't default to pandocExtensions for all writers.
  • Allow "epub2" as synonym for "epub", "html4" for "html".
  • Don't look for slidy files in data files with --self-contained.
  • Allow https: command line arguments to be downloaded.
  • Fixed make_osx_package.sh so that data files are embedded in pandoc-citeproc.
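
Two items above touch data: URIs: openURL now decodes the base-64 payload, and the MIME lookup drops encoding info such as ;base64. The following Python sketch shows both steps on a hand-made URI. It is an illustration under assumptions (the function name is hypothetical and only the base64 form is handled), not pandoc's code.

```python
import base64


def decode_data_uri(uri: str):
    """Split a base64 data: URI into (mime_type, raw_bytes)."""
    if not uri.startswith("data:"):
        raise ValueError("not a data: URI")
    header, _, payload = uri[len("data:"):].partition(",")
    if ";base64" not in header:
        raise ValueError("this sketch only handles base64 payloads")
    # Drop parameters such as ";base64" so that only the MIME type remains.
    mime_type = header.split(";")[0]
    # Decode the payload instead of handing back the encoded text.
    return mime_type, base64.b64decode(payload)


mime, data = decode_data_uri("data:image/png;base64,iVBORw0KGgo=")
print(mime)  # image/png
print(data)  # b'\x89PNG\r\n\x1a\n'
```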

Pandoc 1.12.1

20 Oct 23:04
@jgm jgm

This release fixes a few bugs (details in the release notes) and considerably improves citation handling. The new pandoc-citeproc (0.1.2.1) can read bibtex and biblatex files directly (not using bibutils to translate them, as before). Though this is still a work in progress, the results are already significantly better. LaTeX constructs in bibtex fields are translated properly, and simple math gets converted.

Three changes of note:

  1. The default JSON serialization format has changed. Instead of {"Str": "foo"}, for example, we now have {"t": "Str", "c": "foo"} ("t" for tag, "c" for contents). This new format is easier to work with outside of Haskell. This change should only affect people who are interacting with pandoc's JSON using languages other than Haskell, since in Haskell the JSON conversions can be handled automatically by the aeson library. Those who use the python library pandocfilters for filters should upgrade to version 1.2, which has already been updated to use the new format.
  2. Pandoc's data files no longer include the javascript, CSS, and images for S5, slidy, and slideous slide formats. If you wish to produce S5 or slideous slides with the --self-contained option, you'll need to download the appropriate code into the s5 or slideous directories, respectively, as with revealjs. (This is what the User's Guide has said to do for some time.) The default for slidy is to embed a link to the code on the slidy website, so nothing should change for slidy users who are using the default template.
  3. You can now create "speaker notes" in slide formats, by putting them inside <div class="notes"> tags. (Note that you may need to leave a blank line before the closing </div> tag in some contexts.) Currently these are supported in beamer (where the notes go to \note{...}) and revealjs (where they turn into <aside class="notes">); in other formats the speaker notes are just ignored.
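
To make the new JSON shape from item 1 concrete, here is a small Python sketch that walks a fragment in the {"t": ..., "c": ...} form and upper-cases every Str. The fragment is hand-written for illustration and upper_strs is a hypothetical helper; real filters would normally use the pandocfilters library mentioned above rather than this code.

```python
import json


def upper_strs(node):
    """Return a copy of a tagged-JSON fragment with every Str upper-cased."""
    if isinstance(node, list):
        return [upper_strs(child) for child in node]
    if isinstance(node, dict):
        if node.get("t") == "Str":
            return {"t": "Str", "c": node["c"].upper()}
        return {key: upper_strs(value) for key, value in node.items()}
    return node  # strings, numbers, and other leaves are left alone


# Hand-written fragment in the new format ("t" = tag, "c" = contents).
fragment = json.loads(
    '[{"t": "Str", "c": "foo"},'
    ' {"t": "Emph", "c": [{"t": "Str", "c": "bar"}]}]'
)
print(json.dumps(upper_strs(fragment)))
```

Because every element carries its constructor name under the same "t" key, the walker can dispatch on one field instead of inspecting which key a single-entry object happens to use.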

Added 2013-10-21: The OSX package installer that was originally attached to this release was defective. Please download the new installer below, pandoc-1.12.1-1.dmg.