Second pass of RST to RST conversion is not an identity #992

KrzysiekJ · 2013-09-21T16:27:37Z

Given the following input:

1. foo
 a. bar

the command pandoc -f rst -t rst gives us the following result:

1. foo

    a. bar

However, running the same command twice (pandoc -f rst -t rst | pandoc -f rst -t rst) gives us different indentation:

1. foo

       a. bar

Further passes don’t change the output.

Tested on pandoc 1.12.0.2.
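
The repeated-pass check described above can be scripted. The following is a minimal sketch, not part of the original report: the helper names `pandoc_rst_to_rst` and `passes_until_stable` are made up for illustration, and the first one assumes a `pandoc` executable is available on the PATH. The converter is passed in as a function so the loop can be tried with any text-to-text transformation.

```python
import subprocess
from typing import Callable, List


def pandoc_rst_to_rst(text: str) -> str:
    """Run `pandoc -f rst -t rst` on text (assumes pandoc is on PATH)."""
    result = subprocess.run(
        ["pandoc", "-f", "rst", "-t", "rst"],
        input=text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def passes_until_stable(
    convert: Callable[[str], str], text: str, max_passes: int = 5
) -> List[str]:
    """Apply convert repeatedly and collect each pass's output.

    Stops as soon as a pass reproduces the previous pass's output, so the
    length of the returned list is the number of passes that produced
    something new. A length of 1 means the second pass was an identity.
    """
    outputs: List[str] = []
    current = text
    for _ in range(max_passes):
        nxt = convert(current)
        if outputs and nxt == outputs[-1]:
            break  # fixed point reached
        outputs.append(nxt)
        current = nxt
    return outputs
```

Going by the behaviour reported here for pandoc 1.12.0.2, `passes_until_stable(pandoc_rst_to_rst, "1. foo\n a. bar\n")` would be expected to return two outputs rather than one.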


jgm · 2013-09-22T22:44:00Z

Did you try your original input with rst2html.py? It gives a warning, because space is needed between the list and the sublist. So, let's insert the blank line:

1. foo

 a. bar

Now run that through rst2html.py, and you should see this:

<ol class="arabic simple">
<li>foo</li>
</ol>
<blockquote>
<ol class="loweralpha simple">
<li>bar</li>
</ol>
</blockquote>

This is not a valid RST nested list at all, because the sublist isn't indented far enough. It gets parsed as a list followed by a blockquote containing a list. And that's how pandoc interprets it, too:

% pandoc -f rst -t native
1. foo

 a. bar
^D
[OrderedList (1,Decimal,Period)
 [[Plain [Str "foo"]]]
,BlockQuote
 [OrderedList (1,LowerAlpha,Period)
  [[Plain [Str "bar"]]]]]

Does that explain what you're seeing?
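
To make "indented far enough" concrete: as far as I understand the RST rules, a sublist nests directly only when its marker lines up with the text of the parent item, that is, the column just after the enumerator and its following whitespace. The sketch below is a simplified illustration under that assumption. The helpers `item_text_column` and `is_nested_under` are hypothetical, they are not taken from pandoc or docutils, and they only recognise plain enumerators such as `1.`, `a.` or `#.`.

```python
import re
from typing import Optional

# Simplified enumerator pattern: digits, a single letter, or '#',
# followed by '.' or ')' and at least one whitespace character.
_ENUM = re.compile(r"^(\s*)((?:\d+|[A-Za-z]|#)[.)])(\s+)\S")


def item_text_column(line: str) -> Optional[int]:
    """Return the column where the item's text starts, or None if the
    line does not look like an enumerated list item."""
    m = _ENUM.match(line)
    if m is None:
        return None
    return len(m.group(1)) + len(m.group(2)) + len(m.group(3))


def is_nested_under(parent: str, child: str) -> bool:
    """True if child is indented exactly to the parent's text column."""
    column = item_text_column(parent)
    if column is None or item_text_column(child) is None:
        return False
    child_indent = len(child) - len(child.lstrip(" "))
    return child_indent == column
```

For the input in this issue, the text of `1. foo` starts at column 3, while ` a. bar` is indented by only one space, so `is_nested_under("1. foo", " a. bar")` is false; with three spaces of indentation it would be true.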

jgm · 2013-09-23T18:23:50Z

OK, there is a real bug here. Your original input

1. foo
 a. bar

is getting parsed as a definition list. That's not how rst2html.py does it, but that doesn't worry me so much since this is invalid RST anyway. The problem is that pandoc is rendering this definition list as something that does not parse as a definition list (because of the blank line between the term and the definition). The blank line is there because pandoc forces a blank line before lists, but it should not do so in this case.

KrzysiekJ · 2013-09-29T10:39:30Z

Thanks for this detailed explanation. IMHO the fact that the second pass of Pandoc produces different output than the first may be a more general indication of inconsistency between what Pandoc thinks it renders and how its output actually parses. So perhaps this could be used to find bugs automatically by running the round trip on randomly generated input.

jgm · 2013-09-29T14:42:08Z

Yes, good idea. It would be easy to automate this with QuickCheck.
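
As a rough illustration of the idea, here is a sketch in Python using only the standard library's `random` module. It is a stand-in for QuickCheck, not QuickCheck itself, and it does no shrinking of failing cases. The names `random_rst_list` and `find_unstable_input` are hypothetical, and the generator is an assumption about what inputs might be interesting: small enumerated lists whose sub-items are indented by a random amount, often too little, as in this issue. The converter is a parameter, so a function that shells out to `pandoc -f rst -t rst` could be plugged in.

```python
import random
from typing import Callable, Optional


def random_rst_list(rng: random.Random, max_items: int = 3) -> str:
    """Generate a small enumerated list whose sub-items are indented by
    a random number of spaces (so some inputs are not valid nesting)."""
    lines = []
    for i in range(1, rng.randint(1, max_items) + 1):
        lines.append(f"{i}. item{i}")
        if rng.random() < 0.5:
            indent = " " * rng.randint(1, 4)
            lines.append(f"{indent}a. sub{i}")
    return "\n".join(lines) + "\n"


def find_unstable_input(
    convert: Callable[[str], str], trials: int = 100, seed: int = 0
) -> Optional[str]:
    """Return the first generated input for which a second conversion
    pass changes the output of the first, or None if none is found."""
    rng = random.Random(seed)
    for _ in range(trials):
        source = random_rst_list(rng)
        once = convert(source)
        twice = convert(once)
        if once != twice:
            return source  # counterexample: second pass is not an identity
    return None
```

A `None` result only means that no counterexample turned up among the generated inputs; it does not show that the round trip is stable in general.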

jgm closed this as completed in e7e76db Jan 3, 2014
