Plain text writer does not preserve line breaks #10007
Comments
It may be worth noting that I'm using the plain text writer in a Lua filter as a whitespace-preserving version of the
-- element contains raw markdown to parse
local s = pandoc.write(pandoc.Pandoc(element), "plain") -- write element AST to plain text
local b = pandoc.read(s, "markdown").blocks -- read text as markdown
If there's a more idiomatic way to do this, please let me know!
This is because blank lines mean "new paragraph." You can always include a RawInline or RawBlock element with whatever you want, and it will pass directly through to the output format as long as the format of the RawInline/RawBlock matches...
White space won't be preserved anyway; e.g. consecutive spaces would be collapsed.
I had a similar request: #9663. It would be nice if Pandoc was less aggressive in reformatting markdown.
I'd fully expect this whenever a writer's format specifies it (e.g. markdown), but is this true for "plain" text? I presumed the plain text writer would output text without applying any format's semantics or production rules. I tend to think this should be the case; and if it intentionally isn't, the semantics attached to "plain" text should be documented.
This depends on the reader and the specification it implements. The JSON snippet above was produced by Pandoc's CSV reader. As of fa01764, the CSV reader preserves multiple consecutive |
Our
Block-level formatting has to happen somehow. How do we represent a paragraph in plain text? A heading? Lists? Think of Multiple consecutive LineBreaks can be rendered in some formats, e.g. HTML, and even in pandoc's markdown, which has the BACKSLASH+NEWLINE way of writing a line break. But not in formats (like |
As I hinted, though, there's an easy workaround with a Lua filter. Just have the filter replace LineBreak elements with |
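A minimal sketch of that kind of workaround, not necessarily the exact filter meant above. It assumes that a RawInline whose format is "plain" passes through the plain writer unchanged, in line with the earlier remark about raw elements; the raw content "\n" and the helper name to_plain are illustrative choices, not taken from this thread.

```lua
-- Filter table: swap every LineBreak for a raw newline aimed at the plain writer.
-- Assumption: the plain writer emits RawInline elements of format "plain" verbatim.
local preserve_breaks = {
  LineBreak = function(_)
    return pandoc.RawInline("plain", "\n")
  end,
}

-- Hypothetical helper: render a list of blocks to plain text while keeping
-- consecutive line breaks. Needs a pandoc version that provides pandoc.write
-- and the walk method on Pandoc documents.
local function to_plain(blocks)
  local doc = pandoc.Pandoc(blocks):walk(preserve_breaks)
  return pandoc.write(doc, "plain")
end
```

Inside a filter, to_plain(blocks) could then stand in for a direct pandoc.write(pandoc.Pandoc(blocks), "plain") call. Other whitespace, such as runs of spaces, is still subject to whatever the reader kept in the AST.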
Thanks, @jgm. I think this would be good to document.
This description of the format in the manual would clarify nicely.
This should certainly work, and I appreciate the workaround. I do think a function for getting the text of the AST while preserving whitespace would be generally useful in Lua filters. I opened a separate issue (#10015) to propose that.
Preserving whitespace is not possible in a filter, because the whitespace information is often lost in the parsers.
It's possible for a reader to encode whitespace in the AST; and depending on the language/format specification the reader implements, this may be required for correctness. #9797 was an instance of this in a built-in reader. And custom readers could certainly introduce others.
The plain text writer coalesces sequential LineBreaks into a single new line. The following example:
produces this output:
where I would expect this output:
This is similar to the CSV reader issue (#9797) fixed in fa01764.