un-escaped characters for asciidoc output #2337
It would be easy to escape all these special characters, but the output would likely be ugly. |
Escaping with backslashes is not that easy in asciidoc, because it is very picky: it accepts a backslash escape only in exactly those cases where it would otherwise recognize a command (with exceptions); everywhere else it renders a literal backslash. (I'm using asciidoctor as the reference here; I haven't tried the original implementation.) E.g. escaping
will render a backslash for the first line:
Edit: Apparently, there is a much more reliable way to do this with passthroughs: |
I believe I have another two instances of this but with this mediawiki, input:
I have used variations of the phrase "syntax defect" as a way to sanitize and minimize the real-life source, and to illustrate the defect. Converting the file with
There are two escape issues with the output identified below with
Before the characters indicated by the
To summarize: I believe there are two separate defects broadly related to unescaped characters:
Version information:
MacOS 10.13.6, pandoc installed via homebrew. |
Asciidoc is crazy!!
asciidoctor gives you
which isn't even well-formed HTML.
you get
I have to believe this is a bug in asciidoctor and not the intended behavior. I'm not going to try to work around all these quirks. |
Even worse, if you try to escape the
you get
The first backslash acts as an escape and the second one doesn't! |
@jgm Would it help if we opened an issue about this with the upstream project, or supported you (as the owner of this repo) in that endeavour? |
@lisa If you'd like to inquire upstream about whether this is intended behavior, and ask them to clarify the escaping rules, that would be great. |
Passthrough quotes fix this as well:
will produce the intended output. Still also a bug in asciidoctor, as the output isn't proper html. |
Escaping is also missing for these relatively simple cases:
Unfortunately I'm not sure what the "correct" output is in cases like this. According to https://asciidoctor.org/docs/asciidoc-syntax-quick-reference/#escaping-text I guess it would be |
Unfortunately escaping in asciidoc is not well designed. |
+1 for escaping, at least in URLs |
OK...but what if you want to have
which yields
in which the first backslash acts as an escape but the others don't. Argh! Asciidoc needs some clear, consistent escaping rules. |
I think this works
You can use backslashes here to escape the |
I have a local patch similar to what @mako4 suggests, with one modification: it specifies that special-character substitutions still apply; otherwise asciidoctor will pass special HTML characters through into the final document:
asciidoctor allows "c" as an abbreviation for "specialcharacters". I haven't implemented that yet, but it makes things very slightly less ugly. If this is applied in escapeString only to text that contains special characters, the output isn't too ugly for prose. The alternative is to define attributes for each special character:
When there are relatively few special characters the latter looks better, when there are many the former looks better. |
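The two strategies compared above can be sketched roughly as follows. This is an illustration in Python, not pandoc's code: the function names, the set of characters treated as special, and the backslash-escaping of `]` inside the macro are assumptions, and the second function relies on asciidoctor's predefined character attributes rather than newly defined ones.

```python
# Assumption: an illustrative set of characters that may trigger
# AsciiDoc inline markup; the real set would need care.
SPECIAL = set("*_`#+^~[]\\")

# Assumption: asciidoctor's predefined character replacement attributes.
ATTRIBUTES = {
    "*": "{asterisk}",
    "+": "{plus}",
    "^": "{caret}",
    "~": "{tilde}",
    "`": "{backtick}",
    "\\": "{backslash}",
    "[": "{startsb}",
    "]": "{endsb}",
    "|": "{vbar}",
}


def wrap_passthrough(text: str) -> str:
    """Wrap text in a pass:c[] macro only if it contains a special character.

    The 'c' keeps special-character substitution active, so <, > and &
    are still converted to entities in the final document.
    """
    if not any(ch in SPECIAL for ch in text):
        return text
    # Assumption: a closing bracket inside the macro is written as \]
    return "pass:c[" + text.replace("]", "\\]") + "]"


def escape_with_attributes(text: str) -> str:
    """Replace each mapped special character with an attribute reference."""
    return "".join(ATTRIBUTES.get(ch, ch) for ch in text)


print(wrap_passthrough("plain prose"))     # unchanged: plain prose
print(wrap_passthrough("a*b"))             # pass:c[a*b]
print(escape_with_attributes("2 * 3"))     # 2 {asterisk} 3
```

With only one or two special characters the attribute form stays readable inline; with many, a single passthrough around the whole run is shorter.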
So to summarize: for asciidoctor, at least, we can do
where |
@jgm no, you can't escape backslashes, which means the |
Hm, it also means that if code contains |
Just did an experiment: it looks like you can use numeric entities to escape special characters inside
output from asciidoc (original):
<p><code>&#x5b;&#x30;&#x2d;&#x39;&#x5d;&#x2a;</code></p>
output from asciidoctor:
<p><code>[0-9]*</code></p>
So that's an interesting behavior change! |
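For reference, the entity-encoded string that appears in the original asciidoc output above can be generated mechanically. A minimal sketch in Python (the function name is hypothetical), which encodes every character rather than only the special ones:

```python
def to_numeric_entities(text: str) -> str:
    """Encode each character as a lowercase hexadecimal numeric character reference."""
    return "".join(f"&#x{ord(ch):x};" for ch in text)


print(to_numeric_entities("[0-9]*"))
# prints &#x5b;&#x30;&#x2d;&#x39;&#x5d;&#x2a;
```

A real writer would probably encode only the characters that can trigger markup, to keep the source legible.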
Also, I somehow missed this (maybe it's a new addition?), but this works in asciidoctor only:
Output from asciidoctor:
Still an issue in latest pandoc:
But should see: |
@kbroch-rivosinc Please see the comments above. The suggested output you give isn't correct. Given this input, asciidoctor yields the following HTML:
If it were just a matter of backslash-escaping all |
@jgm : here's what I see from asciidoctor (sorry I should have put this in original comment):
Well yes, asciidoc(tor) will turn
into strong emphasis. But we're concerned here with how to represent literal asterisk characters. And asciidoctor turns
into
So we can't simply backslash-escape all the literal asterisks as you suggested. |
Thanks for explanation. I see above: #2337 (comment) where all this was explained. Sorry I didn't catch it the first time. I appreciate you taking the time to help. |
Pandoc version 1.15.0.6 doesn't correctly escape
asciidoc
output, which asciidoc would render back as ...
Unfortunately, the rules for escaping asciidoc special chars are complex and I cannot point to a single place in the asciidoc documentation. The general rule is that the '\' character is used to escape. So with correct quoting/escaping ...