docbook (asciidoctor) task lists #8011

Closed
arcnmx opened this issue Apr 8, 2022 · 3 comments
@arcnmx
Contributor

arcnmx commented Apr 8, 2022

Explain the problem.
asciidoctor has a peculiar way of encoding check lists:

asciidoctor -b docbook5 - <<EOF | pandoc -f docbook -t native
* [ ] a
* [x] b
EOF
[ BulletList <itemizedlist mark="none">
    [ [ Para [ Str "\10063" , Space , Str "a" ] ] <listitem><simpara>&#10063; a</simpara></listitem>
    , [ Para [ Str "\10003" , Space , Str "b" ] ] <listitem><simpara>&#10003; b</simpara></listitem>
    ] </itemizedlist>
]

Which is distinct from what pandoc generally produces/expects:

pandoc -f gfm -t native <<EOF
- [ ] a
- [x] b
EOF
[ BulletList
    [ [ Plain [ Str "\9744" , Space , Str "a" ] ]
    , [ Plain [ Str "\9746" , Space , Str "b" ] ]
    ]
]

The following transforms therefore do not work as expected:

asciidoctor -b docbook5 - < example-from-above.adoc | pandoc -f docbook -t gfm
-   ❏ a

-   ✓ b

asciidoctor -b docbook5 - < example-from-above.adoc | pandoc -f docbook -t asciidoctor
* ❏ a
* ✓ b

It's not clear to me whether this should be the writer's responsibility, or whether it would make more sense for the docbook reader to parse these.

pandoc --version
pandoc 2.17.1.1
Compiled with pandoc-types 1.22.1, texmath 0.12.4, skylighting 0.12.3, citeproc 0.6.0.1, ipynb 0.2

EDIT: also what's with the weird spacing on gfm lists after the hyphen?

@arcnmx arcnmx added the bug label Apr 8, 2022
@jgm
Owner

jgm commented Apr 8, 2022

Looks like asciidoctor is using boxes with shadows, where pandoc uses boxes without shadows.

We should be able to modify taskListItemToAscii to handle both variations.
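
To illustrate what "handling both variations" could mean, here is a minimal Python sketch. It is not pandoc's code (pandoc is written in Haskell, and `taskListItemToAscii` is only named above, not shown); the names `UNCHECKED`, `CHECKED`, `split_task_marker`, and `to_gfm_item` are hypothetical. It assumes a task item is recognized by its first character followed by a space, and it accepts both the code points pandoc emits (U+2610, U+2612) and the ones asciidoctor emits (U+274F, U+2713), as seen in the native output above.

```python
# Hypothetical sketch; names are illustrative and not taken from pandoc.
UNCHECKED = {"\u2610", "\u274f"}  # pandoc's ballot box, asciidoctor's shadowed square
CHECKED = {"\u2612", "\u2713"}    # pandoc's ballot box with X, asciidoctor's check mark


def split_task_marker(text):
    """Return (checked, rest) if text starts with a known marker plus a space.

    Returns (None, text) when the item is not a task-list item.
    """
    if len(text) >= 2 and text[1] == " ":
        if text[0] in UNCHECKED:
            return False, text[2:]
        if text[0] in CHECKED:
            return True, text[2:]
    return None, text


def to_gfm_item(text):
    """Render one list item as a GFM bullet, using [ ] / [x] for task items."""
    checked, rest = split_task_marker(text)
    if checked is None:
        return "- " + text  # ordinary item: leave the text untouched
    return "- [" + ("x" if checked else " ") + "] " + rest
```

With this, an item whose text starts with either style of box ends up as the same `[ ]` or `[x]` marker, and items without a marker pass through unchanged.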

@jgm
Owner

jgm commented Apr 9, 2022

> EDIT: also what's with the weird spacing on gfm lists after the hyphen?

Pandoc's markdown writer tries to produce markdown that will work with the widest range of processors, and some markdown processors historically have used the "4 space rule," requiring 4 space indent for sublists. Given this, and the desire that sublists align with the content of the containing list, we just start all list content on 4-space boundaries.

In principle we could override this behavior for gfm/commonmark, since in that case the list nesting behavior is well defined.
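
The difference between the two layouts can be sketched in a few lines of Python. This is only an illustration of the alignment rule described above, not the markdown writer's implementation; `render_bullets` and its `content_col` parameter are hypothetical names. The assumption is that item content starts `content_col` columns after the bullet's own indent, and that sublists are indented to line up with that content.

```python
# Hypothetical sketch of the alignment rule; not pandoc's writer code.
def render_bullets(items, content_col):
    """Render a nested bullet list as text.

    items: list of (text, children) pairs; children has the same shape.
    content_col: columns from the bullet to the start of the item content
                 (4 for the "4 space rule" layout, 2 for the compact layout).
    """
    lines = []

    def walk(nodes, indent):
        for text, children in nodes:
            marker = "-".ljust(content_col)  # "-   " for 4, "- " for 2
            lines.append(" " * indent + marker + text)
            # Sublists start where the parent's content starts.
            walk(children, indent + content_col)

    walk(items, 0)
    return "\n".join(lines)
```

Calling `render_bullets([("a", [("b", [])])], 4)` gives the `-   a` spacing asked about in the issue, with the sublist indented by four spaces; passing `2` gives `- a` with a two-space sublist indent.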

@jgm
Owner

jgm commented Apr 9, 2022

And yes, the way we currently handle this, it's the responsibility of the writer to detect these task lists and format them appropriately. (Only a few formats have task lists, so in other cases we have a nice fallback behavior.)

@jgm jgm closed this as completed in 813f3d5 Apr 9, 2022
jgm added a commit that referenced this issue Apr 9, 2022
on bullet lists.  They are now nested by 2 spaces instead of 4.

See #8011.
@jgm jgm mentioned this issue Aug 3, 2022