Typst: different typst output when writing to .typ file or to pdf with the typst engine #10320

Closed
ZoomRmc opened this issue Oct 21, 2024 · 11 comments

Comments

@ZoomRmc

ZoomRmc commented Oct 21, 2024

This one is kind of weird. I'm experiencing different typst output when writing to *.typ file or to pdf with --pdf-engine typst, the latter being broken.

After examining the temporary file produced, it seems like something is happening around the stage when the output is combined with the template file.

Here are the steps to reproduce:

  1. RST input converted to a native representation, correctly (not counting the unsupported RST Option List).
    **A**: aaa
    
      - as
    
    **B**: bbb
      - bs
      
    **C**: ccc
      - cs
      
    D D
        ddd
    
    \-E E
      eee
    
    ..
      comment
      
    -F  F
        fff
  2. Native output converted to a typst file (pandoc test.native --from=native -o test.typ), correctly.
     #strong[A];: aaa
     
     #quote(block: true)[
     - as
     ]
     
     / #strong[B];\: bbb: #block[
     - bs
     ]
     
     / #strong[C];\: ccc: #block[
     - cs
     ]
     
     / D D: #block[
     ddd
     ]
     
     / -E E: #block[
     eee
     ]
     
     / -F F: #block[
     fff
     ]
  3. The typst file gets compiled to PDF twice:
    • with typst compile test.typ test-typst.pdf, producing expected results
    • with pandoc test.typ --pdf-engine=typst test-pandoc.pdf, producing broken output
  4. The main body of the document, as written to the temporary typst file generated (examined with the tool), contains mangled typst output, which explains the incorrect rendering. The main issue is that somehow the term marker becomes text.
    #strong[A];: aaa
    
    #quote(block: true)[
    
    - as
    
    ℄~
    ]
    
    \/ #strong[B];: bbb:
    
    - bs
    
    \/ #strong[C];: ccc:
    
    - cs
    
    \/ D D: ddd
    
    \/ -E E: eee
    
    \/ -F F: fff

It was pretty surprising to see the typst code in the document body differ depending on whether it is written directly or composed with the default template and used as a temporary file.

No external templates or filters were used.

Encountered with pandoc 3.2, reproduced with 3.5 & 3.5-nightly-2024-10-21

ZoomRmc added the bug label Oct 21, 2024
@jgm
Owner

jgm commented Oct 21, 2024

Are you on Windows? I wonder if there could be an encoding issue somewhere.

Have you tried --verbose? This will print out the intermediate typst file.

@jgm
Owner

jgm commented Oct 21, 2024

For what it's worth, I can't reproduce the issue on my system (macos).
This suggests it's Windows related.

@jgm
Owner

jgm commented Oct 21, 2024

Note that when the typst document is written to the temp directory, it will be (a) UTF-8 encoded and (b) have LF rather than CRLF line endings. (b) accounts for the line ending issues you're seeing (since you're viewing this in a Windows editor) and (a) may account for the garbled text. However, I would have expected that typst would behave fine with this sort of input.

@ZoomRmc
Author

ZoomRmc commented Oct 21, 2024

Reproducible on both Windows 10 and Linux machines. Does not depend on the LF/CRLF of the input .typ file.

EDIT: there's some difference:

  • On Windows the output typst is exactly the same regardless of whether the intermediate .typ file is used.
  • On Linux the output is mangled only if using the .typ file as input. Conversion directly from native/rst works fine.

ZoomRmc changed the title from "Typst writer: different typst output when writing to .typ file or to pdf with the typst engine" to "Typst: different typst output when writing to .typ file or to pdf with the typst engine" Oct 21, 2024
@jgm
Owner

jgm commented Oct 22, 2024

OK, somehow I'd missed the fact that you're using pandoc to convert typst to typst (probably because this is an odd thing to do). You may not realize that you're doing this, but when you do

pandoc test.typ --pdf-engine=typst test-pandoc.pdf

pandoc will

(a) parse test.typ to a pandoc AST
(b) then convert that AST back to a typst document
(c) and finally compile it using typst
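
Those three steps can be approximated by hand. The following is a minimal Python sketch, not what pandoc literally runs: the helper name `pipeline_commands` and the output file names are made up for illustration, and pandoc itself performs steps (b) and (c) internally with its default template and a temporary file.

```python
def pipeline_commands(src, typ_out="roundtrip.typ", pdf_out="roundtrip.pdf"):
    """Return argv lists approximating steps (a), (b) and (c).

    Hypothetical helper; the file names are placeholders.
    """
    return [
        # (a) parse the typst source and dump the pandoc AST for inspection
        ["pandoc", "--from=typst", "--to=native", src],
        # (b) write the AST back out as a standalone typst document
        ["pandoc", "--from=typst", "--to=typst", "--standalone", src, "-o", typ_out],
        # (c) compile the regenerated typst file to PDF
        ["typst", "compile", typ_out, pdf_out],
    ]


for cmd in pipeline_commands("test.typ"):
    print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)`. Comparing the file written in step (b) with the original test.typ shows whether the reader/writer round trip changed the markup.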

So this really doesn't have anything to do with PDF. You can reproduce the issue using

% pandoc -f typst -t typst
#strong[A];: aaa

#quote(block: true)[
- as
]

/ #strong[B];\: bbb: #block[
- bs
]

/ #strong[C];\: ccc: #block[
- cs
]

/ D D: #block[
ddd
]

/ -E E: #block[
eee
]

/ -F F: #block[
fff
]
^D
#strong[A];: aaa

#quote(block: true)[

- as

℄~
]

\/ #strong[B];: bbb:

- bs

\/ #strong[C];: ccc:

- cs

\/ D D: ddd

\/ -E E: eee

\/ -F F: fff

@jgm
Owner

jgm commented Oct 22, 2024

Apparently it's an issue with the typst reader, which parses the typst document to

[ Para [ Strong [ Str "A" ] , Str ":" , Space , Str "aaa" ]
, BlockQuote
    [ Para []
    , BulletList [ [ Para [ Str "as" ] ] ]
    , Para [ Str "\8452\160" ]
    ]
, Para
    [ Str "/"
    , Space
    , Strong [ Str "B" ]
    , Str ":"
    , Space
    , Str "bbb:"
    ]
, Para []
, BulletList [ [ Para [ Str "bs" ] ] ]
, Para
    [ Str "/"
    , Space
    , Strong [ Str "C" ]
    , Str ":"
    , Space
    , Str "ccc:"
    ]
, Para []
, BulletList [ [ Para [ Str "cs" ] ] ]
, Para
    [ Str "/"
    , Space
    , Str "D"
    , Space
    , Str "D:"
    , SoftBreak
    , Str "ddd"
    ]
, Para
    [ Str "/"
    , Space
    , Str "-E"
    , Space
    , Str "E:"
    , SoftBreak
    , Str "eee"
    ]
, Para
    [ Str "/"
    , Space
    , Str "-F"
    , Space
    , Str "F:"
    , SoftBreak
    , Str "fff"
    ]
]
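
The `Str "\8452\160"` inside the BlockQuote uses Haskell's decimal character escapes. As a small aid for reading the AST, this Python sketch decodes the two code points (the helper name `describe` is made up for this example):

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


# Haskell's "\8452\160" is the two characters with decimal codes 8452 and 160.
stray = chr(8452) + chr(160)
for code_point, name in describe(stray):
    print(code_point, name)
```

The second character is a no-break space, which Typst markup writes as `~`, so this string appears to be the source of the stray `℄~` line in the round-tripped output.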

@jgm
Owner

jgm commented Oct 22, 2024

What's weird is that typst-hs, which does the main work in the typst reader, seems to parse this correctly. For

#quote(block: true)[
- as
]

it produces

--- repr ---
document(body: { quote(block: true, 
                       body: { text(body: [
]), 
                               list(children: ({ text(body: [as]), 
                                                 parbreak() })) }), 
                 parbreak() })

jgm added a commit that referenced this issue Oct 22, 2024
This affects attributions in quote blocks.
See #10320.
jgm added a commit that referenced this issue Oct 22, 2024
If attribution is not present, don't print the `--`.

See #10320.
@jgm
Owner

jgm commented Oct 22, 2024

I fixed a couple small issues. There remains the issue of the failed term list parsing. Simple example:

/ B: #block[
- bs
]

This is an issue in jgm/typst-hs, which isn't recognizing this as a term list.

@jgm
Owner

jgm commented Oct 22, 2024

OK, I see what is going on. typst-hs takes the typst documentation literally: "When the descriptions span over multiple lines, they use hanging indent to communicate the visual hierarchy." It requires a hanging indent, which isn't present in this case.
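
To make the strict reading concrete, here is a toy Python model of a "hanging indent required" rule. It is only an illustration of the behaviour described above, not the typst-hs implementation, and the function name `accepts_term_item` is invented for this sketch.

```python
def accepts_term_item(lines):
    """Toy check for a '/ Term: description' item under a strict rule:
    every non-blank continuation line must start with whitespace.

    Hypothetical; this does not reproduce typst-hs's parser.
    """
    if not lines:
        return False
    first, rest = lines[0], lines[1:]
    if not first.startswith("/ ") or ":" not in first:
        return False
    # Strict reading: continuation lines need a hanging indent.
    return all(line[:1] in (" ", "\t") for line in rest if line.strip())


flat = ["/ B: #block[", "- bs", "]"]          # as emitted by the writer
indented = ["/ B: #block[", "  - bs", "  ]"]  # same item with a hanging indent

print(accepts_term_item(flat))      # False: continuation lines are not indented
print(accepts_term_item(indented))  # True
```

Under such a rule the writer's unindented output falls through to ordinary paragraph parsing, which matches the `Str "/"` paragraphs in the AST shown earlier.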

jgm closed this as completed in 3522025 Oct 22, 2024
@jgm
Owner

jgm commented Oct 22, 2024

OK, that was three bugs for the price of one, but everything should work now.

@ZoomRmc
Author

ZoomRmc commented Oct 22, 2024

Thanks a lot and sorry for the messy report.

Yeah, I understand how it all works more or less. There was confusion when I initially stumbled upon this bug, so I just tried to determine the point at which the bug occurs (unsuccessfully though). What threw me off is that the bug showed itself even when the reader wasn't supposed to be used at all (based on my understanding at the time) and the program clearly could generate valid typst.

native -> pdf bug
native -> typ ok
typ -> pdf bug
