CSL JSON author fields with style tags is mishandled #7603

Open
jblachly opened this issue Oct 1, 2021 · 6 comments

jblachly commented Oct 1, 2021

The manual states that HTML-like markup tags, including <b> (bold), <i> (italic) and a few others, may be used in CSL JSON items: https://pandoc.org/MANUAL.html#specifying-bibliographic-data . This is consistent with the CSL specification.

However, pandoc mishandles this in at least two different ways, AFAICT.

Problem 1.

When converting from CSL JSON to any other format, the HTML tags are escaped, so they appear literally in the output.

Example:

[
	{
		"id": "http://zotero.org/users/193922/items/4QRBNRNF",
		"type": "webpage",
		"title": "dhtslib",
		"URL": "https://github.com/blachlylab/dhtslib/",
		"version": "0.12.0",
		"author": [
			{
				"family": "Gregory",
				"given": "Charles Thomas"
			},
			{
				"family": "<b>Blachly</b>",
				"given": "James S"
			}
		]
	}
]

Command: pandoc --citeproc -s -f csljson csl_min.json -o csl.md

---
nocite: "[@*]"
references:
- author:
  - family: Gregory
    given: Charles Thomas
  - family: \<b>Blachly\</b>
    given: James S
  id: "http://zotero.org/users/193922/items/4QRBNRNF"
  title: dhtslib
  type: webpage
  URL: "https://github.com/blachlylab/dhtslib/"
  version: 0.12.0
---

When rendering to, say, PDF, the literal <b> is inlined because of the escape character which you can see in the Markdown output, above.
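For comparison, the behaviour the manual describes would turn the tagged string into formatted inline content instead of escaped text. The following is only a minimal sketch of that interpretation, not pandoc's or citeproc's actual code. The function name `parse_markup`, the tuple representation, and the reduced tag set are assumptions made for illustration; the node names are borrowed from pandoc's AST.

```python
import re

# Assumed subset of the HTML-like tags; node names mirror pandoc's AST.
TAG_TO_NODE = {"b": "Strong", "i": "Emph", "sub": "Subscript",
               "sup": "Superscript", "sc": "SmallCaps"}

TOKEN = re.compile(r"<(/?)(b|i|sub|sup|sc)>")


def parse_markup(text):
    """Turn a string such as '<b>Blachly</b>' into a nested node list."""
    root = []
    stack = [(None, root)]            # (open tag name, children list)
    pos = 0
    for match in TOKEN.finditer(text):
        if match.start() > pos:       # plain text before this tag
            stack[-1][1].append(("Str", text[pos:match.start()]))
        closing, tag = match.group(1) == "/", match.group(2)
        if not closing:
            children = []
            stack[-1][1].append((TAG_TO_NODE[tag], children))
            stack.append((tag, children))
        elif stack[-1][0] == tag:
            stack.pop()
        else:
            # Unbalanced closing tag: keep it as literal text.
            stack[-1][1].append(("Str", match.group(0)))
        pos = match.end()
    if pos < len(text):               # trailing plain text
        stack[-1][1].append(("Str", text[pos:]))
    return root


print(parse_markup("<b>Blachly</b>"))
# → [('Strong', [('Str', 'Blachly')])]
```

With handling along these lines, the family name in the example would reach the writer as bold text, and no `\<b>` would show up in the Markdown output.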

(Side-note on a separate bug: when the output target is markdown, the standalone, nocite * bibliography isn't actually rendered, only the metadata header. In any other target format (PDF, HTML, etc.) the bibliography is rendered.)

Problem 2.

If one were to, say, edit the markdown to remove the escape characters, pandoc STILL strips out the HTML tag when rendering.

$ sed -e "s/\\\</\</g;" csl.md

The rendered PDF still does not bold the author name.
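The same unescaping step can be written in Python, which avoids the shell quoting around the backslashes. This is a small sketch only; the helper name `unescape_angle_brackets` is hypothetical, and it handles only the `\<` sequence seen in the Markdown output above.

```python
def unescape_angle_brackets(markdown_text):
    """Remove a backslash that directly precedes '<'."""
    return markdown_text.replace("\\<", "<")


line = "  - family: \\<b>Blachly\\</b>"
print(unescape_angle_brackets(line))
# →   - family: <b>Blachly</b>
```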

Thanks in advance for taking a look at this, and thanks as always for Pandoc

jsbmbp13:~$ pandoc --version
pandoc 2.14.2
Compiled with pandoc-types 1.22, texmath 0.12.3.1, skylighting 0.11,
citeproc 0.5, ipynb 0.1.0.1
User data directory: /Users/james/.local/share/pandoc
Copyright (C) 2006-2021 John MacFarlane. Web:  https://pandoc.org
This is free software; see the source for copying conditions. There is no
warranty, not even for merchantability or fitness for a particular purpose.
jsbmbp13:~$ uname -a
Darwin jsbmbp13.local 18.7.0 Darwin Kernel Version 18.7.0: Sun Dec  1 18:59:03 PST 2019; root:xnu-4903.278.19~1/RELEASE_X86_64 x86_64
@jblachly jblachly added the bug label Oct 1, 2021
@jblachly jblachly changed the title CSL JSON with style tags is mishandled CSL JSON author fields with style tags is mishandled Oct 2, 2021
Author

jblachly commented Oct 2, 2021

Clarified the title to indicate that style tags (<b>, <sup>, etc.) are mishandled in author name-part fields. They appear to work correctly in, e.g., title fields.

Owner

jgm commented Oct 2, 2021

> (Side-note on a separate bug: when the output target is markdown, the standalone, nocite * bibliography isn't actually rendered, only the metadata header. In any other target format (PDF, HTML, etc.) the bibliography is rendered.)

Not a bug: if you want a rendered bibliography in markdown, use markdown-citations as the output format (for example, -t markdown-citations).
Otherwise pandoc will assume you're rendering a markdown dialect that handles citations the way pandoc does, and hence doesn't need the rendered bibliography in the source.

Owner

jgm commented Oct 2, 2021

See jgm/citeproc#63 for the root issue.

Owner

jgm commented Oct 2, 2021

I would think that normally you'd want to handle this sort of thing in the style, rather than by encoding the boldface in the bibliography itself.

Author

jblachly commented Oct 2, 2021

> I would think that normally you'd want to handle this sort of thing in the style, rather than by encoding the boldface in the bibliography itself.

In CSL, <choose> blocks can't operate on author name parts. I need selective formatting for my use case: my CV requires my name to be bolded.

Author

jblachly commented Oct 2, 2021

> See jgm/citeproc#63 for the root issue.

Regarding the root issue, I am able to use a Lua filter to replace e.g. author.family with MetaInlines [Strong [Str "text"]], although this apparently only rewrites the metadata after the bibliography has been rendered; I'm still working on figuring this part out. But the AST correctly demonstrates that name parts can hold MetaInlines-formatted text =)
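The metadata rewrite described here can be sketched against pandoc's JSON AST. This is an illustration in Python rather than the Lua filter mentioned in the comment, which was not posted. The function names `plain_text` and `bold_family_name` are hypothetical, and the sketch assumes the references sit in the `references` metadata field with each `family` value stored as `MetaInlines`. Whether citeproc keeps the formatting when rendering names is the open question in jgm/citeproc#63.

```python
def plain_text(inlines):
    """Flatten Str and Space inlines to a plain string (other nodes ignored)."""
    parts = []
    for node in inlines:
        if node["t"] == "Str":
            parts.append(node["c"])
        elif node["t"] == "Space":
            parts.append(" ")
    return "".join(parts)


def bold_family_name(meta, family):
    """Wrap matching author.family values in Strong, modifying meta in place."""
    refs = meta.get("references", {})
    if refs.get("t") != "MetaList":
        return meta
    for ref in refs["c"]:
        if ref.get("t") != "MetaMap":
            continue
        authors = ref["c"].get("author", {})
        if authors.get("t") != "MetaList":
            continue
        for author in authors["c"]:
            if author.get("t") != "MetaMap":
                continue
            fam = author["c"].get("family")
            if (fam and fam.get("t") == "MetaInlines"
                    and plain_text(fam["c"]) == family):
                # Equivalent to MetaInlines [Strong [Str "..."]]
                fam["c"] = [{"t": "Strong", "c": fam["c"]}]
    return meta
```

To try it as a JSON filter, the function would be applied to the `meta` object of the document pandoc passes on stdin, and the result written to stdout. As far as I know, pandoc applies filters and `--citeproc` in the order they appear on the command line, so the filter would need to come before `--citeproc` for the rewritten metadata to be seen during citation processing.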

@mb21 mb21 added citeproc and removed bug labels Nov 19, 2021