Support Zotero citations in docx #7840

Closed
tarleb opened this issue Jan 17, 2022 · 9 comments

@tarleb
Collaborator

tarleb commented Jan 17, 2022

Sample code:

        <w:p w14:paraId="2C4DA6EC" w14:textId="293040C8" w:rsidR="009E522F" w:rsidRDefault="002E5F17">
            <w:r>
                <w:fldChar w:fldCharType="begin"/>
            </w:r>
            <w:r>
                <w:instrText xml:space="preserve"> ADDIN ZOTERO_ITEM CSL_CITATION {"citationID":"AQwSemPs","properties":{"formattedCitation":"(Hawking, 2010)","plainCitation":"(Hawking, 2010)","noteIndex":0},"citationItems":[{"id":46,"uris":["http://zotero.org/users/40613/items/EAG35HWU"],"uri":["http://zotero.org/users/40613/items/EAG35HWU"],"itemData":{"id":46,"type":"article-journal","title":"Test article one","author":[{"family":"Hawking","given":"Stephen"}],"issued":{"date-parts":[["2010"]]}}}],"schema":"https://github.com/citation-style-language/schema/raw/master/csl-citation.json"} </w:instrText>
            </w:r>
            <w:r>
                <w:fldChar w:fldCharType="separate"/>
            </w:r>
            <w:r w:rsidRPr="002E5F17">
                <w:rPr>
                    <w:rFonts w:ascii="Calibri" w:hAnsi="Calibri" w:cs="Calibri"/>
                </w:rPr>
                <w:t>(Hawking, 2010)</w:t>
            </w:r>
            <w:r>
                <w:fldChar w:fldCharType="end"/>
            </w:r>
        </w:p>

My understanding is that the XML element comes with a full CSL JSON entry, so we can parse that.
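A minimal sketch of that parsing step, written in Python purely for illustration (it is not pandoc's reader code). It assumes the field instruction always begins with the `ADDIN ZOTERO_ITEM CSL_CITATION` prefix seen in the sample; `parse_zotero_field` is a hypothetical helper name, and the sample string is an abbreviated copy of the instruction above.

```python
import json

# Prefix that precedes the JSON payload in the sample field instruction.
ZOTERO_PREFIX = "ADDIN ZOTERO_ITEM CSL_CITATION"


def parse_zotero_field(instr_text):
    """Return the CSL citation object embedded in a field instruction,
    or None if the instruction is not a Zotero citation field."""
    text = instr_text.strip()
    if not text.startswith(ZOTERO_PREFIX):
        return None
    # json.loads tolerates the whitespace left between prefix and payload.
    return json.loads(text[len(ZOTERO_PREFIX):])


# Abbreviated version of the <w:instrText> content from the sample.
sample = (
    ' ADDIN ZOTERO_ITEM CSL_CITATION {"citationID":"AQwSemPs",'
    '"properties":{"formattedCitation":"(Hawking, 2010)"},'
    '"citationItems":[{"id":46,"itemData":{"id":46,'
    '"title":"Test article one"}}]} '
)
citation = parse_zotero_field(sample)
print(citation["properties"]["formattedCitation"])
```

Any other field instruction (a page number field, for example) makes the helper return `None`, so a caller could fall back to the plain text between the `separate` and `end` markers.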

@jgm
Owner

jgm commented Jan 17, 2022

So is the idea to parse this from docx into a native pandoc Cite inline?
The fallback text could be the formattedCitation part of the JSON.
In addition, the bibliographic information would have to be extracted, converted, and added to references in metadata.
Good idea, I think, and it should be fairly straightforward.

@jgm jgm changed the title from "Support Zotero citations in docs" to "Support Zotero citations in docx" Jan 17, 2022
@jgm
Owner

jgm commented Jan 17, 2022

Can you upload a complete document containing these, for testing?

@frederik

Hi @jgm, here's a DOCX with a number of references (including prefixes etc.) for testing. I included the reference list generated by Zotero which looks like it would not be needed if we have the original in-text citations.

zotero-citations.docx

jgm added a commit that referenced this issue Jan 19, 2022
So far this just adds a constructor for FieldInfo;
we'll need to adjust the rest of the reader code to
parse the JSON and do something with it.

See #7840.
@jooyoungseo

Will there be a more detailed explanation of this change in the user guide? As far as I understand, Zotero fields are now automatically converted into @citation_key and its corresponding bibliography entries in the docx-to-md conversion. Is that correct? Please correct me if I am wrong.

@jgm
Owner

jgm commented Feb 2, 2022

Not yet. We've only implemented the skeleton for this so far.

@jgm
Owner

jgm commented Feb 2, 2022

Can someone upload a sample docx using these Zotero fields? It would be good if it demonstrated the following:

  • multiple citations that include the same item, e.g. (Smith 2000; Jones 1999) and later (Jones 1999)
  • locators, e.g. (Smith 2000, ch. 3), and prefixes/suffixes if these are an option

@frederik

frederik commented Feb 2, 2022

Adding an example:

(Jones, 1999; Smith, 2000) and  the same book again (Jones, 1999)
(see Smith, 2000, Chapter 22 and others)

The first line creates two citations: one with two citation items (Jones and Smith) and one with a single citation item (Jones). The item data for the reused book is identical in both citations (the id is 273 in both cases).

The second line contains the item data for Smith again (same id, 272, as in the first citation). The chapter locator is added to the citation item.

zotero-citations-2.docx

If you need anything else (or would like a discussion of the JSON structures), I'd be happy to help.
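Because the same item data is repeated in every citation that uses an item, a reader that collects references would presumably want to keep one copy per id. A rough Python sketch of that idea follows; it is an illustration only, `collect_references` is a hypothetical name, and the two citation objects are hand-written stand-ins (the titles are placeholders, only the ids 272 and 273 come from the description above).

```python
def collect_references(citations):
    """Gather itemData from every citation item, keeping the first copy per id."""
    references = {}
    for citation in citations:
        for item in citation.get("citationItems", []):
            data = item.get("itemData")
            if data is not None:
                # setdefault ignores later copies of an id already seen.
                references.setdefault(data["id"], data)
    return list(references.values())


# Stand-ins for "(Jones, 1999; Smith, 2000)" and the later "(Jones, 1999)".
first = {"citationItems": [
    {"id": 273, "itemData": {"id": 273, "title": "placeholder Jones title"}},
    {"id": 272, "itemData": {"id": 272, "title": "placeholder Smith title"}},
]}
second = {"citationItems": [
    {"id": 273, "itemData": {"id": 273, "title": "placeholder Jones title"}},
]}

refs = collect_references([first, second])
print([ref["id"] for ref in refs])
```

Keying on the embedded `id` assumes ids are unique within one document, which matches the sample but is not something the thread confirms in general.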

@jgm
Owner

jgm commented Feb 3, 2022

Here's a formatted version of the embedded JSON in the above example:

{
  "citationID": "AQwSemPs",
  "properties": {
    "formattedCitation": "(Hawking, 2010)",
    "plainCitation": "(Hawking, 2010)",
    "noteIndex": 0
  },
  "citationItems": [
    {
      "id": 46,
      "uris": [
        "http://zotero.org/users/40613/items/EAG35HWU"
      ],
      "uri": [
        "http://zotero.org/users/40613/items/EAG35HWU"
      ],
      "itemData": {
        "id": 46,
        "type": "article-journal",
        "title": "Test article one",
        "author": [
          {
            "family": "Hawking",
            "given": "Stephen"
          }
        ],
        "issued": {
          "date-parts": [
            [
              "2010"
            ]
          ]
        }
      }
    }
  ],
  "schema": "https://github.com/citation-style-language/schema/raw/master/csl-citation.json"
}

@jgm
Owner

jgm commented Feb 3, 2022

We'll need to modify the CitationItem type in citeproc to allow this kind of embedded itemData:

itemData :: Reference a

After that, we can simply use the FromJSON instance for Citeproc.Citation to parse this. And then we'll need to (a) convert this to a Pandoc.Citation and (b) extract the embedded Reference and put it in state, so it can be added to the references in Metadata.
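To make steps (a) and (b) concrete, here is a loose Python sketch; the actual work would be done in Haskell with citeproc's types, so this only mirrors the data flow. The `citationId`, `citationPrefix` and `citationSuffix` keys imitate pandoc's Citation fields, while the `item-<id>` key scheme, the separate `locator` key, and the function name `to_cite` are assumptions made for illustration. The `prefix`, `suffix` and `locator` inputs are optional citation-item fields in the CSL citation schema.

```python
def to_cite(citation):
    """Split a parsed CSL citation into Cite-like data and embedded references."""
    cites = []
    references = []
    for item in citation["citationItems"]:
        cites.append({
            # Hypothetical id scheme; the real reader may derive keys differently.
            "citationId": "item-" + str(item["id"]),
            "citationPrefix": item.get("prefix", ""),
            "citationSuffix": item.get("suffix", ""),
            "locator": item.get("locator"),
        })
        if "itemData" in item:
            references.append(item["itemData"])  # step (b): goes into state
    # Fallback text for the Cite element, as suggested earlier in the thread.
    fallback = citation.get("properties", {}).get("formattedCitation", "")
    return {"citations": cites, "fallback": fallback}, references


# Abbreviated form of the formatted JSON shown above.
parsed = {
    "citationID": "AQwSemPs",
    "properties": {"formattedCitation": "(Hawking, 2010)"},
    "citationItems": [
        {"id": 46, "itemData": {"id": 46, "title": "Test article one"}}
    ],
}
cite, refs = to_cite(parsed)
print(cite["fallback"], len(refs))
```

Returning the references separately keeps the conversion free of side effects; the caller decides how to accumulate them before writing them to metadata.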

jgm added a commit that referenced this issue Feb 3, 2022
- Add docxReferences to state, so we can accumulate
  references for metadata.
- Add a clause for ZoteroItem to parPartToInlines'.
  So far it doesn't do anything except add a surrounding Cite element.

See #7840.
jgm added a commit that referenced this issue Feb 4, 2022
This gives us what we need for #7840, except adding
to the references in metadata.
@jgm jgm closed this as completed in 60caa0a Feb 4, 2022
jgm added a commit that referenced this issue Feb 4, 2022
These are supported in the same way as Zotero citations,
using the same code.  As with Zotero, enable the `citations`
extension on `docx` to parse these as native citations.

Closes #7840.