More flexible citation key syntax #6026 #6373

Closed
wants to merge 4 commits into from

Aver1y
Contributor

@Aver1y Aver1y commented May 17, 2020

This implements the @{citekey} syntax. It also allows backslash escapes: \ can escape both } and \ inside {}, and any otherwise illegal character in the existing syntax.

This does not address all of #6026. I still want to implement the [@citekey]: citekey syntax.

TODO:

  • Update documentation to reflect new behaviour of citekey parser regarding newlines.
  • Add tests?

@jgm
Owner

jgm commented May 17, 2020

I'm not sure about the idea of allowing escapes in regular citations (without the {...}).
This changes current behavior in a way that might have some bad consequences.
For example, currently @doe\-1999 will create a citation to doe followed by the string -1999, whereas @doe-1999 will create a citation to doe-1999. I think that if we have the braces syntax, we don't need the escapes. And in braces, only braces and \\ need to be escapable. That keeps things simple.

@Aver1y Aver1y force-pushed the citation-key-syntax branch from d39951e to 3b8d38b Compare May 17, 2020 13:52
@Aver1y
Contributor Author

Aver1y commented May 17, 2020

Changed.

Comment on lines +1478 to +1484
    <|> char '{' *>
          many `id` choice
            [ noneOf "}\n\\"
            , ' ' <$ newline <* skipSpaces <* notFollowedBy newline
            , char '\\' *> anyChar
            ]
        <* char '}'
Contributor Author

> As for newlines, I think it would make sense to allow newlines only if not followed by a blankline (there's a parser for this

I think you mean spnl? That one is in Readers/Markdown.hs, so not available here unless we move it.

Contributor Author

I wonder whether it would be more consistent to parse any run of multiple spaces as a single space. The way it is now, I think it may be especially confusing with trailing spaces. If one wants multiple spaces, they can still escape them.

Owner

Yes, we could collapse consecutive spaces; that might make sense.
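
To make the whitespace question concrete, here is a minimal sketch of the braced-key rules discussed in this review thread. It uses plain recursion over a String rather than the Parsec combinators from the diff, so it is not the PR's code. The name `bracedKey` is hypothetical, and collapsing every whitespace run (including a single line break) into one space is an assumption based on the suggestion above, not the behaviour of the diff as submitted.

```haskell
import Data.Char (isSpace)

-- | Hypothetical helper: parse the text that follows "@{" up to the
-- closing '}'. Returns the key and the remaining input, or Nothing if
-- the key is unterminated or contains a blank line.
bracedKey :: String -> Maybe (String, String)
bracedKey = go
  where
    -- An unescaped '}' ends the key.
    go ('}' : rest) = Just ("", rest)
    -- A backslash escapes any single character, as in the diff.
    go ('\\' : c : rest) = cons c (go rest)
    -- Assumption: a whitespace run collapses to one space, unless it
    -- contains more than one newline (i.e. a blank line).
    go s@(c : _)
      | isSpace c =
          let (ws, rest) = span isSpace s
          in if length (filter (== '\n') ws) > 1
               then Nothing
               else cons ' ' (go rest)
    -- Any other character is kept as-is.
    go (c : rest) = cons c (go rest)
    -- End of input before the closing brace.
    go [] = Nothing

    cons c = fmap (\(key, rest) -> (c : key, rest))
```

Under these assumptions an escaped space (`\ `) is kept literally, so a key that really needs two consecutive spaces can still be written, while accidental trailing or doubled spaces do not change the key.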

@Aver1y Aver1y marked this pull request as draft May 21, 2020 22:36
@brainchild0

brainchild0 commented May 29, 2020

I suggest an alternative approach for handling curly braces within a key, which is to parse nested brace or parenthesis pairs. Thus @{a{b}c(d)e[f]g<h>i} would correspond to key a{b}c(d)e[f]g<h>i, because only the final brace matches the first. Such rules would handle all cases but a few pathological ones, which I believe are currently only hypothetical. It would also spare users the difficulty of writing escape sequences, which may be long and would often be inserted by cut-and-paste.

If a later need arises to support the remaining cases, then I would suggest doing so through a slightly more explicit form, such as @{"pathological\}case"}. Perhaps this support might be postponed.
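
The matched-brace idea can be sketched as a depth counter. This is only an illustration under stated assumptions: `balancedKey` is a hypothetical name, only curly braces are counted (other bracket types are treated as ordinary characters), and there is no escape mechanism at all.

```haskell
-- | Hypothetical helper: parse the text that follows "@{", treating
-- nested {...} pairs as part of the key. The key ends at the first
-- '}' seen at nesting depth zero.
balancedKey :: String -> Maybe (String, String)
balancedKey = go (0 :: Int)
  where
    -- Input ended before the opening brace was matched.
    go _ [] = Nothing
    go depth (c : rest)
      | c == '}' && depth == 0 = Just ("", rest)          -- closes "@{"
      | c == '}' = keep c (go (depth - 1) rest)           -- closes a nested brace
      | c == '{' = keep c (go (depth + 1) rest)           -- opens a nested brace
      | otherwise = keep c (go depth rest)                -- ordinary character

    keep c = fmap (\(key, rest) -> (c : key, rest))
```

With this sketch, the input `a{b}c(d)e[f]g<h>i}` yields the key `a{b}c(d)e[f]g<h>i`, while an input whose braces never balance yields `Nothing`. A key containing a lone unmatched brace cannot be expressed, which is the pathological case mentioned above.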

@Aver1y
Contributor Author

Aver1y commented May 29, 2020

@brainchild0 I see the point of matching braces. It is probably more common for braces to appear paired than on their own. For other kinds of brackets I don't see the benefit.

@brainchild0

brainchild0 commented May 30, 2020

Yes, you may be right that in this case matching other braces is unnecessary, as long as they are protected from the processing that would occur in a regular context. Thus the first ] in [@{protect]me}] should probably not close the [. However, fully applying the same rules for all brace types creates a convention that robustly generalizes to other scenarios, which later may be useful considering the many separate meanings given to only a few brace types during processing depending on context. It also reduces ambiguity because the above form may then be considered illegal and caused to generate a warning.

@dhimmel

dhimmel commented Jun 17, 2020

An additional benefit of this pull request might be that it provides a way to terminate a citekey without a space or punctuation. Specifically, what will @{happen}here? Am I correct that prior to this PR there is no way to have a citekey happen immediately followed by here? And that this PR enables this?

Where this would come in handy is for filters like pandoc-fignos by @tomduck. Currently, to have a reference like "Figure 1A" with pandoc-fignos, the user has to write Figure {@fig:id}A. I think this PR would enable Figure @{fig:id}A without any special support by the filter.
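
The termination question can be illustrated with a small sketch that splits the text after `@` into a key and whatever follows it. Everything here is an assumption for illustration: `citeKeyAndRest` is a hypothetical name, the bare-key rule is deliberately simplified (a run of alphanumerics plus `_`, `:` and `-`) and is not pandoc's actual rule, and the braced form ignores nesting and escapes.

```haskell
import Data.Char (isAlphaNum)

-- | Hypothetical helper: split the text following '@' into
-- (key, trailing text).
citeKeyAndRest :: String -> Maybe (String, String)
-- Braced form: the key ends explicitly at the first '}'.
citeKeyAndRest ('{' : s) =
  case break (== '}') s of
    (key, '}' : rest) | not (null key) -> Just (key, rest)
    _ -> Nothing
-- Bare form (simplified assumption): the key is the longest run of
-- key characters, so adjacent letters are swallowed into the key.
citeKeyAndRest s =
  case span isBareChar s of
    ("", _) -> Nothing
    (key, rest) -> Just (key, rest)
  where
    isBareChar c = isAlphaNum c || c `elem` "_:-"
```

Under these simplified rules, `fig:idA` is read as the single key `fig:idA`, whereas `{fig:id}A` gives the key `fig:id` followed by the text `A`, which is the case a filter such as pandoc-fignos would care about.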

@dhimmel

dhimmel commented May 14, 2021

Looks like 3f09f53 added this feature.

@jgm
Owner

jgm commented May 14, 2021

Oh, I'm sorry -- I'd completely forgotten about this PR!
Note that in my implementation, I don't allow backslash escapes at all. Braces are allowed, but only if matched.

@jgm jgm closed this May 14, 2021
4 participants