Links with slashes capturing trailing semicolons #98

Closed
georgediaz88 opened this issue Sep 30, 2021 · 4 comments

@georgediaz88

Hi,

Using the demo site, I noticed that links ending in a slash capture a trailing semicolon. Trailing periods are excluded as I expected, but a trailing semicolon after a slash is not. Here are some examples:

  1. Trailing period after slash. Works as expected:
www.google.com/.

=> www.google.com/
  2. Trailing semicolon, no slash. Works as expected:
www.google.com;

=> www.google.com
  3. Semicolon after slash. Doesn't work as expected:
www.google.com/;

=> www.google.com/;

(I would've expected it to capture like `www.google.com/`)

Is the last example above by design? Some academic disciplines separate a list of URLs with semicolons, and that's the behavior we're trying to match with this plugin.

Your help would be greatly appreciated.

Thanks!
George

@puzrin
Member

puzrin commented Sep 30, 2021

Is the last example above by design? There are some academic disciplines that separate a list of urls with semicolons. So, that's the behavior we're trying to match with this plugin.

Heuristic algorithms cannot guarantee the right result 100% of the time.

`;` is a valid character in a URL. I see two alternatives:

  • Patch the regexps in your own clone as needed, at some risk of breaking other cases.
  • Provide a formal description of how to decide when `;` should not be part of a URL, without side effects.

Also, please provide real-world examples where `;` is used as you describe (several sample documents). That should help in devising a proper rule.

PS. At first glance, "`;` cannot be part of a link if it is followed by a space" may work.
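
The rule suggested in the PS can be sketched in a few lines. This is only an illustration under simplifying assumptions, not the library's actual implementation: the `CANDIDATE` pattern and the `find_links` helper are hypothetical, and a candidate link is assumed to be any run of non-whitespace characters starting with `http://`, `https://`, or `www.`. Because a candidate ends at whitespace or at the end of the text, a `;` in its last position is by construction one that is "followed by a space".

```python
import re

# Hypothetical, deliberately loose candidate pattern: a scheme or "www."
# followed by any run of non-whitespace characters.
CANDIDATE = re.compile(r"(?:https?://|www\.)\S+")

# Punctuation that is dropped only when it sits at the very end of a candidate,
# i.e. directly before whitespace or the end of the text.
TRAILING = re.compile(r"[;.,]+$")


def find_links(text: str) -> list[str]:
    """Return candidate links with trailing separators removed."""
    return [TRAILING.sub("", m.group(0)) for m in CANDIDATE.finditer(text)]


print(find_links("www.google.com/; www.google.com/."))
print(find_links("See https://example.com/a;b=1 for details"))
```

A `;` in the middle of a URL (as in `a;b=1`) is kept, since it is not followed by whitespace. The side effect to watch for is a URL that legitimately ends in `;`, which this rule would shorten.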

@georgediaz88
Author

Hi @puzrin,

Thanks for your quick response.

Regarding:

Try to provide formal description how to decide when ; should not be part of url, without side-effects.

It could work exactly as you suggested: "; can not be part of link, if followed with space".

Here are some real world examples:

See Nathan Bomey & Marco della Cava, Sexual Harassment Went Unchecked for Decades as Payouts Silenced Accusers, USA Today (Dec. 1, 2017), https://www.usatoday.com/story/money/business/2017/12/01/sexual-harassment-went-unchecked-decades-payouts-silenced-accusers/881070001/; Lyn Yonack, Sexual Assault Is About Power: How #MeToo Campaign Is Restoring Power to Victims, Psychol. Today (Nov. 14, 2017), https://www.psychologytoday.com/us/blog/psychoanalysis-unplugged/201711/sexual-assault-is-about-power.

See Carl Hulse, Political Polarization Takes Hold of the Supreme Court, N.Y. Times (July 5, 2018), https://www.nytimes.com/2018/07/05/us/politics/political-polarization-supreme-court.html; Kevin Schaul & Kevin Uhrmacher, Analysis: How Trump Is Shifting the Most Important Courts in the Country, Wash. Post (Sept. 4, 2018),

The role that per curiam decisions do and should play, particularly when the Court does not speak with a unified voice, is quite interesting and a topic for further exploration. See https://www.theatlantic.com/ideas/archive/2018/06/the-court-slices-a-narrow-ruling-out-of-masterpiece-cakeshop/561986/; Ira P. Robbins, Hiding Behind the Cloak of Invisibility: The Supreme Court and Per Curiam Opinions, 86

See David Orr, Poets, Academia: A Couplet in Conflict, N.Y. Times (May 30, 2009), https://www.nytimes.com/2009/05/31/weekinreview/31orr.html; Steven L. Winter, Death Is the Mother of Metaphor, 105 Harv. L. Rev. 745, 749--50 (1992) (describing poet Wallace Stevens's relationship to the study of law and legal language).

You'll also see that GitHub excludes the trailing semicolon from these links, as expected.

Currently, our users end up correcting links like these by hand: they write an explicit markdown link whose target is the URL without the trailing semicolon. Of course, it would be best if this library handled this scenario, which would remove that pain point.
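
For reference, the manual workaround could be automated with a small preprocessing step. The sketch below is hypothetical and is not part of this library: `make_links_explicit` and `SEPARATED_URL` are illustrative names, URLs are assumed to start with `http://` or `https://`, and the sketch emits CommonMark autolinks (`<url>`) rather than the `[text](url)` form because they need no escaping of parentheses.

```python
import re

# A URL, matched lazily, followed by a ";" that is itself followed by
# whitespace or the end of the text. Only such separators are touched.
SEPARATED_URL = re.compile(r"(https?://\S+?)(;)(?=\s|$)")


def make_links_explicit(text: str) -> str:
    """Wrap semicolon-terminated URLs in <...> so the ';' stays outside the link."""
    return SEPARATED_URL.sub(r"<\1>\2", text)


citation = (
    "See https://www.nytimes.com/2009/05/31/weekinreview/31orr.html; "
    "Steven L. Winter, Death Is the Mother of Metaphor"
)
print(make_links_explicit(citation))
```

URLs that are not followed by `;` plus whitespace are left unchanged, so the step should be safe to run over text that contains both styles. It does not check whether a URL is already inside an explicit link.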

Let me know what you think.

Thanks!
George

@puzrin puzrin closed this as completed in 2014a2c Oct 1, 2021
@puzrin
Member

puzrin commented Oct 1, 2021

Try v3.0.3

@georgediaz88
Author

@puzrin, whoa this is awesome!! 🎉

IT WORKS!

I appreciate you quickly adding this for me / my team. Thank you!
