Unintentional escape characters in odoc output (html) #620

jordwalke · 2021-03-03T06:45:56Z

In this odoc output for Unix, look at the function val link.
Here's what it looks like:

val link : ?⁠follow:bool -> string -> string -> unit

But copy it to your clipboard, assign it to a string in JavaScript, and look at the character position where the f in "follow" should be:

var text = "val link : ?⁠follow:bool -> string -> string -> unit"
alert(text.charCodeAt(12));

It prints the char code 8288. This means there are non-printable characters in the output, which makes items copied from the page into utop or a file frustratingly fail to compile (you get errors about weird characters, but you can't see where they are until you inspect the text byte by byte).
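To locate such characters without inspecting the text by hand, something like the following sketch could help. The helper name `findSuspectChars` and the list of code points are illustrative assumptions, not part of odoc; only U+2060 and U+00A0 actually come up in this thread.

```javascript
// Hypothetical helper (not part of odoc): report characters that are
// invisible or easily mistaken for a plain ASCII space.
// The table is an assumption; extend it as needed.
const SUSPECT = new Map([
  [0x00a0, "NO-BREAK SPACE"],
  [0x2060, "WORD JOINER"],
  [0x200b, "ZERO WIDTH SPACE"],
  [0xfeff, "ZERO WIDTH NO-BREAK SPACE"],
]);

function findSuspectChars(text) {
  const found = [];
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (SUSPECT.has(code)) {
      // Record the position, numeric code, and a readable name.
      found.push({ index: i, code: code, name: SUSPECT.get(code) });
    }
  }
  return found;
}

// The signature from above, with the word joiner written as an escape.
const pasted = "val link : ?\u2060follow:bool -> string -> string -> unit";
console.log(findSuspectChars(pasted));
```

Run on the signature above, this reports a single entry at index 12 with code 8288, matching the `charCodeAt(12)` result.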

I'm assuming there was some reason why these were in there? Does anyone know the back story there?

dbuenzli · 2021-03-03T07:56:28Z

> I'm assuming there was some reason why these were in there?

That's U+2060, it was a failed attempt at controlling breaking behaviour. It seems they were never removed in the end. They should be.

jordwalke · 2021-03-03T08:08:37Z

Cool. I also found some other ones in there. (160 - non breaking space)

dbuenzli · 2021-03-03T08:27:20Z

> (160 - non breaking space)

But are these problematic for c&p? I tested the other day and on my side they were not.

I was actually planning to add more, since inline-block spans, which did a very good job, unfortunately have usability issues.

jordwalke · 2021-03-03T08:50:24Z

Yes, in my case they were (but I was parsing the OCaml signature programmatically; perhaps there's something in the way I'm using the OCaml parser?). Do you expect OCaml to accept non-breaking spaces in OCaml files (in non-comment regions at least)?

dbuenzli · 2021-03-03T09:06:18Z

The challenge is to have reasonably readable and responsive signatures while not breaking cut and paste.

I don't know exactly why you are parsing OCaml out of these HTML files, which are aimed at presentation. But if you are doing this, it shouldn't be too hard to filter out the nbsp characters outside of character literals before feeding the text to the OCaml parser.

> Do you expect ocaml to accept non-breaking spaces in ocaml files (non comment regions at least).

No.

jordwalke · 2021-03-03T09:09:21Z

I agree it's reasonable to require anyone using the HTML output as a kind of input to another system to strip the characters (which is what I did, and it's not hard), but copy/pasting is the more important use case.
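For reference, the stripping step can be sketched in a few lines. This is a minimal sketch, not the code actually used here: `cleanSignature` is a hypothetical name, and it assumes the text is a signature with no string or character literals, so it replaces every occurrence rather than skipping literals as suggested earlier in the thread.

```javascript
// Hypothetical helper: normalise text copied from odoc's HTML output
// before handing it to an OCaml parser.
// Assumption: the input contains no string or character literals, so
// every occurrence can be replaced without changing program meaning.
function cleanSignature(text) {
  return text
    .replace(/\u2060/g, "")   // drop word joiners entirely
    .replace(/\u00a0/g, " "); // turn non-breaking spaces into plain spaces
}

const pasted = "val link : ?\u2060follow:bool\u00a0-> string -> string -> unit";
console.log(cleanSignature(pasted));
```

Word joiners are removed rather than replaced because they have no width; non-breaking spaces become ordinary spaces so token boundaries are preserved.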

dbuenzli added a commit to dbuenzli/odoc that referenced this issue Mar 12, 2021

IR model: stop trying to play with word joiners (U+2060).

9a37fbb

It breaks cut and paste in the HTML renderer and other renderers have to ignore it. Closes ocaml#620.

dbuenzli mentioned this issue Mar 12, 2021

IR model: stop trying to play with word joiners (U+2060) #638

Merged

jonludlam closed this as completed in #638 Mar 12, 2021

jonludlam pushed a commit that referenced this issue Mar 12, 2021

IR model: stop trying to play with word joiners (U+2060).

9d7eb20

It breaks cut and paste in the HTML renderer and other renderers have to ignore it. Closes #620.
