This repository was archived by the owner on Feb 16, 2024. It is now read-only.

Bikeshed: flag + corresponding getter #14

Closed
mathiasbynens opened this issue Mar 18, 2021 · 16 comments

Comments

@mathiasbynens
Member

Extracting the discussion from #2 (comment), if we want to gate the new syntax/semantics behind a new flag, there are two questions:

  1. What would the new flag be (which letter)? Some options are v (u as written in classical Latin) or w (double-u).
  2. What would the name of the corresponding getter on RegExp.prototype be? uniSet?

Let the bikeshedding commence.

@mathiasbynens changed the title from "Flag + corresponding property" to "Bikeshed: flag + corresponding property" on Mar 18, 2021

@markusicu
Collaborator

Slight preference for v over w. “v is the next u...”

(“Only” in English is w=“double u”. French w="double v", German w="veh". I might reserve w for whatever comes after v...)

Idea for the getter: extCharClass (for “extended”)
Or maybe uniCharClass.

@sffc
Collaborator

sffc commented Mar 18, 2021

+1 on v

Should expressions be /.../v (v implies/replaces u) or /.../uv (u enables code points, and v enables sets of strings)?

@markusicu
Collaborator

> Should expressions be /.../v (v implies/replaces u) or /.../uv (u enables code points, and v enables sets of strings)?

It should be /.../v (v implies/replaces u) as in the current version of the proposal (linked from issue #12).

  • No need to define the new features in non-Unicode mode
  • No need to require two flags when the new one alone makes no sense, better to imply/subsume

@sffc
Collaborator

sffc commented Mar 18, 2021

> • No need to define the new features in non-Unicode mode

We don't need to do that necessarily; /v can just be invalid without /u, a restriction we could lift later.

> • No need to require two flags when the new one alone makes no sense, better to imply/subsume

Two flags makes it more explicit that this is BOTH a unicode regex AND one with nested string sets.

@macchiati
Collaborator

It's more explicit, but it is superfluous. I think /u alone will fall to the wayside, and people will just find it an annoyance. "Oh, you forgot to use /uv — you just used /v and that doesn't work by itself."

@waldemarhorwat

I also have a slight preference for just v. It would imply u while also switching over to the new [] syntax.

We can leave w for any future grapheme modes if we choose to do those.

@mathiasbynens
Member Author

mathiasbynens commented Mar 20, 2021

Any opinions on the corresponding getter name? Here’s an overview of the current ECMAScript RegExp flags & getters:

assert(/…/d.hasIndices);
assert(/…/g.global);
assert(/…/i.ignoreCase);
assert(/…/m.multiline);
assert(/…/s.dotAll);
assert(/…/u.unicode);
assert(/…/y.sticky);

What would we do for v? Perhaps:

assert(/…/v.uniSet);
// or…
assert(/…/v.unicodeSet);

Now that I’ve written this down, I like unicodeSet, as it enables both unicode (u) + sets.
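
The flag-to-getter overview in this comment can be checked mechanically. The following is a minimal sketch, not part of the original discussion: the `flagGetters` table and the `results` array are illustrative names, and the `try`/`catch` is there only on the assumption that an older engine might reject a newer flag such as `d`.

```javascript
// Flag letter -> getter name, copied from the overview in the comment above.
const flagGetters = {
  d: 'hasIndices',
  g: 'global',
  i: 'ignoreCase',
  m: 'multiline',
  s: 'dotAll',
  u: 'unicode',
  y: 'sticky',
};

// Build a regex carrying exactly one flag and read the matching getter.
const results = Object.entries(flagGetters).map(([flag, getter]) => {
  try {
    return [flag, getter, new RegExp('a', flag)[getter]];
  } catch {
    // Assumption: an engine that does not know the flag throws a SyntaxError.
    return [flag, getter, undefined];
  }
});

for (const [flag, getter, value] of results) {
  console.log(`/a/${flag}.${getter} ->`, value);
}
```

Each getter is a boolean that reports whether its flag is set, which is the pattern the new flag's getter would have to fit into.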

@macchiati
Collaborator

uniset sounds good, or maybe even just sets

@sffc
Collaborator

sffc commented Mar 21, 2021

One reason I kind of like two separate flags is because the new key can be named "stringSets" or similar, without needing to reference Unicode. But if we had one combined flag, we might just need to say "unicodeWithStringClasses".

@macchiati
Collaborator

I really think of this as ES Unicode Regex v2; it encompasses and extends what was there before with the /u flag.

@mathiasbynens changed the title from "Bikeshed: flag + corresponding property" to "Bikeshed: flag + corresponding getter" on Apr 20, 2021

@markusicu
Collaborator

Could we please discuss the name of the getter once more?

On its own, I agree "unicodeSet" makes sense. However, it's the same as the closely related ICU class UnicodeSet which has been around for 25-some years and supports a pattern string syntax similar to regex character classes, including string literals and most Unicode properties, yet with a different syntax, especially compared to this new proposal for regex set operators and string literals.

How about some of the other suggestions here?

Or "setOps", "stringSet", "classStringOps", ...?

Crazy idea: Could we change the "unicode" getter to return numeric value 2 instead of boolean false/true, when v is set? Would expressions like if (unicode) then ... still work?
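
To make the compatibility question concrete, here is a small sketch of how existing checks would behave if `unicode` returned the number 2. It does not touch a real RegExp: `hypotheticalRegExp` is a plain object standing in for a regex whose getter had been changed, which is purely an assumption for illustration.

```javascript
// Hypothetical stand-in: an object whose `unicode` property is 2, not true.
const hypotheticalRegExp = { unicode: 2 };

// Truthiness checks keep working, because 2 is truthy.
const truthyCheck = hypotheticalRegExp.unicode ? 'unicode' : 'legacy';

// Strict comparisons against `true` would silently take the other branch.
const strictCheck = hypotheticalRegExp.unicode === true ? 'unicode' : 'legacy';

// Code that inspects the type would see a number instead of a boolean.
const typeCheck = typeof hypotheticalRegExp.unicode;

console.log(truthyCheck); // 'unicode'
console.log(strictCheck); // 'legacy'
console.log(typeCheck);   // 'number'
```

So `if (unicode)` style checks would survive, but any code comparing with `=== true` or relying on a boolean type would change behavior.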

@ljharb
Member

ljharb commented May 27, 2021

.vunicode

@markusicu
Collaborator

.unicode2

@sffc
Collaborator

sffc commented May 27, 2021

Pending the resolution of #23, I prefer .stringSets or .stringClasses or something else with the word "string" since that is the main overarching feature that /v is adding.

@mathiasbynens
Member Author

I’d like to propose resolving this bikeshed with v for the flag letter and unicodeSet for the getter name. Both parts of the getter name make sense, IMHO: unicode → enables the use of Unicode properties of strings, and Set → enables set notation. Let’s discuss during this week’s meeting.

@mathiasbynens
Member Author

During yesterday’s weekly sync we decided to proceed with v as the flag name, and unicodeSets (note: plural!) for the getter name. Closing this issue to mark its resolution. We’ll update the spec draft later.
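
For readers landing here later, a minimal sketch of what the resolution looks like in use. It assumes an engine that already implements the `v` flag; `supportsFlags` is a hypothetical feature-detection helper, not part of any API, and the set-difference pattern is just one example of the new `[]` syntax.

```javascript
// Hypothetical helper: true when the engine accepts the given flag string.
function supportsFlags(flags) {
  try {
    new RegExp('', flags);
    return true;
  } catch {
    return false;
  }
}

// null when the engine predates the v flag, so the sketch degrades quietly.
const re = supportsFlags('v') ? new RegExp('[\\p{L}--[a-z]]', 'v') : null;

if (re !== null) {
  console.log(re.flags);       // 'v'
  console.log(re.unicodeSets); // true (the plural getter name decided above)
  console.log(re.test('A'));   // true: a letter outside a-z
  console.log(re.test('a'));   // false: removed by the set difference
}
```

The pattern is passed to the `RegExp` constructor as a string, rather than written as a literal, so that the file still parses on engines without `v` support.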

6 participants