add RegExp/Unicode topic (fun!) #1084

Merged 1 commit on Nov 29, 2021

michaelficarra
Member

No description provided.

@mathiasbynens
Member

mathiasbynens commented Nov 29, 2021

Some background in case it’s helpful:

  1. Approve #2515 via consensus and require all PRs affecting any of these tables to achieve consensus
  2. Allow loose matching of Unicode property values and possibly property names. Remove tables 70/71 and normatively reference Unicode

There doesn’t seem to be any demand for this. Doing this just to remove some spec maintenance churn is not a great motivation IMHO.
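To make the loose-matching option concrete, here is a minimal sketch contrasting today's exact-spelling behavior with a loose comparison. `compiles` and `looseKey` are hypothetical helpers introduced only for illustration. `looseKey` is a rough approximation of the loose matching rule in UAX #44 (it ignores case, whitespace, underscores, and hyphens, and leaves out that rule's handling of an initial "is" prefix); it is not anything the ECMAScript spec defines.

```typescript
// Hypothetical helper: reports whether a pattern is accepted in Unicode mode.
// The RegExp constructor is used so an invalid pattern fails at run time
// instead of being rejected when the source file is parsed.
function compiles(pattern: string): boolean {
  try {
    new RegExp(pattern, "u");
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// Hypothetical normalizer (assumption: a simplified take on UAX #44 loose
// matching). Lowercases and drops whitespace, underscores, and hyphens.
function looseKey(name: string): string {
  return name.toLowerCase().replace(/[\s_-]/g, "");
}

// Current behavior: only the exact spellings listed in the spec are accepted.
console.log(compiles("\\p{Script=Greek}")); // true
console.log(compiles("\\p{script=greek}")); // false
console.log(compiles("\\p{Script=greek}")); // false

// Under a loose comparison these spellings would be treated as the same name.
console.log(looseKey("Script_Extensions") === looseKey("script extensions")); // true
```

With strict matching the spec has to name every accepted spelling somewhere, which is what the tables do today.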

  3. Defer spelling of property values to that used in Unicode spec, even though it's explicitly non-canonical. Remove tables 70/71 and normatively reference Unicode

This might be painful to do precisely, since the Unicode spec mixes spelling/casing throughout different documents. What we’re using is generally the first spelling that’s used in the Unicode data files (but there are exceptions such as “Any” which is technically not a “character property”).

  4. Ask Unicode Consortium to provide canonical spellings for property values and possibly property names. Remove tables 70/71 and normatively reference Unicode

FWIW, I inquired about this while proposing \p{…} in ECMAScript: https://corp.unicode.org/pipermail/unicode/2016-May/thread.html#3648 See the “Canonical block names: spaces vs. underscores” thread. (I asked about Blocks specifically but it applies generally.)

  5. Updates to tables 70/71 use spelling from Unicode spec by convention, and do not require consensus

IMHO it’s worth pointing out explicitly in the slides that Option 5 is what we’ve been doing so far: https://github.com/tc39/ecma262/issues?q=label%3Aunicode+is%3Aclosed+Normative and that this agenda item is effectively re-litigating this. (I’m still hoping things don’t change, since any of the other options seem strictly worse to me.)

@michaelficarra
Member Author

Confirming consensus on option 5 is fine with me, but previous discussion on the topic was not sufficiently clear to the editor group for us to take action.

@michaelficarra michaelficarra merged commit 29a9d0e into master Nov 29, 2021
@michaelficarra michaelficarra deleted the michaelficarra-patch-1 branch November 29, 2021 23:54
@bakkot
Contributor

bakkot commented Nov 29, 2021

> This might be painful to do precisely, since the Unicode spec mixes spelling/casing throughout different documents.

So, given that, what is the strategy we actually use for picking a spelling right now? Is it just "Mathias looks at the various options and picks a reasonable one, consistent with what we're already doing"? That seems like it's working fine, but certainly I at least was not aware in previous discussions that this is what we were agreeing to.

@mathiasbynens
Member

> > This might be painful to do precisely, since the Unicode spec mixes spelling/casing throughout different documents.
>
> So, given that, what is the strategy we actually use for picking a spelling right now? Is it just "Mathias looks at the various options and picks a reasonable one, consistent with what we're already doing"?

No, it’s this:

> What we’re using is generally the first spelling that’s used in the Unicode data files (but there are exceptions such as “Any” which is technically not a “character property”).

These exceptions don’t apply for this specific case of new values for existing properties, but they would exist for the spec as a whole (which includes things like Any/ASCII/Assigned), which is why I believe becoming less explicit about this by referring to Unicode would make things more confusing / easier to misinterpret.
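As a small illustration of those exceptions, the sketch below exercises `Any`, `ASCII`, and `Assigned` as they behave in ECMAScript today. The variable names and sample characters are chosen here for illustration and are not taken from the discussion.

```typescript
// Built with the RegExp constructor so the patterns are checked at run time.
const anyCodePoint = new RegExp("^\\p{Any}$", "u");
const asciiOnly = new RegExp("^\\p{ASCII}$", "u");
const assignedOnly = new RegExp("^\\p{Assigned}$", "u");

// Any matches every code point, including ones outside the BMP.
console.log(anyCodePoint.test("\u{1F600}")); // true

// ASCII covers U+0000 through U+007F only.
console.log(asciiOnly.test("A")); // true
console.log(asciiOnly.test("\u00E9")); // false

// Assigned excludes code points whose General_Category is Unassigned (Cn).
// U+10FFFF is a noncharacter, so it is not matched.
console.log(assignedOnly.test("A")); // true
console.log(assignedOnly.test("\u{10FFFF}")); // false
```

All three are written without a `Name=Value` form, which is part of why they have to be spelled out explicitly on the ECMAScript side.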
