Fix Native Digits Support #71045

Merged
merged 6 commits into from
Jun 22, 2022

Member

@tarekgh tarekgh commented Jun 21, 2022

We always assumed that each native digit is a single character. That was the case for NLS. With ICU, some locales use native digits that are expressed as surrogate pairs. The ccp-Cakm-BD locale is a good example; it uses the native digits "\U0001E950", "\U0001E951", "\U0001E952", "\U0001E953", "\U0001E954", "\U0001E955", "\U0001E956", "\U0001E957", "\U0001E958", "\U0001E959". This change allows the native digits to be read correctly from ICU.
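To illustrate why the one-character assumption breaks, here is a minimal sketch. It is not code from this PR; `NativeDigitSketch` and `SplitByCodePoint` are hypothetical names introduced only for illustration. It shows that a digit outside the Basic Multilingual Plane occupies two UTF-16 `char` values, so digit strings have to be split per code point rather than per `char`.

```csharp
using System;
using System.Collections.Generic;

static class NativeDigitSketch
{
    // Hypothetical helper (not from the PR): splits a string into one
    // substring per Unicode code point, keeping surrogate pairs together.
    public static string[] SplitByCodePoint(string text)
    {
        var parts = new List<string>();
        int i = 0;
        while (i < text.Length)
        {
            // A surrogate pair is two chars that encode a single code point.
            int length = char.IsSurrogatePair(text, i) ? 2 : 1;
            parts.Add(text.Substring(i, length));
            i += length;
        }
        return parts.ToArray();
    }
}
```

For example, `"\U0001E950".Length` is 2, so indexing such a string by `char` yields half of a surrogate pair instead of a whole digit, whereas `SplitByCodePoint` returns the digit as a single two-char string.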

@ghost

ghost commented Jun 21, 2022

Tagging subscribers to this area: @dotnet/area-system-globalization
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: tarekgh
Assignees: -
Labels: area-System.Globalization
Milestone: -

@tarekgh tarekgh added this to the 7.0.0 milestone Jun 21, 2022
@tarekgh
Member Author

tarekgh commented Jun 21, 2022

CC @eerhardt @maryamariyan if you can help review it.

@danmoseley
Member

Does this affect treatment of \d in a regex pattern?

@stephentoub
Member

> Does this affect treatment of \d in a regex pattern?

No

@tarekgh
Member Author

tarekgh commented Jun 22, 2022

@eerhardt @stephentoub do you have any more comments?

Member

@eerhardt eerhardt left a comment

LGTM. Thanks @tarekgh!

@tarekgh tarekgh merged commit 315e931 into dotnet:main Jun 22, 2022
@tarekgh tarekgh deleted the FixNativeDigitsSupport branch June 22, 2022 23:00

} while (ffffPos < digits.Length && index < 10);

Debug.Assert(index >= 10, $"Couldn't read native digits for '{_sWindowsName}' successfully.");
Member

Should this be index == 10? result is exactly 10 items long, so if we tried to do result[10], it would have thrown.

Member Author

Yes, we can do that. I am not sure if this is worth another PR though.
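For context on the `index == 10` suggestion, the following is a rough sketch of a loop with the same exit condition as the excerpt above. It is not the actual runtime code: the `NativeDigitParser` class, the `Parse` method, and the assumption that the digits arrive as one string separated by U+FFFF (inferred only from the variable name `ffffPos`) are all hypothetical. Because `index` starts at 0, grows by one per iteration, and the loop stops once it reaches 10, `index` can never exceed 10, so `index >= 10` and `index == 10` accept the same states and the latter states the intent more precisely.

```csharp
using System;
using System.Diagnostics;

static class NativeDigitParser
{
    // Hypothetical sketch: reads up to 10 digits from a string in which the
    // digits are assumed to be separated by U+FFFF. Each digit may be one
    // char or a surrogate pair, so the separator is located explicitly.
    public static string[] Parse(string digits)
    {
        var result = new string[10];
        int index = 0;
        int start = 0;
        int ffffPos = 0;

        do
        {
            // Advance to the next separator or to the end of the string.
            while (ffffPos < digits.Length && digits[ffffPos] != '\uFFFF')
            {
                ffffPos++;
            }

            result[index++] = digits.Substring(start, ffffPos - start);

            ffffPos++;       // step over the separator
            start = ffffPos; // the next digit begins here
        } while (ffffPos < digits.Length && index < 10);

        // index cannot exceed 10: writing result[10] would already have thrown.
        Debug.Assert(index == 10, "Couldn't read native digits successfully.");
        return result;
    }
}
```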

@ghost ghost locked as resolved and limited conversation to collaborators Jul 23, 2022