Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[release/5.0] Handle non-ASCII strings in GetNonRandomizedHashCodeOrdinalIgnoreCase #45062

Merged
merged 19 commits into from
Nov 24, 2020

Conversation

github-actions[bot]
Copy link
Contributor

@github-actions github-actions bot commented Nov 21, 2020

Backport of #44688 and #44695 to release/5.0

/cc @jkotas @EgorBo

Customer Impact

Dictionary does not work with ignore-case ordinal comparer for certain non-English languages. Customer reported regression from 3.1.

Testing

Targeted test added

Risk

Medium

@Dotnet-GitSync-Bot
Copy link
Collaborator

I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label.

@jkotas jkotas added the Servicing-consider Issue for next servicing release review label Nov 21, 2020
@jkotas
Copy link
Member

jkotas commented Nov 21, 2020

cc @danmosemsft

public void DictionaryOrdinalIgnoreCaseCyrillicKeys()
{
const string Lower = "абвгдеёжзийклмнопрстуфхцчшщьыъэюя";
const string Higher = "АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЫЪЭЮЯ";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, coding-style.md

When including non-ASCII characters in the source code use Unicode escape sequences (\uXXXX) instead of literal characters. Literal non-ASCII characters occasionally get garbled by a tool or editor.

No need to fix it in this PR though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fully appreciate where this guideline came from, but the trouble with it - it makes the code significantly harder to read and maintain (pertaining to this specific example).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. IMO when our unit tests use strings in non-English languages, we should just write the literal strings unescaped in their original language. If our unit tests are instead testing "I put a non-ASCII character here, let's see what happens!", then it's good to put the "\uXXXX" explicitly, since it draws attention to the fact that there's something unique about the character at this index in the string.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it's harder to read -- the concern is if garbelizing could somehow allow the test to continue to pass, without anyone notice.

Another option is to follow it with a Debug.Assert that compares them to their escaped forms. That way you maintain the readability, but garbelizing will immediately fail the test.

I don't feel strongly - if you think it is impossible that we would not notice.

@danmoseley
Copy link
Member

Failure was #41511 in one case, and an infrastructure issue in the other. Rerunning failed legs.

@GrabYourPitchforks GrabYourPitchforks added Servicing-approved Approved for servicing release and removed Servicing-consider Issue for next servicing release review labels Nov 24, 2020
@GrabYourPitchforks GrabYourPitchforks merged commit d9a4a64 into release/5.0 Nov 24, 2020
@GrabYourPitchforks GrabYourPitchforks deleted the backport/pr-44688-to-release/5.0 branch November 24, 2020 05:55
@ghost ghost locked as resolved and limited conversation to collaborators Dec 24, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Runtime Servicing-approved Approved for servicing release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants