-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[release/5.0] Handle non-ASCII strings in GetNonRandomizedHashCodeOrdinalIgnoreCase #45062
[release/5.0] Handle non-ASCII strings in GetNonRandomizedHashCodeOrdinalIgnoreCase #45062
Conversation
I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label. |
cc @danmosemsft |
public void DictionaryOrdinalIgnoreCaseCyrillicKeys() | ||
{ | ||
const string Lower = "абвгдеёжзийклмнопрстуфхцчшщьыъэюя"; | ||
const string Higher = "АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЬЫЪЭЮЯ"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit, coding-style.md
When including non-ASCII characters in the source code use Unicode escape sequences (\uXXXX) instead of literal characters. Literal non-ASCII characters occasionally get garbled by a tool or editor.
No need to fix it in this PR though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fully appreciate where this guideline came from, but the trouble with it - it makes the code significantly harder to read and maintain (pertaining to this specific example).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. IMO when our unit tests use strings in non-English languages, we should just write the literal strings unescaped in their original language. If our unit tests are instead testing "I put a non-ASCII character here, let's see what happens!", then it's good to put the "\uXXXX" explicitly, since it draws attention to the fact that there's something unique about the character at this index in the string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree it's harder to read -- the concern is if garbelizing could somehow allow the test to continue to pass, without anyone notice.
Another option is to follow it with a Debug.Assert that compares them to their escaped forms. That way you maintain the readability, but garbelizing will immediately fail the test.
I don't feel strongly - if you think it is impossible that we would not notice.
Failure was #41511 in one case, and an infrastructure issue in the other. Rerunning failed legs. |
Backport of #44688 and #44695 to release/5.0
/cc @jkotas @EgorBo
Customer Impact
Dictionary does not work with ignore-case ordinal comparer for certain non-English languages. Customer reported regression from 3.1.
Testing
Targeted test added
Risk
Medium