Per the following two documents, comparing strings using StringComparison.OrdinalIgnoreCase is explicitly documented as equivalent to calling ToUpperInvariant on each string, then performing an ordinal comparison against the contents.
These statements within the docs imply that ToUpperInvariant and ToLowerInvariant are ordinal case conversions ("simple case mapping"), not linguistic case conversions. However, it looks like we're not consistently following this pattern.
    string s1 = "s";
    string s2 = "\u017f"; // Latin small letter long s (ſ), which uppercase-maps to a normal ASCII "S"
    Console.WriteLine(s1.Equals(s2, StringComparison.OrdinalIgnoreCase)); // False
    Console.WriteLine(s1.ToUpperInvariant() == s2.ToUpperInvariant()); // True
This has collateral impact. For example, recent PRs like #67758 assume that non-ASCII characters cannot case-map to ASCII characters, which is not a guarantee offered by Unicode, but which might be a guarantee we'd be willing to make separately within the runtime by munging the Unicode tables.
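To see which characters would break that assumption on a given runtime, here is a minimal sketch (not taken from any PR) that scans the BMP for non-ASCII characters whose invariant case mapping lands in ASCII. The class name `NonAsciiToAsciiScan` is made up for illustration. In Unicode's simple case mappings, U+017F uppercases to "S" and U+212A (Kelvin sign) lowercases to "k", so those are the candidates I'd expect, but what actually prints will likely depend on the runtime version and globalization mode, so no expected output is shown.

```csharp
using System;

// Hypothetical helper for illustration only: lists non-ASCII BMP characters
// whose invariant upper/lower case mapping produces an ASCII character.
class NonAsciiToAsciiScan
{
    static void Main()
    {
        for (int i = 0x80; i <= 0xFFFF; i++)
        {
            char c = (char)i;

            // Lone surrogates are not characters on their own; skip them.
            if (char.IsSurrogate(c))
            {
                continue;
            }

            char upper = char.ToUpperInvariant(c);
            char lower = char.ToLowerInvariant(c);

            // c itself is non-ASCII, so any result below 0x80 means the
            // mapping crossed from non-ASCII into ASCII.
            if (upper < 0x80)
            {
                Console.WriteLine($"U+{i:X4} uppercases to '{upper}'");
            }

            if (lower < 0x80)
            {
                Console.WriteLine($"U+{i:X4} lowercases to '{lower}'");
            }
        }
    }
}
```

If the runtime did guarantee that non-ASCII never case-maps to ASCII, this loop would print nothing; any line it prints is a character that ASCII fast paths would need to account for.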
See also #30960 for further discussion on case mapping as a more general Unicode concept.
/cc @tarekgh, who had thoughts on this offline.