localeCompare regression #25762
Comments
You can use the ICU collation explorer to view what's happening with collation.
Actually, 0.10 seems to be using code point sort order. So the letters are basically in ASCII / iso-8859-1 order. Consider this on .10 and .12:
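As a rough illustration of what code point (strictly, UTF-16 code unit) ordering means in practice, here is a minimal sketch. The helper name `codeUnitCompare` and the sample words are made up for this example. It deliberately avoids `localeCompare`, so the result should be the same on any Node version.

```javascript
// Compare by UTF-16 code units, the same ordering the < and > operators use.
// codeUnitCompare is a hypothetical helper name, not part of any library.
function codeUnitCompare(a, b) {
  if (a < b) return -1;
  if (a > b) return 1;
  return 0;
}

const words = ["aaa", "ZZZ", "Abc", "abc"];
const sorted = words.slice().sort(codeUnitCompare);

// Uppercase ASCII letters (65-90) come before lowercase ones (97-122).
console.log(sorted); // [ 'Abc', 'ZZZ', 'aaa', 'abc' ]
```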
So, don't expect localeCompare to really be locale sensitive in .10. .12 has the correct behavior. You should be able to get the ASCII behavior with:
… which doesn't work for me. There may be a bug there. But the original report is definitely working as designed.
https://github.com/joyent/node/blob/v0.10.40-release/deps/v8/src/string.js#L166 says this function is "implementation specific".
thanks @srl295 this is eye opening information! I came to realize that I only expect ASCII characters so ><= seem the most adequate (and version consistent) option.
@yaronn welcome- Who needs anything but ASCII? But can you explain the use case a bit more? Perhaps there's another way to do it.
I am writing xml-crypto (https://github.com/yaronn/xml-crypto). As part of the exclusive canonicalization algorithm I need to sort all attributes of each xml element by alphabetical order where caps are first.
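For what it's worth, a "caps first" attribute sort along these lines could look like the sketch below. The `{ name, value }` attribute shape and the `sortAttributes` helper are assumptions for illustration only, not xml-crypto's actual code, and namespace handling is left out entirely.

```javascript
// Hypothetical attribute shape: { name, value }. Namespace handling is omitted.
// Sorting uses plain < and >, so the order does not depend on the locale.
function sortAttributes(attrs) {
  return attrs.slice().sort(function (x, y) {
    if (x.name < y.name) return -1;
    if (x.name > y.name) return 1;
    return 0;
  });
}

const attrs = [
  { name: "id", value: "1" },
  { name: "Type", value: "t" },
  { name: "Algorithm", value: "a" },
];
const names = sortAttributes(attrs).map(function (a) { return a.name; });

console.log(names); // [ 'Algorithm', 'Type', 'id' ]
```

Because no locale is involved, the same input should give the same output for every user and every runtime version.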
@yaronn 'Canonical' and 'Locale' aren't compatible concepts… locales change, by user and over time. Consider the following example, which is working 100% correctly: (same JS, different user)
Specifically, "ZZZ" sorts before "aaa", with If you mean
@yaronn I followed your document (thanks for the link) and found section 2.2 of REC-xml-c14n-20010315. (emphasis mine)
So, using "a"<"Ab" — or even plain
Thanks @srl295 for some eye opening information :) a lot of these specs are ambiguous which is why various implementations do not always interop. Thanks!
@srl295: Re the use of "lexicographic", it's used in the general mathematical sense; see https://en.wikipedia.org/wiki/Lexicographical_order.
@duerst Thank you for the response and clarification. You are correct of course, UTF-8 is in binary order, it is UTF-16 which is not.
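To make the UTF-16 point concrete: JavaScript's `<` compares UTF-16 code units, so a character outside the BMP (stored as a surrogate pair starting at D800-DBFF) compares below BMP characters in the E000-FFFF range, even though its code point is higher. The sketch below shows the mismatch and one possible code point comparison. `codePointCompare` is a hypothetical helper written for this example, and the two sample characters are arbitrary.

```javascript
// Compare two strings by Unicode code point rather than by UTF-16 code unit.
// codePointCompare is a hypothetical helper, not a built-in.
function codePointCompare(a, b) {
  const ca = Array.from(a); // Array.from iterates a string by code point
  const cb = Array.from(b);
  const n = Math.min(ca.length, cb.length);
  for (let i = 0; i < n; i++) {
    const d = ca[i].codePointAt(0) - cb[i].codePointAt(0);
    if (d !== 0) return d < 0 ? -1 : 1;
  }
  // All shared positions are equal: the shorter string sorts first.
  if (ca.length === cb.length) return 0;
  return ca.length < cb.length ? -1 : 1;
}

const bmp = "\uFF5E";       // U+FF5E, a single code unit
const astral = "\u{1F600}"; // U+1F600, stored as the surrogate pair D83D DE00

console.log(bmp < astral);                  // false: 0xFF5E > 0xD83D as code units
console.log(codePointCompare(bmp, astral)); // -1: U+FF5E < U+1F600 as code points
```

For ASCII-only attribute names the two orderings agree, so plain `<` is enough in that case.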
Hi
It seems node 0.12.x has different behavior for localeCompare than 0.10.40.
the above can be fixed by using:
but this one yields a different result and no configuration can fix it:
the different numbers are less significant - the real issue is the different sign.
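On the point about magnitude versus sign: the language spec only promises that `localeCompare` returns a negative number, zero, or a positive number, so code that depends on specific values is fragile. A minimal sketch of reducing the result to its sign follows. `compareSign` is a hypothetical helper name, and this does not change which sign a given runtime picks for a particular pair of strings.

```javascript
// Reduce localeCompare's result to -1, 0, or 1 so callers never depend on the magnitude.
// compareSign is a hypothetical helper name.
function compareSign(a, b) {
  return Math.sign(a.localeCompare(b));
}

console.log(compareSign("a", "b")); // -1
console.log(compareSign("b", "a")); // 1
console.log(compareSign("a", "a")); // 0
```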