Poor quality algorithm to convert SHA-1 hash to decimal value for bucketing #157
I can prepare a PR to implement this. What's the minimum supported version of Node that my PR would need to be compatible with?
Thanks for bringing this to our attention. Before you do any more work on this, I think we need to think it over here a bit. I agree that the 53 vs. 60 thing is clearly a problem, and is no doubt causing some inconsistency with the behavior of the other SDKs. But for that same reason (consistency across platforms) we can't make unilateral changes in Node without having a plan for what we're going to do in the others and how we'll roll out the change. That's also why we haven't acted on other hash-related suggestions like using Murmur.
OK, fair enough. But I expect that if the other implementations are also consuming 60 bits, then they too would run into the same bug, because every programming language uses the same IEEE 754 floating point standard. To illustrate the problem more clearly: when consuming 60 bits, it's possible to output exactly 1.0, which would then fail to fall into any bucket. With 53 bits, the maximum possible bucketing value is 0.9999999999999999, which would fall into the last bucket. Run this in the latest version of Node:

```js
const hash60a = 0xFFFFFFFFFFFFFB0n; // First 53 bits are 1
const hash53a = hash60a >> 7n;
console.log("Test A, Rounding Down:");
console.log("60 bits:", Number(hash60a) / 2**60); // Outputs 0.999...
console.log("53 bits:", Number(hash53a) / 2**53);

const hash60b = 0xFFFFFFFFFFFFFC0n; // First 54 bits are 1
const hash53b = hash60b >> 7n;
console.log("Test B, Rounding Up:");
console.log("60 bits:", Number(hash60b) / 2**60); // Outputs 1
console.log("53 bits:", Number(hash53b) / 2**53);
```

Even though this demo relies on BigInt support (because it made the code simpler), the rounding behaviour is functionally equivalent to what would happen with the current implementation.
I have taken a look at the bucketUser implementations in Java, Go and PHP, and they all make similar mistakes.

The PHP implementation has slightly different number handling. The $LONG_SCALE variable is a signed 64-bit integer that is actually equal to 2^60 - 1. But when used in the division, PHP still converts it to a double precision floating point number, so the division is still equivalent to dividing by 2^60 and the result is the same as in JS.

The Java and Go implementations explicitly only deal with 32-bit floats, yet still extract 60 bits from the hash. The mistakes are otherwise the same. But since floats only have 24 significant bits, they would return 1.0 if the first 25 bits of the hash are 1s (i.e. if the hex hash starts with six `F`s followed by a digit of 8 or above), which is statistically more likely than the JS and PHP cases.

I think the best solution is to decide how many bits of the hash all of the implementations should consume: either 24 bits (for 32-bit floats) or 53 bits (for 64-bit floats), then divide by either 2^24 or 2^53, respectively. In either case, the distributions will be uniform and range from 0.0 up to the following maximums:

- 24 bits: (2^24 - 1) / 2^24 ≈ 0.99999994
- 53 bits: (2^53 - 1) / 2^53 ≈ 0.9999999999999999
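The 32-bit behaviour can be reproduced from Node by using `Math.fround` to emulate single-precision rounding. This is only a sketch of the failure mode, not the Java/Go code; the 60-bit constant below is just an example hash prefix whose first 25 bits are 1s.

```js
// Emulating single-precision (float32) math with Math.fround.
const hash = 0xFFFFFF800000000;   // 2^60 - 2^35: first 25 bits are 1, exactly representable as a double
const scale = 0xFFFFFFFFFFFFFFF;  // intended to be 2^60 - 1, but rounds to 2^60 as a double

// Rounding the hash to 24 significant bits pushes it up to exactly 2^60,
// so the final bucket value comes out as exactly 1.0.
const bucket = Math.fround(Math.fround(hash) / Math.fround(scale));
console.log(bucket); // 1
```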
As you say, the 53-bit issue is not just a JS problem. However, again, if we are going to make any change that affects the behavior of the SDKs, we need to carefully consider how we are going to explain and roll that out, not just fix it in one SDK at a time. It is undesirable to have non-uniform bucketing, but it is also undesirable for customers to suddenly see any change in how users are bucketed just because they upgraded their SDK, especially if they are using multiple SDKs and don't upgrade them all at once. We have been considering a variety of changes to the bucketing algorithm, some of which would also address the 1.0 edge case you mentioned; those would also change current bucketing, in which case we would be more likely to go ahead and redo the algorithm from scratch. We will also probably want to do some further analysis to quantify exactly what effect the current behavior, and the proposed alternatives, have on bucketing distribution. Thanks again for putting in the effort on this.
This issue is marked as stale because it has been open for 30 days without activity. Remove the stale label or comment, or this will be closed in 7 days.
Is this a support request?
No.
Describe the bug
The algorithm used to convert a SHA-1 hash into a decimal value between 0 and 1 does not correctly take into account the range of safe integer values supported by the 64-bit IEEE 754 floating point numbers used by JavaScript. It tries to read 60 bits from the SHA-1 hash value, convert them to a number with `parseInt`, and then divide by 2^60 to turn that into a decimal between 0 and 1. The code actually uses `0xFFFFFFFFFFFFFFF` as the divisor. As written, this would be (2^60 - 1), but the loss of precision makes it equal to 2^60, so it just happens to work by accident.

IEEE 754 doubles provide only 53 significant bits, allowing for a maximum of 2^53 - 1 safe positive integers; that is exactly the value of `Number.MAX_SAFE_INTEGER`. Numbers lose precision beyond this range. This also happens to be the number of uniformly distributed values that can be represented between 0 and 1.
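As an illustration of that precision loss, here is a sketch of the pattern described above (not the SDK's exact code):

```js
// 15 hex characters = 60 bits, more than the 53 significant bits a double can hold.
const sixtyBitHex = 'fffffffffffffb0';      // a 60-bit value whose first 53 bits are 1
const value = parseInt(sixtyBitHex, 16);    // rounds to the nearest representable double
console.log(Number.isSafeInteger(value));   // false: precision has already been lost
console.log(0xFFFFFFFFFFFFFFF === 2 ** 60); // true: the "2^60 - 1" divisor rounds to 2^60
console.log(value / 0xFFFFFFFFFFFFFFF);     // just under 1 here, but not uniformly distributed
```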
Thus, the correct approach to converting a hash to a decimal in that range is to consume 53 bits, convert it to a positive integer in the range from 0 to 2^53-1, and then divide by 2^53.
Fixing this bug requires care to minimise the risk of causing the bucketing algorithm to switch cohorts. The current approach effectively reads the first 53 bits, but the rounding that occurs when precision is lost can drop some bits from the end, making the distribution less uniform.
Thus, any fix for this needs to ensure that it also consumes the same first 53 bits of the hash.
The following is modified from the bucketUser function in evaluate_flag.js.
https://github.com/launchdarkly/node-server-sdk/blob/master/evaluate_flag.js#L353
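For illustration, a minimal sketch of the 53-bit approach, assuming the hex SHA-1 digest has already been computed; the function name and inputs below are illustrative, not the SDK's actual API:

```js
const crypto = require('crypto');

// Convert a SHA-1 hex digest into a bucket value in [0, 1), consuming only
// the first 53 bits of the hash so that no double-precision rounding occurs.
function bucketValueFromHash(hexDigest) {
  // The first 15 hex characters are the same 60 bits the current code reads...
  const first60Bits = BigInt('0x' + hexDigest.substring(0, 15));
  // ...but only the leading 53 bits are kept; the low 7 bits are discarded.
  const first53Bits = first60Bits >> 7n;
  // The maximum possible result is (2^53 - 1) / 2^53, so exactly 1.0 can never occur.
  return Number(first53Bits) / 2 ** 53;
}

// Example usage with an arbitrary input string:
const digest = crypto.createHash('sha1').update('user-key.flag-key.salt').digest('hex');
console.log(bucketValueFromHash(digest));
```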