Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[5.8] Fix seeding logic in Arr::shuffle #27861

Closed
wants to merge 1 commit into from
Closed

Conversation

cviebrock
Copy link
Contributor

@cviebrock cviebrock commented Mar 12, 2019

The random comparator logic used -- return rand(-1, 1) -- is not really that random, as it's biased 2:1 in favour of not swapping the two elements.

values before random comparator result after
a, b -1 a, b
a, b 0 a, b
a, b 1 b, a

This can be illustrated with the following code, which checks what the first element of a "randomized" array is when using the same logic as Arr::shuffle() when a seed is passed (assuming the seeds in this case are randomized as well).

$k = [ 'a', 'b', 'c', 'd', 'e' ];
$r = array_fill_keys($k, 0);

for ($i=0; $i<1000000; $i++) {
    $array = $k;

    usort($array, function () {
        return rand(-1, 1);
    });

    $r[ $array[0] ] ++;
}

print_r($r);

Results:

Array
(
    [a] => 564137
    [b] => 105554
    [c] => 281362
    [d] => 36514
    [e] => 12433
)

So, when that random logic runs, more than half the time the shuffled array still has the first element in the first position.

Changing the logic to use return rand(0,1) helps a bit, but is still biased:

Array
(
    [a] => 307937
    [b] => 204875
    [c] => 308190
    [d] => 116441
    [e] => 62557
)

I think the basic fix for this is to simply use PHP's shuffle() which returns a much more evenly randomized array. If the randomization needs to be seeded, we can just set that before calling shuffle() (and then set it again afterwards, with a random seed, to prevent later logic in the code from being biased).

This PR does this.

@GrahamCampbell GrahamCampbell changed the title Fix seeding logic in Arr::shuffle [5.8] Fix seeding logic in Arr::shuffle Mar 12, 2019
@driesvints
Copy link
Member

Can you add a test to verify this?

@taylorotwell
Copy link
Member

Merged.

@cviebrock
Copy link
Contributor Author

I was going to ask what kind of test would make sense on a function that should produce random results. The only thing I could think of was making sure that shuffling the same array with the same seed produced the same result. Would that be useful @taylorotwell / @driesvints ... or are you happy with having just merged it?

@driesvints
Copy link
Member

@cviebrock this is fine for now.

@cviebrock cviebrock deleted the 5.8 branch June 6, 2019 17:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants