Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix uniqid() performances #18232

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open

Fix uniqid() performances #18232

wants to merge 5 commits into from

Conversation

manu0401
Copy link
Contributor

@manu0401 manu0401 commented Apr 3, 2025

If available, use uuidgen() system call instead of a loop on gettimeofday(), that improves performances lot.

If available, use uuidgen() system call instead of a loop on
gettimeofday(), that improves performances lot.
@manu0401 manu0401 requested a review from bukka as a code owner April 3, 2025 00:43
@manu0401 manu0401 changed the title On some systems, uniqid() is critically slow. Fix uniqid() performances Apr 3, 2025
@@ -53,6 +87,24 @@ PHP_FUNCTION(uniqid)
Z_PARAM_BOOL(more_entropy)
ZEND_PARSE_PARAMETERS_END();

#ifdef __NetBSD__
struct uuid uuid;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit confused here. uuidgen is on FreeBSD/NetBSD (other oses have different apis) which you detect at configure time. Why the particular NetBSD code path is needed ? we should fall into the code above I think ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I do not really get the code duplication. But not sure it s worth the efforts to be honest considering later comments.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can this be even executed - it's in elif section. Basically:

#ifdef HAVE_UUIDGEN
...
#elif HAVE_GETTIMEOFDAY
...
#ifdef HAVE_UUIDGEN

That doesn't make any sense to me. Am I missing anything?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I committed it a bit too early, I should I removed that section that was used in earlier tests. It is fixed now.

@devnexen
Copy link
Member

devnexen commented Apr 3, 2025

Now that I remember, cpython used to use native calls for uuid generations, came back and forth to fix undesired behavior differences and ultimately decided to do one unique implementation in python.

before adding the uuidgen() test in configure
@TimWolla TimWolla self-requested a review April 3, 2025 06:53
@nielsdos
Copy link
Member

nielsdos commented Apr 3, 2025

Honestly, this function is so horrible that I wonder if we can't just replace the implementation with something entirely different but sane, given that deprecating it didn't pass.
I wonder if anyone relies on the documented behaviour that it's time based.

@TimWolla
Copy link
Member

TimWolla commented Apr 3, 2025

I wonder if anyone relies on the documented behaviour that it's time based.

As a weaker constraint that is also guaranteed by the time-based guarantee: The output is monotonically increasing, which is convenient for databases, because new values will not be distributed all over the place.


As I've also explained in my deprecation proposal (https://wiki.php.net/rfc/deprecations_php_8_4#deprecate_uniqid): Alternatives are already available that are better in every regard (except for “succinctness” perhaps) and changing the implementation will result in breaking changes for at least some of the users. Deprecating the function without intent of removal would have been the least impactful solution. Trying to make small incremental changes to uniqid() is just polishing a turd.

Copy link
Member

@TimWolla TimWolla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides the general remarks regarding the state of uniqid(), there are also issues with the proposed implementation.

In any case, I'm requesting an RFC being written to change the uniqid() implementation, due to the breaking changes with regard to documented behavior.

Comment on lines 97 to 100
if (more_entropy) {
n[1] &= 0xffffff;
uniqid = strpprintf(0, "%s%08x%06x.%08x", prefix, n[0], n[1], n[2]);
} else {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This output format is incompatible with the non-uuidgen output. It also leaks parts of the hosts MAC address and also doesn't actually provide “more entropy”, since the mac address is fixed on a single node.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is actually compatible, despite the fact I do not understand how gettimeofday+more entropy produces 14+8 characters, where I would expect 13+8 reading the code. Check it on your own:
$ php -r 'printf("%s\n", uniqid("", 1));'
67ee7f958a5b34.94453710

For the MAC address, I understand it is not included in UUID anymore:
$ uuidgen
bf0ff2ae-cf7a-46b1-bd2e-c3577268e311
$ uuidgen
39638058-e887-41a5-97d1-6451b0622763

But I can understand it may be platform dependent, and we must avoid using it in case it could contains the MAC.

manu0401 added 3 commits April 3, 2025 14:24
on some platforms. Reuse the more_entropy from the gettimeofday()
version instead.

Refactor to avoid code duplication.
@manu0401
Copy link
Contributor Author

manu0401 commented Apr 3, 2025

Some numbers to show how badly slow the original implementation can be, here on a NetBSD 10.0 amd64 Xen domU. First the original version, second the uuidgen() flavor. Note that this is 100 iterations versus 1M.

$ time php -r 'for ($i = 0; $i < 100; $i++) uniqid();'
19.10s real 6.13s user 6.39s system

$ time php -r 'for ($i = 0; $i < 1000000; $i++) uniqid();'
3.38s real 1.73s user 1.65s system

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants