pool: Improve share UUID with random value. #390

davecgh · 2023-09-21T01:26:12Z

This is rebased on #389.

The postgres tests currently fail when running on fast systems due to the lack of a generating unique share IDs.

After digging into the failure a bit, the root cause is that the UUID used for the share id is not created with any randomness to it.

As the test failure demonstrates, it is generally not a good idea to rely solely on timestamps for generating supposedly unique values. While it is fairly unlikely that two shares will come in at the same nanosecond in most cases, there are a variety of factors that can lead to such a situation.

So, this appends a random value to the share UUID in order to help ensure a unique overall value and updates the callers accordingly. It also updates the database upgrade code to ensure the old share id format is used when performing the old version upgrade.

jholdstock · 2023-09-21T11:01:57Z

This looks fine overall, but its gotten me thinking.

Given that there are no known deployments of dcrpool at the moment I guess we can play a bit loose with it, but ideally we would probably want a brand new upgrade to move all existing shares to the new ID format.

But, if we are to assume there are no deployments and throw caution to the wind, we could probably just throw away all of upgrade code and start from scratch. Any thoughts?

davecgh · 2023-09-21T14:33:32Z

There really no reason to do an upgrade here imo. I called this out in the final commit message:

A new database upgrade is not needed because any existing entries are
necessarily already unique or they would've hit duplicate constraint
failures and there is not any code that explicitly relies on the key
being a specific length beyond the first 8 bytes being the timestamp.

Ideally, a v4 UUID would be created according to RFC 4122 instead,
however, that would be a much more invasive change since the current
bolt database code uses the UUID as a key and relies on the timestamp
being the first 8 bytes in the key for filtering purposes.

As mentioned, the only thing that cares about the format is the bolt code which looks at the first 8 bytes and I specifically didn't change them. Any future code that did care could very easily check the length.

davecgh · 2023-09-21T14:35:06Z

But, if we are to assume there are no deployments and throw caution to the wind, we could probably just throw away all of upgrade code and start from scratch. Any thoughts?

We could do that, but I don't see any reason to. I was careful not to break anything with this and old upgrades still work as intended.

This modifies the pool tests to avoid cleaning up the postgres data when there is a test failure and the -failfast test flag is set so the contents of the database can be examined. Since a failed run with -failfast will no longer purge the data, it also adds logic to purge the data prior to running the tests to ensure any previous failed runs are cleaned up on subsequent runs.

The postgres tests currently fail when running on fast systems due to the lack of a generating unique share IDs. After digging into the failure a bit, the root cause is that the supposed UUID used for the share id is not created with any randomness to it. As the test failure demonstrates, it is generally not a good idea to rely solely on timestamps for generating supposedly unique values. While it is fairly unlikely that two shares will come in at the same nanosecond in most cases, there are a variety of factors that can lead to such a situation. So, this appends a random value to the share UUID in order to help ensure a unique overall value and updates the callers accordingly. It also updates the database upgrade code to ensure the old share id format is used when performing the old version upgrade. A new database upgrade is not needed because any existing entries are necessarily already unique or they would've hit duplicate constraint failures and there is not any code that explicitly relies on the key being a specific length beyond the first 8 bytes being the timestamp. Ideally, a v4 UUID would be created according to RFC 4122 instead, however, that would be a much more invasive change since the current bolt database code uses the UUID as a key and relies on the timestamp being the first 8 bytes in the key for filtering purposes.

dajohi approved these changes Sep 21, 2023

View reviewed changes

davecgh added 2 commits September 21, 2023 11:26

davecgh force-pushed the rng_in_share_uuid branch from b5563cb to 1741cc7 Compare September 22, 2023 01:11

jholdstock merged commit 445e130 into decred:master Sep 22, 2023

davecgh deleted the rng_in_share_uuid branch September 22, 2023 08:56

jholdstock mentioned this pull request Sep 22, 2023

Revert bbolt database to version 1. #391

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

pool: Improve share UUID with random value. #390

pool: Improve share UUID with random value. #390

davecgh commented Sep 21, 2023

jholdstock commented Sep 21, 2023

davecgh commented Sep 21, 2023

davecgh commented Sep 21, 2023

pool: Improve share UUID with random value. #390

pool: Improve share UUID with random value. #390

Conversation

davecgh commented Sep 21, 2023

jholdstock commented Sep 21, 2023

davecgh commented Sep 21, 2023

davecgh commented Sep 21, 2023