Simplify LMDB resize #1804

Open · wants to merge 10 commits into dev from doyle-simplify-lmdb-resize

Conversation

Doy-lee
Collaborator

@Doy-lee Doy-lee commented Mar 6, 2025

Removes some unused headers, enforces the code path defined by ENABLE_AUTO_RESIZE (that macro has been defined for 10 years and isn't ever going away), and replaces the previous LMDB resize logic with something simpler that resembles dynamic array growth: always grow by 1 GiB when the map is 90% full, instead of checking the last 100 blocks to estimate the size of a batch of blocks being written to the DB, or using some other heuristic to grow the DB.

The meat of the changes:

The resize logic before this change was quite intricate: it read the
previous window of blocks to get an average block size, then applied a
fudge factor on top of it to try and overshoot the expected number of
bytes required so that writing the objects to the DB doesn't fail.

It was actually still possible for this to fail if a batch of blocks was
larger than the remaining space in the DB, because the DB inherently
could not be resized in the middle of a batch.

This is all nice machinery, but we don't need it if the resizing logic
itself is sensible:

1. We always increase the size of the DB by 1 GiB. Before this, we'd
enumerate the size of the payload, check whether it would overflow the
map, and then run some logic to choose a reasonable new size.

Since we now only grow in 1 GiB increments, to ensure this doesn't break
we need to ensure that no blockchain payload can exceed 1 GiB. The
current default syncing window is 100 blocks per payload (and increasing
it much above this is quite dicey, as the node can't support much more
without hitting internal errors), so for 1 GiB to be reasonable it must
not be possible for a 100-block payload to exceed 1 GiB. It's reasonable
to assume that we won't see a window of 100 blocks of 10 MiB each
without hitting the block weight limit or some fatal exploit that bloats
the blocks.

Even if this were possible, however, I've added an additional change
(point 2 below) which ensures that in practice writing to the DB will
always work.

2. Allow a batch to be paused mid-write to resize the DB before continuing the
batch. This was as simple as stopping the batch before resize and
restarting the batch afterwards.

This now means that the _only_ way to break a write to the DB is to
insert a _single_ object that is larger than 1 GiB. The only individual
item even remotely close to this is the SNL blob. Currently, with at
most 2 years' worth of 10k-block intervals and the last 60 SNL states,
that consumes ~140 MiB, which leaves plenty of leeway to grow.

As a result, under these assumptions we can drop the complicated resize
logic for a simpler resizing scheme (sketched below).
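
For concreteness, here is a minimal sketch of what that check could look like
against the stock LMDB C API. This is not the PR's actual code: the function
name `maybe_resize`, the placement of the 90%/1 GiB thresholds, and the
omission of error handling are illustrative assumptions.

```cpp
#include <lmdb.h>
#include <cstdint>

// Fixed growth step, matching the policy described above.
constexpr uint64_t GROW_STEP = uint64_t{1} << 30; // 1 GiB

// Sketch only: error handling on the mdb_* calls is omitted.
void maybe_resize(MDB_env* env)
{
    MDB_envinfo info{};
    MDB_stat    stat{};
    mdb_env_info(env, &info);
    mdb_env_stat(env, &stat);

    const uint64_t map_size = info.me_mapsize;
    // Approximate used space: pages used so far * page size.
    const uint64_t used = uint64_t{stat.ms_psize} * (info.me_last_pgno + 1);

    // Only grow once the map is ~90% full, and always by a fixed 1 GiB step.
    if (used < (map_size / 10) * 9)
        return;

    // The new size must be a multiple of the page size.
    uint64_t new_size = map_size + GROW_STEP;
    new_size = ((new_size + stat.ms_psize - 1) / stat.ms_psize) * stat.ms_psize;

    // mdb_env_set_mapsize requires no active transactions, so per point 2
    // above an in-flight batch would be stopped here and restarted after
    // the resize.
    mdb_env_set_mapsize(env, new_size);
}
```

The fixed step is what keeps the policy trivial: there is no per-batch size
estimation to get wrong, the map simply grows whenever it approaches full.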

--

Currently letting the node resync the chain from scratch to see if there are any issues. Will update when that is done successfully.

@Doy-lee Doy-lee requested a review from tewinget March 6, 2025 06:19
Doy-lee added 9 commits March 12, 2025 13:28
ENABLE_AUTO_RESIZE was set 10 years ago and hasn't ever changed, so we
can remove the macro.
There's nothing different between ARM and x86_64, that I'm aware of,
that would necessitate the default LMDB map size being different
((1 << 30) vs (1 << 31)).

Even though we're decreasing the map size on ARM to (1 << 30), the LMDB
map only ever grows, so this will only affect newly created DBs.
The previous logic used an incorrect operation to round the requested
map size up to a value aligned with the page size. LMDB's set-map-size
API notes that the size passed in should always be a multiple of the DB
page size (see the sketch below).
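
To illustrate the fix, a minimal sketch of the corrected rounding, assuming a
hypothetical helper named `round_up_to_page` (the real code may differ): the
requested size is rounded *up* to the next multiple of the page size before
being handed to the set-map-size call.

```cpp
#include <cstdint>

// Round `requested` up to the next multiple of `page_size` (page_size > 0).
uint64_t round_up_to_page(uint64_t requested, uint64_t page_size)
{
    return ((requested + page_size - 1) / page_size) * page_size;
}

// e.g. with a 4096-byte page size: round_up_to_page(1'000'000, 4096) == 1'003'520
```
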
Resizing on demand breaks the transactional nature of the batch: if you
commit a block blob but then fail for whatever reason to commit its TXs
(an IO error, or some other bug), the DB ends up with the block but not
the TXs.

This is problematic because, previously, aborting the batch would have
reverted all of it. We can preserve that behaviour by moving the resize
logic back to the start of the batch instead of resizing on demand while
writing to the DB.

This actually simplifies the code, as resizing only happens in one spot.
All writes to the DB happen when receiving a block over the P2P network,
so this is the only entry point where writes into the DB are permitted
(in other words, there's no other place this resize check needs to
happen); a sketch of this flow follows below.
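
To illustrate the flow this describes, here is a minimal sketch with
hypothetical names (`Db`, `check_and_resize`, and the `batch_*` helpers are
stand-ins, not the actual interface): the resize check runs once at the single
entry point where blocks are written, and the batch around the writes stays
fully transactional.

```cpp
#include <vector>

struct Block { /* block blob plus its transactions */ };

// Stand-in for the real blockchain DB interface; definitions elided.
struct Db {
    void check_and_resize();        // grow the map by 1 GiB if ~90% full
    void batch_start();             // open the write transaction
    void add_block(const Block& b); // write the block and all of its TXs
    void batch_commit();            // commit everything atomically
    void batch_abort();             // revert everything written in this batch
};

// The single entry point for DB writes: blocks arriving over the P2P network.
void handle_incoming_blocks(Db& db, const std::vector<Block>& blocks)
{
    db.check_and_resize(); // the only place the resize check needs to happen
    db.batch_start();
    try {
        for (const Block& b : blocks)
            db.add_block(b);
        db.batch_commit();
    } catch (...) {
        db.batch_abort();  // a failure reverts blocks and TXs together
        throw;
    }
}
```
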
@Doy-lee Doy-lee force-pushed the doyle-simplify-lmdb-resize branch from fb38806 to 3f5e65f on March 12, 2025 02:28