forked from monero-project/monero
Simplify LMDB resize #1804
Open
Doy-lee wants to merge 10 commits into oxen-io:dev from Doy-lee:doyle-simplify-lmdb-resize
+99
−289
Conversation
tewinget reviewed Mar 7, 2025
jagerman approved these changes Mar 11, 2025
ENABLE_AUTO_RESIZE was set 10 years ago and has never changed, so we can remove the macro.
There's no difference between ARM and x86_64 that I'm aware of that would necessitate different default LMDB map sizes ((1 << 30) vs (1 << 31)). Even though we're decreasing the default map size on ARM to (1 << 30), the LMDB map only ever grows, so this will only affect newly created DBs.
The resize logic before this change was quite intricate: it read the previous window of blocks for the average size, then applied a fudge factor on top of it to try to overshoot the expected number of bytes required, so that writing the objects to the DB wouldn't fail. It was actually still possible for this to fail if a batch of blocks was larger than the remaining space in the DB, because the DB inherently could not resize in the middle of a batch. This is all nice machinery that we don't necessarily need, given some sensible resizing logic:

1. We always increase the size of the DB by 1 GiB. Before this, we'd enumerate the size of the payload, check if we overflowed, and then do some logic to choose a reasonable size. Since we now only increase in 1 GiB increments, to ensure this doesn't break we need to ensure that no blockchain payload can exceed 1 GiB. The current default syncing window is 100 blocks per payload (and increasing it much above this is quite dicey, as the node can't actually support much more than this without encountering internal errors), so for 1 GiB to be reasonable it must not be possible for a 100-block payload to exceed 1 GiB. It's reasonable to assume that we won't see a window of 100 blocks of 10 MiB each without hitting some block weight limit or some fatal exploit that bloats up the blocks. Even if this were possible, however, I've added an additional change which ensures that in practice writing to the DB will always work, as follows.

2. Allow a batch to be paused mid-write to resize the DB before continuing the batch. This was as simple as stopping the batch before the resize and restarting the batch afterwards. This now means that the _only_ way to break a write to the DB is to insert a _single_ object that is larger than 1 GiB. The only individual item even remotely close to this is the SNL blob.
Currently, with at most 2 years' worth of 10k intervals and the last 60 SNL states, that consumes ~140 MiB, which leaves plenty of leeway to grow. As a result, under these assumptions we can drop the complicated resize logic for a simpler resizing scheme.
The previous logic performed the rounding of the requested map size to a page-size-aligned value incorrectly. LMDB's set-map-size API notes that the size passed in should always be a multiple of the DB page size.
Resizing on demand breaks the transactional nature of the batch: if you commit a block blob but fail for whatever reason to commit the TXs (an IO error, or some other bug), the DB ends up with the block but not the TXs. This is problematic because previously that would have been reverted by aborting the batch. We can preserve that behaviour by moving the resize logic back to the start of the batch instead of resizing on demand when writing to the DB. This actually simplifies the code, as resizing only happens in one spot. All writes to the DB happen when receiving a block over the P2P network, so that is the only entrypoint where writes into the DB are permitted (in other words, there's no other place this resize check needs to happen).
Force-pushed from fb38806 to 3f5e65f
Removes some unused headers. Makes the code gated by ENABLE_AUTO_RESIZE unconditional (that macro has been defined for 10 years and isn't ever going away), and replaces the previous LMDB resize logic with something simpler, resembling dynamic array growth (always grow by 1 GiB when 90% full, versus checking the last 100 blocks to estimate the size of a batch of blocks being written to the DB, or using some other heuristic to grow the DB).
The meat of the changes:
--
Currently letting the node resync the chain from scratch to see if there are any issues. Will update when that is done successfully.