Make `runtime_host` operate over the entire trie rather than only over the storage entries of the trie #639
Conversation
Thanks for the heads-up. Will keep an eye on this.

The state root mismatch is still there. This means that there would be a bug in the code in

The full node now works, but has become insanely slow due to the removal of the cache. Verifying Polkadot block 1 takes 237 seconds on my machine, for example, and blocks 2 to 6 take on average 350 seconds each. I believe that before this PR this wasn't so much of a problem, because we read all the keys at the same time with a single database call. Most of the slowness might be caused by the fact that SQLite has to re-parse our SQL at every single call. Unfortunately, it's not possible to prepare statements in advance, because the library we use has unnecessarily strict lifetime restrictions. cc #632

I'm still going to go ahead with this PR even if it "breaks" the full node, because #272 is more important to fix.
Context
The `runtime_host` module calls the Wasm runtime. When this Wasm runtime wants to access a storage entry, a `StorageGet` is generated, and when this Wasm runtime iterates over keys, a `StorageNextKey` is generated.

In parallel of the execution, the `runtime_host` maintains a "trie cache" that contains the structure of the storage trie. When the runtime writes to the storage, the `runtime_host` updates the structure contained in this cache and invalidates the hashes of the trie nodes that are cached. At the end of the operation, the `runtime_host` looks into this trie structure and recalculates all the hashes that have been invalidated.

The user has the possibility to not provide a cache, in which case the `runtime_host` will iterate over the entire trie structure (using a `StoragePrefixKeys`) in order to populate it.

There are a few problems with this system:
- The trie structure can in theory be very big, and you're not supposed to store it entirely in memory. I can't easily measure its size at the head of the Polkadot chain, but if the trie continues to grow, smoldot can eventually run out of memory. While this isn't a problem in practice yet, and while this PR doesn't solve that problem entirely (we maintain a list of changes during the execution), it is a step towards a solution.
- If no cache is provided, a `StoragePrefixKeys` is generated with a prefix of `&[]` in order to fetch the list of every single key in the trie. If the execution is performed by a light client, it is not possible to know the list of all keys, and we have no choice but to fail the execution. This is the error that happens in #272 (RPC call returns "PrefixKeysForbidden" error code).
- In the context of a full node, when we're at the head of the chain, we might have to verify blocks whose parents aren't the best block. This means that we have to maintain a copy of this cache (and thus of the entire trie structure) for every block that can potentially have children (meaning: the finalized block and every single descendant). If finality stalls for some reason, there can be several thousand such blocks, and thus several thousand copies of the trie structure, which can be quite disastrous.
- The cache is not adapted to child tries support (#166).
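As an aside, servicing a `StorageNextKey` request boils down to an ordered lookup. Here is a minimal sketch using an in-memory `BTreeMap` in place of the real database; the `next_key` helper and the storage layout are hypothetical, not smoldot's actual API:

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Answers a `StorageNextKey`-style request: the key strictly after
/// `key` in lexicographic order, if any. A real full node would do the
/// equivalent with an ordered database query; this sketch uses an
/// in-memory `BTreeMap` instead (hypothetical helper, not smoldot's API).
fn next_key(storage: &BTreeMap<Vec<u8>, Vec<u8>>, key: &[u8]) -> Option<Vec<u8>> {
    storage
        .range((Bound::Excluded(key.to_vec()), Bound::Unbounded))
        .next()
        .map(|(k, _)| k.clone())
}

fn main() {
    let storage: BTreeMap<Vec<u8>, Vec<u8>> = [
        (vec![1, 0], vec![0xaa]),
        (vec![1, 5], vec![0xbb]),
        (vec![2, 0], vec![0xcc]),
    ]
    .into_iter()
    .collect();

    assert_eq!(next_key(&storage, &[1, 0]), Some(vec![1, 5]));
    assert_eq!(next_key(&storage, &[1, 9]), Some(vec![2, 0]));
    assert_eq!(next_key(&storage, &[2, 0]), None);
}
```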
What the PR does
This PR refactors this system by asking the user of the `runtime_host` (and thus all the layers on top of it) to return information about the trie structure rather than only about the storage values. In other words, it is the responsibility of the API user of the `runtime_host` to maintain the trie structure.

The precise API changes are as follows:

- `main_trie_calculation_cache` has been removed everywhere.
- `NextKey` now provides a `key()` and `prefix()` as a list of nibbles rather than a list of bytes. The value that is provided back must also be a list of nibbles rather than a list of bytes.
- `NextKey` now has a `branch_nodes()` function. If it returns `true`, then the API user must return the next trie node (no matter whether it is a branch node or a storage node). If it returns `false`, then the API user must return the next trie node that contains a storage value (i.e. the behavior before this PR).
- The `PrefixKeys` variant has been removed entirely.
- There is a new `MerkleValue` variant that requires the API user to provide the Merkle value of a trie node that was previously provided to `NextKey`. The API user can use `resume_unknown` in order to indicate that it doesn't know this Merkle value and that it must be recalculated no matter what.

In the future, another change I'd like to make is: once the runtime call is successful, the list of all changes to the trie structure should be provided (including the branch nodes that have been inserted or deleted), rather than just the changes to the storage entries.
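To illustrate the switch from bytes to nibbles: a key of N bytes becomes 2N nibbles, and trie nodes can sit at odd-nibble positions that have no byte equivalent, which is why the API has to speak nibbles. A quick sketch of the conversion, assuming the most-significant-half-byte-first ordering (hypothetical helpers, not smoldot's actual API):

```rust
/// Converts a byte-oriented key into its nibble representation,
/// most significant half-byte of each byte first (assumed ordering).
fn bytes_to_nibbles(key: &[u8]) -> Vec<u8> {
    key.iter().flat_map(|b| [b >> 4, b & 0x0f]).collect()
}

/// Converts a list of nibbles back into bytes. Returns `None` if the
/// number of nibbles is odd, since such a position has no byte
/// equivalent (branch nodes can legitimately sit at odd-nibble depths).
fn nibbles_to_bytes(nibbles: &[u8]) -> Option<Vec<u8>> {
    if nibbles.len() % 2 != 0 {
        return None;
    }
    Some(nibbles.chunks(2).map(|c| (c[0] << 4) | c[1]).collect())
}

fn main() {
    let key = [0xab, 0x3c];
    let nibbles = bytes_to_nibbles(&key);
    assert_eq!(nibbles, vec![0xa, 0xb, 0x3, 0xc]);
    assert_eq!(nibbles_to_bytes(&nibbles), Some(vec![0xab, 0x3c]));
    // A trie node can live at an odd number of nibbles:
    assert_eq!(nibbles_to_bytes(&[0xa, 0xb, 0x3]), None);
}
```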
The `consensus_service` is directly impacted by this change. However, it doesn't store the trie structure in the database (as that change would be too big for one PR) but instead dynamically determines what the next branch node is by performing multiple database lookups.

I have considered adding a system to the `runtime_host` so that it internally does what the `consensus_service` does. For example, there could be a configuration option called `branch_nodes_unknown` which, when set to `true`, would guarantee that `NextKey::branch_nodes()` is always `false`. However, I think that this would make a weird API, especially w.r.t. the future change I mention where the diff of the trie structure is provided as well on success.

cc @xlc w.r.t. Chopsticks. I'm opening this as a draft PR in order to make you aware of this change and maybe gather an opinion. I think that you should be able to update Chopsticks to use the same system as what `consensus_service.rs` does in this diff, which is to dynamically determine the branch nodes.

This PR is currently a draft and it's failing due to a storage root mismatch (in other words, there's a bug somewhere).

cc #272 #126
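For reference, dynamically determining the branch nodes works because, in a compressed radix-16 trie, the branch nodes sit exactly at the longest common prefixes of adjacent storage keys in lexicographic order, so they can be recomputed from ordered lookups without ever storing the trie structure. A rough sketch of that idea (hypothetical helpers operating on nibble keys, not the actual `consensus_service.rs` code):

```rust
use std::collections::BTreeSet;

/// Longest common prefix of two nibble slices.
fn common_prefix(a: &[u8], b: &[u8]) -> Vec<u8> {
    a.iter().zip(b).take_while(|(x, y)| x == y).map(|(x, _)| *x).collect()
}

/// Positions (in nibbles) of the internal/branch nodes of the trie
/// whose storage keys are `sorted_keys` (assumed sorted and distinct).
/// In a compressed radix trie, these are the longest common prefixes
/// of adjacent keys; the root position `[]` appears whenever the keys
/// diverge at the very first nibble.
fn branch_nodes(sorted_keys: &[Vec<u8>]) -> BTreeSet<Vec<u8>> {
    sorted_keys
        .windows(2)
        .map(|w| common_prefix(&w[0], &w[1]))
        .collect()
}

fn main() {
    // Keys expressed as nibbles; [0xa, 0x1] and [0xa, 0x2] share the
    // branch node at [0xa], and [0xb] forces a branch at the root.
    let keys = vec![vec![0xa, 0x1], vec![0xa, 0x2], vec![0xb]];
    let branches = branch_nodes(&keys);
    assert!(branches.contains(&vec![0xa]));
    assert!(branches.contains(&vec![])); // the root
}
```

A full node only needs the keys immediately around the current position, so in practice this is a couple of ordered database queries per `NextKey`, not a full scan.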