Wasm Contracts: Implement Code Merkleization #122
Comments
I had a talk with @bkchr and we came to the conclusion that this should be possible. However, what was less clear is how that would be implemented in terms of APIs. I gave this some thought and wrote down a proposal on how this could be integrated into substrate. It also has a usage example which shows how I intend to use this from Please make sure to also read the doc comments where I put down some additional thoughts:
After a chat with @gavofyork it became clear that runtime code should never have the power to introduce consensus errors. Allowing the runtime to include custom data in the PoV would introduce this new class of errors to the runtime. Instead, we should come up with a data structure implemented by the client that achieves our goal of not including all the data accessed by the collator in the PoV.
Following up on my last comment: It is true that a collator can produce a block that is valid from its own point of view but not from the point of view of the PvF. This is one of Polkadot's main benefits: You don't need to put a lot of economic value behind collators when launching a chain. That said, this is generally not possible while using Cumulus (modulo bugs in its implementation). The point of using it is to make the split between block production and block validation transparent to the runtime author. Or in other words: While writing your runtime you get the PvF for free. They are symmetrical. That is the whole point of Cumulus. Code merkleization would require the introduction of asymmetry. In order to implement it we would need to do one of the following:
The downside of 1) is that it introduces a potential foot gun into FRAME + Cumulus: Users might use the feature in a wrong way and may produce invalid blocks by accident. However, I think this is largely okay and doesn't impact the overall FRAME developer experience:
Moving into blocked because more design discussion is needed. Also, there are alternatives for dealing with code size problems. Maybe we don't even need merkleization:
No longer planned: Instead, we should allow storage of contract code on the relay chain for an additional deposit, in case proof size turns out to be a bottleneck.
Glossary
(block_header, extrinsics, storage_proof)
Motivation
In the current pallet-contracts implementation, the whole wasm binary of a contract needs to be loaded from storage into memory before it can be executed. Therefore, any call transaction causes all of the contract's code to be included in the proof of validity (PoV) when executed on a parachain. This is because, right now, the merkle proof that is sent from the collator to the validator is automatically recorded from the accessed storage items.

One way of reducing the PoV size that we want to explore is to opt out of this automatic recording for the (big) storage item that contains the contract code and instead provide a custom merkle proof containing only the accessed parts of the contract.
Description
We leave the actual storage as-is (contract stays in a single storage item) but record which parts of a contract are executed when running a contract call in the substrate runtime (as opposed to PVF). This information can then be used to stitch together a partial wasm module in the PVF (run by the validator). Note that this partial wasm module would be constructed in a way that it can be executed by any execution engine.
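The recording step described above could look roughly like the following. This is only a sketch under assumptions: `AccessRecorder`, `on_call`, and `partial_module` are hypothetical names, not part of pallet-contracts or substrate, and a contract is modelled as a plain list of function bodies rather than a real wasm module.

```rust
use std::collections::BTreeSet;

/// Hypothetical recorder: remembers which function indices were
/// entered while the contract ran in the substrate runtime.
struct AccessRecorder {
    touched: BTreeSet<u32>,
}

impl AccessRecorder {
    fn new() -> Self {
        AccessRecorder { touched: BTreeSet::new() }
    }

    /// Would be invoked by the execution engine on every function entry.
    fn on_call(&mut self, func_index: u32) {
        self.touched.insert(func_index);
    }

    /// Returns only the function bodies that were actually executed,
    /// paired with their original index and sorted by that index.
    /// Indices outside the module are ignored.
    fn partial_module<'a>(&self, functions: &'a [Vec<u8>]) -> Vec<(u32, &'a [u8])> {
        self.touched
            .iter()
            .filter_map(|&i| functions.get(i as usize).map(|f| (i, f.as_slice())))
            .collect()
    }
}
```

The list returned by `partial_module` stands in for the data from which a partial wasm module would be stitched together on the validator side; how the untouched functions are stubbed out is left open here.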
The code would be merkleized at deploy time by the substrate runtime according to our chunking strategy (one chunk per function to keep it simple for now). The root of that merkle tree would be put in regular storage, and a proof against that root would be put into the PoV for every call of that contract. The code is still stored as a single storage item. The merkle tree is only for the proof that is received by the PVF.
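To make the chunk-per-function idea concrete, here is a minimal sketch of a binary merkle tree over code chunks, with proof generation and verification. Everything in it is an assumption for illustration: the function names are made up, the tree layout (an odd node is paired with itself) is just one possible choice, and `DefaultHasher` from the Rust standard library is a stand-in that is not cryptographically secure. A real implementation would use a proper hash function.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Assumption: a 64-bit, NON-cryptographic hash, used only to keep the sketch std-only.
type Hash = u64;

fn hash_leaf(chunk: &[u8]) -> Hash {
    let mut h = DefaultHasher::new();
    h.write_u8(0); // domain separation: leaf
    h.write(chunk);
    h.finish()
}

fn hash_node(left: Hash, right: Hash) -> Hash {
    let mut h = DefaultHasher::new();
    h.write_u8(1); // domain separation: inner node
    h.write_u64(left);
    h.write_u64(right);
    h.finish()
}

/// Builds all tree levels, leaves first. One chunk corresponds to one function.
fn build_levels(chunks: &[Vec<u8>]) -> Vec<Vec<Hash>> {
    assert!(!chunks.is_empty(), "need at least one chunk");
    let mut levels = vec![chunks.iter().map(|c| hash_leaf(c)).collect::<Vec<Hash>>()];
    while levels.last().unwrap().len() > 1 {
        let prev = levels.last().unwrap();
        // An odd node at the end of a level is paired with itself.
        let next: Vec<Hash> = prev
            .chunks(2)
            .map(|pair| hash_node(pair[0], *pair.get(1).unwrap_or(&pair[0])))
            .collect();
        levels.push(next);
    }
    levels
}

/// The value that would be kept in regular storage.
fn root(levels: &[Vec<Hash>]) -> Hash {
    levels.last().unwrap()[0]
}

/// Sibling hashes from leaf to root for the chunk at `index`.
fn prove(levels: &[Vec<Hash>], mut index: usize) -> Vec<Hash> {
    let mut proof = Vec::new();
    for level in &levels[..levels.len() - 1] {
        let sibling = index ^ 1;
        proof.push(*level.get(sibling).unwrap_or(&level[index]));
        index /= 2;
    }
    proof
}

/// What the validator side would do: recompute the root from one chunk and its proof.
fn verify(root: Hash, chunk: &[u8], mut index: usize, proof: &[Hash]) -> bool {
    let mut acc = hash_leaf(chunk);
    for sibling in proof {
        acc = if index % 2 == 0 {
            hash_node(acc, *sibling)
        } else {
            hash_node(*sibling, acc)
        };
        index /= 2;
    }
    acc == root
}
```

In this sketch, `build_levels` and `root` would run at deploy time, `prove` would run on the collator for each executed function, and `verify` would run in the PVF against the root read from storage.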
This approach has several advantages:
However, this can't be implemented with the current state of substrate/cumulus because any storage access of the substrate runtime is recorded and put into the PoV. One idea to resolve that is to give the runtime logic more control over the storage contents of the PoV: We think that we need to exclude the access to the original monolithic wasm code entry from the automatically generated storage proof and instead include our custom proof. In order to do so, cumulus would need to be modified to allow its users (the actual runtime, the pallets, etc.) to put custom data into the ParachainBlockData (which is essentially the PoV). This is a departure from the current design, where the runtime logic is oblivious to the fact that it is running as a parachain. This might sound bad, but it will enable use cases where pallets can provide data much more efficiently to the validators:
Progress