
Wasm Contracts: Implement Code Merkleization #122

Closed
1 of 4 tasks
Tracked by #9354
athei opened this issue Jul 25, 2021 · 4 comments
Labels
I6-meta A specific issue for grouping tasks or bugs of a specific category. I9-optimisation An enhancement to provide better overall performance in terms of time-to-completion for a task.

Comments

athei commented Jul 25, 2021

Glossary

  • Substrate Runtime: The execution path of the Cumulus-derived runtime that runs on the parachain nodes (full node, collator).
  • PVF: The execution path of the Cumulus-derived runtime that runs on the relay chain validators.
  • PoV: Data passed by collators to the validators for validation. For our case we can assume that to be (block_header, extrinsics, storage_proof)

Motivation

In the current pallet-contracts implementation the whole wasm binary of a contract needs to be loaded from storage into memory before it can be executed. Therefore any call transaction causes all of the code to be included in the PoV when executed on a parachain. This is because right now the merkle proof that is sent from the collator to the validator is automatically recorded from the accessed storage items.

One way of reducing the PoV size we want to explore is to opt out of this automatic recording for the (big) storage item that contains the contract code and instead provide a custom merkle proof containing only the accessed parts of the contract.

Description

We leave the actual storage as-is (the contract stays in a single storage item) but record which parts of a contract are executed when running a contract call in the substrate runtime (as opposed to the PVF). This information can then be used to stitch together a partial wasm module in the PVF (run by the validator). Note that this partial wasm module would be constructed in such a way that it can be executed by any execution engine.

The code would be merkleized at deploy time by the substrate runtime according to our chunking strategy (one chunk per function, to keep it simple for now). The root of that merkle tree would be put into regular storage, and a proof against that root is put into the PoV for every call of that contract. The code itself is still stored as a single storage item; the merkle tree exists only for the proof that is received by the PVF.
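To make the chunking idea concrete, here is a minimal sketch of building a merkle root over one chunk per function. All names and the chunking input are hypothetical, and `DefaultHasher` is a stand-in for a cryptographic hash (Substrate would use something like blake2); this is an illustration of the scheme, not the pallet-contracts implementation.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in hash: a real implementation would use a cryptographic hash
// (e.g. blake2, as elsewhere in Substrate), not DefaultHasher.
fn hash_chunk(data: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    data.hash(&mut h);
    h.finish()
}

fn hash_pair(a: u64, b: u64) -> u64 {
    let mut h = DefaultHasher::new();
    a.hash(&mut h);
    b.hash(&mut h);
    h.finish()
}

/// Fold one hash per function body up to a single merkle root.
/// An odd node at the end of a layer is carried up unchanged.
fn merkle_root(chunks: &[&[u8]]) -> u64 {
    let mut layer: Vec<u64> = chunks.iter().map(|c| hash_chunk(c)).collect();
    assert!(!layer.is_empty(), "a contract has at least one function");
    while layer.len() > 1 {
        layer = layer
            .chunks(2)
            .map(|p| if p.len() == 2 { hash_pair(p[0], p[1]) } else { p[0] })
            .collect();
    }
    layer[0]
}

fn main() {
    // Hypothetical per-function chunks of a deployed contract.
    let funcs = [&b"fn_call"[..], &b"fn_transfer"[..], &b"fn_deposit"[..]];
    let root = merkle_root(&funcs);
    // `root` would live in regular storage; each call ships a witness
    // against it that covers only the executed functions.
    println!("code merkle root: {root:#018x}");
}
```

A call proof would then contain only the chunks (plus sibling hashes) for the functions that actually ran, which is where the PoV savings come from.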

This approach has several advantages:

  • No need to add changes or complexity to execution engines
    • Coverage collection (selecting which chunks are used) will be done by instrumentation (not in the execution engine)
    • wasmer, wasmi, ... can be used as-is
  • No additional storage access overhead in the substrate runtime since code stays in a single storage item
  • Easier storage migration when chunking strategy or tree flavor changes

However, this can't be implemented with the current state of substrate/cumulus because any storage access of the substrate runtime is recorded and put into the PoV. One idea to resolve that is to give the runtime logic more control over the storage contents of the PoV: We think that we need to exclude the access to the original monolithic wasm code entry from the automatically generated storage proof and instead include our custom proof. In order to do so, Cumulus would need to be modified to allow its users (the actual runtime, the pallets, etc.) to put custom data into the ParachainBlockData (which is essentially the PoV).

This is a departure from the current design where the runtime logic is oblivious to the fact that it is running as a parachain. This might sound bad but it will enable use cases where pallets can provide data much more efficiently to the validators:

  • No need to put data in storage just to send it to the validator.
  • Using proof systems that are specifically tailored to the use case.
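One possible shape for such an opt-in escape hatch is sketched below. Every name here (`CustomPovData`, `ExecutedCodeChunks`) is invented for illustration and is NOT the real Cumulus API; the toy `verify` checks a plain hash where real code would verify a merkle witness.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in for a cryptographic hash; illustration only.
fn hash(data: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    data.hash(&mut h);
    h.finish()
}

/// Hypothetical trait for a pallet that opts out of automatic
/// storage-proof recording for one item and supplies its own proof.
trait CustomPovData {
    /// Collator side: produce the custom proof, e.g. a merkle witness
    /// covering only the code chunks that were actually executed.
    fn produce(&self) -> Vec<u8>;
    /// Validator (PVF) side: check the proof against a commitment that
    /// is kept in regular (automatically proven) storage.
    fn verify(proof: &[u8], commitment: u64) -> bool;
}

/// Toy implementation: the "proof" is the raw chunk data and the
/// commitment is its hash. Real code would verify a merkle witness.
struct ExecutedCodeChunks {
    chunks: Vec<u8>,
}

impl CustomPovData for ExecutedCodeChunks {
    fn produce(&self) -> Vec<u8> {
        self.chunks.clone()
    }
    fn verify(proof: &[u8], commitment: u64) -> bool {
        hash(proof) == commitment
    }
}

fn main() {
    // `\0asm` magic bytes standing in for executed code chunks.
    let pallet = ExecutedCodeChunks { chunks: vec![0x00, 0x61, 0x73, 0x6d] };
    let proof = pallet.produce();
    let commitment = hash(&pallet.chunks);
    assert!(ExecutedCodeChunks::verify(&proof, commitment));
}
```

The key property is the split: `produce` runs only on the collator, `verify` only in the PVF, while the commitment travels through ordinary, automatically proven storage.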

Progress

  • Figure out the proper chunking strategy by writing instrumentation/tooling that can be used to observe the execution of contracts (Add contracts coverage tracing substrate#9481): We use one chunk per function as an MVP because we can change this later.
  • Figure out a way to change Cumulus and FRAME to include a custom proof.
  • Figure out which trie type and node key strategy minimize the merkle witness.
  • Implement code merkleization in pallet_contracts.
athei commented Oct 6, 2021

I had a talk with @bkchr and we came to the conclusion that this should be possible. However, what was less clear is how that would be implemented in terms of APIs. I gave this some thought and wrote down a proposal on how this could be integrated into Substrate. It also has a usage example which shows how I intend to use this from pallet_contracts.

Please make sure to also read the doc comments where I put down some additional thoughts:
https://gist.github.com/athei/5df72bc02c44f342338fdb66b2269619

@athei athei added I9-optimisation An enhancement to provide better overall performance in terms of time-to-completion for a task. Z5-epic labels Nov 9, 2021
athei commented Jan 3, 2022

After a chat with @gavofyork it became clear that runtime code should never have the power to introduce consensus errors. Allowing the runtime to include custom data into the PoV would introduce this new class of errors into the runtime.

Instead, we should come up with a data structure implemented by the client that achieves our goal of not including all the data accessed by the collator into the PoV.

athei commented Mar 19, 2023

Following up on my last comment: It is true that a collator can produce a block that is valid from its own point of view but not from the point of view of the PVF. This is one of Polkadot's main benefits: You don't need to put a lot of economic value behind collators when launching a chain.

That said, this is generally not possible while using Cumulus (modulo bugs in its implementation). The point of using it is to make the split between block production and block validation transparent to the runtime author. Or in other words: While writing your runtime you get the PVF for free. They are symmetrical. That is the whole point of Cumulus.

Code merkleization would require the introduction of asymmetry. In order to implement it we would need to do one of the following:

  1. Modify Cumulus to allow for asymmetry. For example, allowing custom data to be added to the PoV during block production and processed during validation. Right now, differences between production and validation are handled transparently by the client.

  2. Abstract that asymmetry away through host functions (doubling down on the Cumulus approach).

  3. Use neither FRAME nor Cumulus for pallet-contracts but build directly on top of the Polkadot protocol.

Option 3) is a non-starter because integration into FRAME is one of the big selling points of pallet-contracts. That leaves us with 1) and 2). My stance on this is that if we can accomplish a feature in user space (the runtime), we should do so, since adding new host functions puts an additional burden on each and every new client implementation.

The downside of 1) is that it introduces a potential footgun into FRAME + Cumulus: Users might use the feature incorrectly and produce invalid blocks by accident. However, I think this is largely okay and doesn't impact the overall FRAME developer experience:

  1. It will be an isolated and opt-in feature, only to be used when necessary. We will offer proper abstractions in FRAME/Cumulus for it that limit the asymmetric code to well-defined blocks (akin to Rust's unsafe).
  2. While potentially buggy, the user-written PVF code will still be deterministic over its inputs, because this is how we will design our abstractions.
  3. It will open the door for experimentation with other asymmetric solutions on top of it (like zero-knowledge proofs). If we don't allow for asymmetry on top of FRAME, this requires writing a runtime from scratch or forking Cumulus. That is hard, and builders who want to do something like this might turn elsewhere. It is great that the Polkadot protocol is flexible enough to allow for all kinds of parachains, but the hurdles to actually leverage that are quite high as of today. Allowing it on top of FRAME would make this drastically more approachable.

Blocked

Moving this into Blocked because more design discussion is needed. Also, there are alternatives for dealing with code size problems; maybe we don't even need merkleization:

  • Storing popular contracts at the relay chain for a deposit. Can't find the issue for that (I think there is one).
  • Contract authors can decompose their contracts into pieces and call them as libraries via delegate_call. This is popular on EVM to get around the 24KB size limit.

@athei athei transferred this issue from paritytech/substrate Aug 24, 2023
@the-right-joyce the-right-joyce added I6-meta A specific issue for grouping tasks or bugs of a specific category. and removed Z5-epic labels Aug 25, 2023
claravanstaden referenced this issue in Snowfork/polkadot-sdk Dec 8, 2023
helin6 pushed a commit to boolnetwork/polkadot-sdk that referenced this issue Feb 5, 2024
athei commented Nov 28, 2024

No longer planned: Instead, we should allow storing contract code on the relay chain for an additional deposit, in case proof size turns out to be a bottleneck.

athei closed this as not planned Nov 28, 2024
@github-project-automation github-project-automation bot moved this from Blocked ⛔️ to Done ✅ in Smart Contracts Nov 28, 2024