[Feature]: state modifier for testing #18706

Closed
tac0turtle opened this issue Dec 12, 2023 · 12 comments · Fixed by #19280

Comments

@tac0turtle
Member

tac0turtle commented Dec 12, 2023

Summary

Today, if a chain would like to use their mainnet state to run tests on, they need to:

  • export their state to genesis (requires large machines)
  • modify the genesis file
  • import the genesis file (requires a large machine)

These steps are resource-intensive because the entire genesis is kept in memory until the end.

We should offer a simpler approach in which an application developer writes a state transition, either in a module or outside of one, that modifies the existing state into the state they need to start the chain.

Problem Definition

Exporting and importing state is a large burden due to resource constraints.

Proposed Feature

I see two options, but there may be more:

  1. We write state transition functions, behind a build flag, that modify state in accordance with the needs of the testing environment.
  • The simplest state transition we know we need is moving all validator power to a single address in order to be able to restart the chain. If there is a need for many nodes, we can make the input an array of addresses across which the voting power is distributed equally.
  2. We write a tool that imports certain modules and performs the state transition. This would require us to create the store and input the keys for a chain using the multistore. By default we would offer the validator state transition, and possibly one that moves bank balances to new addresses in order to have funds on chain.
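
As a rough illustration of the "distribute voting power equally" input described above, here is a minimal Go sketch. It only shows the arithmetic of the split. The function name `distributePower` and the plain string addresses are hypothetical, and the sketch does not touch the staking module, whose real state (delegations, tokens, and so on) is richer than a single power value.

```go
package main

import (
	"errors"
	"fmt"
)

// distributePower splits total voting power equally across addrs.
// Integer division can leave a remainder; it is handed out one unit at a
// time to the first addresses so that the sum of all shares equals total.
// NOTE: hypothetical helper for illustration, not a Cosmos SDK API.
func distributePower(total int64, addrs []string) (map[string]int64, error) {
	if len(addrs) == 0 {
		return nil, errors.New("at least one address is required")
	}
	if total < 0 {
		return nil, errors.New("total power must not be negative")
	}
	n := int64(len(addrs))
	share, rem := total/n, total%n
	out := make(map[string]int64, len(addrs))
	for i, addr := range addrs {
		power := share
		if int64(i) < rem {
			power++
		}
		out[addr] += power // a repeated address accumulates its shares
	}
	return out, nil
}

func main() {
	addrs := []string{"val-a", "val-b", "val-c"} // placeholder addresses
	powers, err := distributePower(10, addrs)
	if err != nil {
		panic(err)
	}
	for _, addr := range addrs {
		fmt.Println(addr, powers[addr])
	}
}
```

With a single address in the input, the same function covers the "move all validator power to one address" case.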

I tend to prefer option two, since it's separate from the main binary and can be distributed separately in the case of disaster recovery.

To test, we should be able to take the Cosmos Hub mainnet database, modify it to have a single validator, and start the chain.

It's unclear how this will work with CometBFT's database. We may need to modify its state as well if InitChain doesn't do it all.

@github-project-automation github-project-automation bot moved this to 👀 To Do in Cosmos-SDK Dec 12, 2023
@tac0turtle tac0turtle moved this from 👀 To Do to 🧑‍🔧 Needs Design in Cosmos-SDK Dec 12, 2023
@cool-develope
Contributor

cool-develope commented Dec 12, 2023

It would be great to provide a feature similar to the Hardhat fork; dumping the full data requires an enormous amount of work.
Maybe we can customize state sync for specific modules and accounts.

@alexanderbez
Contributor

What I've seen so far is that teams write Python or similar scripts to achieve this. However, such an approach still requires loading the entire genesis into memory, which might be a no-go. So if you could stream the genesis to load the exact module you want into memory, that would help a lot. From there, you can modify that module's state in memory and then write it back to the genesis file.
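
A minimal sketch of the streaming idea, using only Go's standard `encoding/json` token API: it walks the genesis document, skips everything it does not need token by token, and decodes only one module's entry under `app_state`. The helper names (`seekKey`, `skipValue`, `extractModule`, `moduleFromString`) and the tiny sample document are made up for illustration. Writing the modified module back into the file is not shown.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// sampleGenesis is a tiny stand-in for a real genesis file.
const sampleGenesis = `{"chain_id":"test-1","app_state":{"bank":{"balances":[1,2,3]},"staking":{"validators":["v1"]}}}`

// skipValue consumes one complete JSON value token by token, so the skipped
// value is never assembled in memory as a whole.
func skipValue(dec *json.Decoder) error {
	depth := 0
	for {
		tok, err := dec.Token()
		if err != nil {
			return err
		}
		if d, ok := tok.(json.Delim); ok {
			if d == '{' || d == '[' {
				depth++
			} else {
				depth--
			}
		}
		if depth == 0 {
			return nil
		}
	}
}

// seekKey expects the decoder to be positioned at the start of an object.
// It consumes the opening brace and skips members until it has read the
// requested key, leaving the decoder positioned at that key's value.
func seekKey(dec *json.Decoder, key string) error {
	tok, err := dec.Token()
	if err != nil {
		return err
	}
	if d, ok := tok.(json.Delim); !ok || d != '{' {
		return fmt.Errorf("expected object, got %v", tok)
	}
	for dec.More() {
		tok, err := dec.Token()
		if err != nil {
			return err
		}
		if name, ok := tok.(string); ok && name == key {
			return nil
		}
		if err := skipValue(dec); err != nil {
			return err
		}
	}
	return fmt.Errorf("key %q not found", key)
}

// extractModule returns the raw JSON stored at app_state[module].
func extractModule(r io.Reader, module string) (json.RawMessage, error) {
	dec := json.NewDecoder(r)
	if err := seekKey(dec, "app_state"); err != nil {
		return nil, err
	}
	if err := seekKey(dec, module); err != nil {
		return nil, err
	}
	var raw json.RawMessage
	if err := dec.Decode(&raw); err != nil {
		return nil, err
	}
	return raw, nil
}

// moduleFromString is a convenience wrapper for in-memory documents.
func moduleFromString(doc, module string) (json.RawMessage, error) {
	return extractModule(strings.NewReader(doc), module)
}

func main() {
	raw, err := moduleFromString(sampleGenesis, "staking")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(raw))
}
```

For a real file, `extractModule` would be given an `*os.File` instead of a string reader. The requested module itself is still loaded in full, which matters for large modules.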

@tac0turtle
Member Author

So if you could stream the genesis to load the exact module you want into memory, that would help a lot. From there, you can modify that module's state in memory and then write it back to the genesis file.

why not modify it on disk? this way we avoid the need for streaming out and then back in?

@alexanderbez
Contributor

So if you could stream the genesis to load the exact module you want into memory, that would help a lot. From there, you can modify that module's state in memory and then write it back to the genesis file.

why not modify it on disk? this way we avoid the need for streaming out and then back in?

Maybe I'm missing something, but how do you modify anything w/o loading it in memory? I imagine any valuable state transition function will be dependent on some input.

@cool-develope
Contributor

cool-develope commented Dec 12, 2023

From my understanding, the state transition is a kind of streaming with specific flags?
So we could add some extra configuration to state streaming. That way we don't need extra tools or modules, at least within the SDK. Then maybe we can create the extra script (Python or whatever) outside of the SDK.

@tac0turtle
Member Author

The idea I had is that you write a state transition function and execute it directly on the database instead of loading anything into memory. There wouldn't be a need to extract the data from the database. The store may load it into its cache, but that would be the same way we load things during execution.
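
To make the idea concrete, here is a minimal Go sketch of a state transition applied in place to a key-value store. Everything in it is an assumption for illustration: the `KVStore` interface, the `validator/` key layout, and the plain `uint64` values are hypothetical and are not the SDK's store types or the staking module's encoding. An in-memory map stands in for the on-disk database.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// KVStore is a hypothetical, minimal view of a key-value database.
type KVStore interface {
	Set(key string, value uint64)
	Iterate(prefix string, fn func(key string, value uint64))
}

// memStore is an in-memory stand-in for an on-disk database.
type memStore map[string]uint64

func (m memStore) Set(key string, value uint64) { m[key] = value }

// Iterate visits matching keys in sorted order. Collecting the keys first is
// an artifact of using a Go map; a real database would expose an iterator.
func (m memStore) Iterate(prefix string, fn func(key string, value uint64)) {
	keys := make([]string, 0, len(m))
	for k := range m {
		if strings.HasPrefix(k, prefix) {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	for _, k := range keys {
		fn(k, m[k])
	}
}

const validatorPrefix = "validator/" // assumed key layout

// singleValidator overwrites validator power in place: every validator other
// than keep is set to zero and keep is set to power. It returns the number of
// validators that were zeroed.
func singleValidator(db KVStore, keep string, power uint64) int {
	keepKey := validatorPrefix + keep
	zeroed := 0
	db.Iterate(validatorPrefix, func(key string, _ uint64) {
		if key == keepKey {
			return
		}
		db.Set(key, 0)
		zeroed++
	})
	db.Set(keepKey, power)
	return zeroed
}

func main() {
	db := memStore{"validator/a": 10, "validator/b": 20, "validator/c": 30, "bank/x": 5}
	zeroed := singleValidator(db, "b", 100)
	fmt.Println(zeroed, db["validator/a"], db["validator/b"], db["validator/c"])
}
```

This transition ignores the existing values entirely, so it needs no input beyond the keys it visits. Keys outside the prefix, such as `bank/x` here, are left untouched.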

@kocubinski
Member

kocubinski commented Dec 12, 2023

Today, if a chain would like to use their mainnet state to run tests on, they need to:

export their state to genesis (requires large machines)
modify the genesis file
import the genesis file (requires a large machine)

Instead of syncing from genesis, I have been able to run Osmosis mainnet tests by downloading a snapshot (https://quicksync.io) and syncing from there. That felt reasonable, but did require some manual work. For a chain that's been through multiple state-machine-breaking upgrades and binaries, it is very difficult (today) to sync from genesis because of the need to collect many legacy binaries, right? If that's true, I think I agree with @cool-develope that streamlining state sync is a good direction for testing non-state-breaking changes.

  1. We write state transition functions, behind a build flag, that modify state in accordance with the needs of the testing environment.

This and the points below it seem to be pointing at a way of streamlining forking a new testnet from mainnet head, am I understanding that right?

@tac0turtle
Member Author

The testing referred to here is when users like Osmosis and Gaia export genesis, modify genesis, then start a local testnet to test upgrades or other things. It's a bit different from our use case of testing.

I'm not sure what state streaming would add here. We would need to export all state into the file system, find the file which has the staking values, modify them, then import them back into a node. Is that what you mean, @cool-develope?

@cool-develope
Contributor

cool-develope commented Dec 12, 2023

Personally, I like what the Hardhat fork does: it doesn't need the entire state. So it would be great if the SDK provided some streaming or an API to get specific storage data or the minimum state for testing.

@cool-develope
Contributor

The testing referred to here is when users like Osmosis and Gaia export genesis, modify genesis, then start a local testnet to test upgrades or other things. It's a bit different from our use case of testing.

Maybe I misunderstand your point: do we need the full state to test the local testnet and upgrades?

@tac0turtle
Member Author

The testing referred to here is when users like Osmosis and Gaia export genesis, modify genesis, then start a local testnet to test upgrades or other things. It's a bit different from our use case of testing.

Maybe I misunderstand your point: do we need the full state to test the local testnet and upgrades?

Correct, the goal is to be able to take a snapshot of mainnet, modify state, then run that locally. This is how much of the ecosystem does chain upgrade testing today, except they export into genesis, modify, then import.

@alexanderbez
Contributor

the idea i had is you write a state transition function and execute it directly on the database instead of loading anything into memory.

I can't imagine any meaningful or useful state transition function w/o any dependent input, which must be loaded into memory. Now certainly you can blindly overwrite keys with some data, in which case no input is required. But I would argue that is only a small subset of the total set of state transition functions.

E.g. consider the example where you want a single account to have the balance of all other accounts. This requires you to load some or all balances into memory.
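
The balance example can be sketched in Go as follows. Every balance still has to be read, as described above, but in this sketch only one entry and a running total are held at a time. The `KVStore` interface, the `balance/` key layout, and the single-denomination `uint64` amounts are all hypothetical and do not reflect the SDK's store or the bank module's actual encoding.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// KVStore is a hypothetical, minimal view of a key-value database.
type KVStore interface {
	Set(key string, value uint64)
	Iterate(prefix string, fn func(key string, value uint64))
}

// memStore is an in-memory stand-in for an on-disk database.
type memStore map[string]uint64

func (m memStore) Set(key string, value uint64) { m[key] = value }

// Iterate visits matching keys in sorted order. Collecting the keys first is
// an artifact of using a Go map; a real database would expose an iterator.
func (m memStore) Iterate(prefix string, fn func(key string, value uint64)) {
	keys := make([]string, 0, len(m))
	for k := range m {
		if strings.HasPrefix(k, prefix) {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	for _, k := range keys {
		fn(k, m[k])
	}
}

const balancePrefix = "balance/" // assumed key layout

// sweepBalances moves every balance under balancePrefix to dest. Each entry
// is read once and added to a running total; all other accounts are zeroed.
// It returns the total credited to dest.
func sweepBalances(db KVStore, dest string) uint64 {
	destKey := balancePrefix + dest
	var total uint64
	db.Iterate(balancePrefix, func(key string, amount uint64) {
		total += amount
		if key != destKey {
			db.Set(key, 0)
		}
	})
	db.Set(destKey, total)
	return total
}

func main() {
	db := memStore{"balance/alice": 5, "balance/bob": 7, "balance/carol": 1, "validator/a": 10}
	total := sweepBalances(db, "carol")
	fmt.Println(total, db["balance/carol"], db["balance/alice"], db["balance/bob"])
}
```

Unlike a blind overwrite, the value written for `dest` depends on every other balance, so the transition cannot avoid reading them.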

@github-project-automation github-project-automation bot moved this from 🧑‍🔧 Needs Design to 🥳 Done in Cosmos-SDK Feb 12, 2024
@tac0turtle tac0turtle removed this from Cosmos-SDK Feb 29, 2024
4 participants