Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RPC depends on a restart to update the ForkID information #3172

Closed
tclemos opened this issue Jan 30, 2024 · 4 comments
Closed

RPC depends on a restart to update the ForkID information #3172

tclemos opened this issue Jan 30, 2024 · 4 comments

Comments

@tclemos
Copy link
Contributor

tclemos commented Jan 30, 2024

The current implementation of the node depends on the State component to allow other components to share information.

These 3 components depend on the ForkID information:

  • Sequencer: to know how to process the new txs
  • Synchronizer: to know how to process the block being synchronized
  • RPC: to be able to reprocess/debug txs

The ForkID information source of truth is stored on L1 and it's synchronized by the Synchronizer every time a ForkID update event is detected, the information is updated in the State and the updated information can be read by any other component.

Since this event doesn't happen very often and we have 100% control over when this is going to happen, the information is loaded during the node initialization and it's kept in memory.

When running components isolated, each component will have it's own State instance, which means any data stored in the State memory can be different between components.

In the case of an environment running the RPC and the Synchronizer separately, even though the Synchronizer can update the ForkID information in the State, the RPC is never notified that the ForkID was updated, and since the node loads the ForkID information only during the initialization, it makes the ForkID information to be outdated.

Currently, the RPC requires a restart in order to be able to read the new ForkID information.

Finally and just to clarify, the Sequencer also doesn't get the updated ForkID information when synchronized, but since the Sequencer is still running in a centralized manner, the ForkID updates affects it directly and we have 100% control over it, it's ok to restart it every time a new ForkID information is available, because this update is a very well controlled update of the environment by the team. The problem we need to address in this issue is the fact that the RPC and the Synchronizer are the components available in the Permissionless nodes.

Proposal 1:

Removes the memory cache and query the State every time the information is needed.

  • pros: easy to implement, consistent
  • cons: performance decrease

Proposal 2:

Invalidates the cache in memory periodically

  • pros: easy to implement
  • cons: inconsistency

Proposal 3:

Implements a message broker protocol in order to allow components to communicate between them.

  • pros: perfect solution when talking about the whole node
  • cons: hard to implement due to all the things we need to adjust and define before implementing it
@tclemos tclemos added this to the v0.6.0 milestone Jan 30, 2024
@tclemos tclemos self-assigned this Jan 30, 2024
@agnusmor
Copy link
Contributor

Hi, as you comment the sequencer really is not affected for this, as we will have control in the sequencer each time we have a forkid upgrade. In my case I think the best solution is the proposal 2. The only issue with this proposal is that the RPC can take some seconds to be notified about the new forkid, this can cause some issues if some one do debug_trace requests before the RPC is "updated" about the new forkid (forkid 7). But I understand that the thirdparty system that is doing the debug_Trace request (for example etherscan) will retry it. Also as the forkid table is small and fast to load its contents, we can update the memory cache each few seconds (couldbe 5 seconds by default) doing this in a go func and using a RWMutex to reduce the contention when only reading the forkid cache

@tclemos
Copy link
Contributor Author

tclemos commented Jan 31, 2024

But if we need to "update" this information in memory every 5 seconds against the information that is in the DB, does it really make sense to keep it in memory? Wouldn't it be simple to just query the DB when this information is needed as we do to all the rest of the data we have?

As mentioned, the table is short and the query should work pretty fast, querying the DB avoids the "inconsistency momento" and should have a performance close to the solution that refreshes it every 5 second.

It seems to me that the Proposal 1 offers the benefits we want for a better price than the Proposal 2, because if we need to update this data too often, I would also add the performance decrease in the cons for Proposal 2.

@joanestebanr
Copy link
Contributor

Another options is to use postgres triggers to reload when something is inserted or updated

@tclemos
Copy link
Contributor Author

tclemos commented Feb 2, 2024

Another options is to use postgres triggers to reload when something is inserted or updated

Coupling it to the DB doesn't seem to be a good idea to me, since we plan to change the db in the future. Also it would have to trigger an event on the application layer, not in the DB layer. I prefer to avoid crazy integrations like this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants