synchronizer: panic runtime error syncing batches from trusted node #3581

Closed
joanestebanr opened this issue Apr 23, 2024 · 0 comments · Fixed by #3586

joanestebanr commented Apr 23, 2024

System information

zkEVM Node version: v0.6.7-RC12
Network: Internal

Expected behaviour

Don't panic

Actual behaviour

panic: runtime error: invalid memory address or nil pointer dereference

Steps to reproduce the behaviour

  • The logs show that the variable batchToSync is nil in the function syncTrustedBatchesToFrom (file synchronizer/l2_sync/l2_shared/trusted_batches_retrieve.go).
  • This can only happen if the zkEVMClient returns nil and no error, which suggests that something is going wrong in the RPC client.
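
The failure mode described in the bullets above can be reproduced in isolation. This is a minimal sketch, not the zkevm-node code: `Batch`, `batchGetter`, `brokenClient`, `closedState` and `panicMessage` are hypothetical names invented for illustration. The only thing carried over from the issue is the assumption that the client returns a nil batch together with a nil error.

```go
package main

import (
	"context"
	"fmt"
)

// Batch is a hypothetical stand-in for the batch type returned by the RPC client.
type Batch struct {
	Number uint64
	Closed bool
}

// batchGetter is a hypothetical, reduced view of the trusted-node client.
type batchGetter interface {
	BatchByNumber(ctx context.Context, number uint64) (*Batch, error)
}

// brokenClient models the misbehaviour described in the issue:
// it reports success but hands back a nil batch.
type brokenClient struct{}

func (brokenClient) BatchByNumber(context.Context, uint64) (*Batch, error) {
	return nil, nil
}

// closedState checks only the error, so a (nil, nil) reply reaches the field access.
func closedState(ctx context.Context, c batchGetter, number uint64) (bool, error) {
	b, err := c.BatchByNumber(ctx, number)
	if err != nil {
		return false, err
	}
	return b.Closed, nil // nil pointer dereference when b == nil
}

// panicMessage runs closedState and returns the recovered panic text, if any.
func panicMessage(c batchGetter) (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	_, _ = closedState(context.Background(), c, 1)
	return ""
}

func main() {
	fmt.Println(panicMessage(brokenClient{}))
}
```

Checking `err` alone is not enough here, because nothing in the function signature prevents an implementation from returning a nil pointer without an error.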

The idea is to protect the code against this possible RPC error by checking that the returned batch is not nil and applying a minimal sanity check to it.

Backtrace

Apr 22 13:00:45.190 to 13:00:45.191 · gke-zkevm-internal-pool-large-1d09cab3-fvmt.europe-west2-b.c.prj-polygonlabs-zkevm-test.internal · zkevm-internal

Pending Flushid fullfiled: 4395, executor have write 4395
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1478b2b]
goroutine 355 [running]:
github.com/0xPolygonHermez/zkevm-node/synchronizer/l2_sync/l2_shared.(*ProcessorTrustedBatchSync).ProcessTrustedBatch(0xc0005c6c80, {0x1d79d90, 0xc0005c6910}, 0x0, {{0xc00004a1d0, 0x2, 0x2}}, {0x1d852c0, 0xc001fb6930}, {0xc000beec00, ...})
    /src/synchronizer/l2_sync/l2_shared/processor_trusted_batch_sync.go:174 +0xab
github.com/0xPolygonHermez/zkevm-node/synchronizer/l2_sync/l2_shared.(*TrustedBatchesRetrieve).syncTrustedBatchesToFrom(0xc0005c6cd0, {0x1d79d90, 0xc0005c6910}, 0x4def, 0x4df1)
    /src/synchronizer/l2_sync/l2_shared/trusted_batches_retrieve.go:136 +0x3e7
github.com/0xPolygonHermez/zkevm-node/synchronizer/l2_sync/l2_shared.(*TrustedBatchesRetrieve).SyncTrustedState(0xc0005c6cd0, {0x1d79d90, 0xc0005c6910}, 0x4def, 0xffffffffffffffff)
    /src/synchronizer/l2_sync/l2_shared/trusted_batches_retrieve.go:102 +0x27a
github.com/0xPolygonHermez/zkevm-node/synchronizer/l2_sync/l2_shared.(*SyncTrustedStateExecutorSelector).SyncTrustedState(0xc000c209d8?, {0x1d79d90, 0xc0005c6910}, 0x3?, 0x3?)
    /src/synchronizer/l2_sync/l2_shared/processor_trusted_batch_selector.go:67 +0x51
github.com/0xPolygonHermez/zkevm-node/synchronizer.(*ClientSynchronizer).syncTrustedState(...)
    /src/synchronizer/synchronizer.go:766
github.com/0xPolygonHermez/zkevm-node/synchronizer.(*ClientSynchronizer).Sync(0xc0005ed180)
    /src/synchronizer/synchronizer.go:421 +0x1817
main.runSynchronizer({0x0, 0x0, 0x0, {{0xc000783260, 0xa}, {0xc000783270, 0x4}, {0xc0004447d0, 0x1, 0x1}}, ...}, ...)


joanestebanr added this to the v0.6.6 milestone Apr 23, 2024
joanestebanr self-assigned this Apr 23, 2024
joanestebanr added a commit that referenced this issue Apr 23, 2024
…om Trusted after a open batch (#3586)

* synchronized: #3583  stop sync from l2 after no closed batch (#3584)
* fix #3581 synchronizer panic synchronizing from trusted node (#3582)
Stefan-Ethernal pushed a commit to 0xPolygon/cdk-validium-node that referenced this issue Apr 25, 2024
Stefan-Ethernal pushed a commit to 0xPolygon/cdk-validium-node that referenced this issue May 21, 2024
Stefan-Ethernal added a commit to 0xPolygon/cdk-validium-node that referenced this issue May 22, 2024
* check GER and index of synced L1InfoRoot matches with sc values (0xPolygonHermez#3551)

* apply txIndex fix to StoreTransactions; add migration to fix wrong txIndexes (0xPolygonHermez#3556)

* Feature/0xPolygonHermez#3549 reorgs improvement (0xPolygonHermez#3553)

* New reorg function

* mocks

* linter

* Synchronizer tests

* new elderberry smc docker image

* new image

* logs

* fix json rpc

* fix

* Test sync from empty block

* Regular reorg case tested

* linter

* remove empty block + fix LatestSyncedBlockEmpty

* Improve check reorgs when no block is received during the call

* fix RPC error code for eth_estimateGas and eth_call for reverted tx and no return value; fix e2e test;

* fix test

* Extra unit test

* fix reorg until genesis

* disable parallel synchronization

---------

Co-authored-by: tclemos

* Fix adding tx that matches with tx that is being processed (0xPolygonHermez#3559)

* fix adding  tx that matches (same addr and nonce) tx that is being processing

* fix generate mocks

* fix updateCurrentNonceBalance

* synchronizer:  check l1blocks (0xPolygonHermez#3546)

* wip

* run on background L1block checker

* fix lint and documentation

* fix conflict

* add unittest

* more unittest

* fix lint

* increase timeout for async unittest

* fix unittest

* rename GetResponse for GetResult and fix uniitest

* add a second gorutines for check the newest blocks

* more unittest

* add unittest and run also preCheck on launch

* by default Precheck from FINALIZED and SAFE

* fix unittest, apply PR comments

* changes suggested by ARR552 in integration method

* fix documentation

* import new network-l1-mock from PR#3553

* import new network-l1-mock from PR#3553

* import new network-l1-mock from PR#3553

* import new network-l1-mock from PR#3553

* fix unittest

* fix PR comments

* fix error

* checkReorgAndExecuteReset can't be call with lastEthBlockSynced=nil

* add parentHash to error

* fix error

* merge 3553 fix unittest

* fix unittest

* fix wrong merge

* adapt parallel reorg detection to flow

* fix unit tests

* fix log

* allow use sync parallel mode

---------

Co-authored-by: Alonso

* Fix + remove empty blocks (0xPolygonHermez#3564)

* Fix + remove empty blocks

* unit test

* linter

* Fix/0xPolygonHermez#3565 reorg (0xPolygonHermez#3566)

* fix + logs

* fix loop

* Revert "fix + logs"

This reverts commit 39ced69.

* fix L1InfoRoot when an error happens during the process of the L1 information (0xPolygonHermez#3576)

* fix

* Comments + mock

* avoid error from some L1providers when fromBlock is higher than toBlock

* Revert some changes

* comments

* add L2BlockModulus to L1check

* doc

* fix dbTx = nil

* fix unit tests

* added logs to analyze blocking issue when storing L2 block

* add debug logs for datastreamer

* fix 0xPolygonHermez#3581 synchronizer panic synchronizing from trusted node (0xPolygonHermez#3582)

* synchronized: 0xPolygonHermez#3583  stop sync from l2 after no closed batch (0xPolygonHermez#3584)

* stop processing trusted Node after first open batch

* Update datastream lib to the latest version with additional debug info

* update dslib client interface

* Update the diff

* Fix non-e2e tests

* Update the docker image for the mock L1 network

* Update the diff

* Fix typo in the comment

* Use the Geth v1.13.11 Docker image and update the genesis spec

* Update the diff

---------

Co-authored-by: agnusmor
Co-authored-by: Thiago Coimbra Lemos
Co-authored-by: Alonso Rodriguez
Co-authored-by: tclemos
Co-authored-by: Joan Esteban
Co-authored-by: Alonso
Co-authored-by: dPunisher