[Bug]: statesync is failing without any notable errors #23740
Comments
Valid data has all the entries:
$ iaviewer data ./application.db "s/k:opchild/"
Got version: 10000
Printing all keys with hashed values (to detect diff)
11
45409A4B7A5B60E9582472D7769946C180740A9D5DC7C5F52921884A7E0C67AC
12
CD2662154E6D76B2B2B92E70C0CAC3CCF534F9B74EB5B89819EC509083D00A50
13
A6CBBC2BC6224EFF02FE0D71EA8484A44EBC879823081DA85278B1188210DC0E
14
CD2662154E6D76B2B2B92E70C0CAC3CCF534F9B74EB5B89819EC509083D00A50
218000000000000001
1298A7E687BE20D63B7081F28E7DE8B3FFBF7CC2872C8017301D6398C4F26261
218000000000000002
506336618FCC1A0020760E56AF128E3C3AFF680C2CFBAC019ABBFE470E8CE091
218000000000000003
...
D0FA9A9366A124821E399998A91759C7E136E5A5BCEB6D89DF5215EAA33451BE
31EA902A3049E97C6704D1EE92663CB270EA57133C
52CC849FFFCB8AEAF2E8AFF5A4A3CA37052E4F90002902F20788C9F7E1B0E202
32EA902A3049E97C6704D1EE92663CB270EA57133C
B1202C98FF7904896DBBBB771D2AA919AA5FC5C23AD5FD4A52B0019393A1D8A2
33EA902A3049E97C6704D1EE92663CB270EA57133C
FBBDE3B3D477DE1904E00C7C2B168E2F5CCEA33478F9D0605252A4673EAFC7D9
Hash: 762AE8C1057AFB65240C284E0BC4CEA621C16DEAA7E00219DB63CF572855F2DB
Size: 2717
But the wrong data has only Hash and Size, without any data entries:
$ iaviewer data ./application.db "s/k:opchild/"
Got version: 10000
Printing all keys with hashed values (to detect diff)
Hash: 762AE8C1057AFB65240C284E0BC4CEA621C16DEAA7E00219DB63CF572855F2DB
Size: 2717
also checked
I suspect that the db write is somehow not persisted when the program quits.
This problem is sometimes still there, even when I try to recover from the snapshot data with btw I also suspect a timing issue in the snapshot recovery anyway.
try to make sure that the
After seeing your comment, I now suspect this part. It seems it is waiting only
Hey @yihuang, you are right: the importer was not waiting for the batch to close. This commit fixed the issue; I will create an iavl issue.
Resolved with the iavl PR merge.
Is there an existing issue for this?
What happened?
I'm running a nearly empty chain on EKS and trying to bring up new RPC nodes using statesync.
But often (with high probability) the node fails to sync, without any errors, even though the statesync itself was successful.
Interestingly, the same statesync succeeds on another replica.
On the failed node, when I run the commands below multiple times (they sometimes fail), syncing starts working again.
On the successful replica, when I run the commands below, the replica breaks for the same reason.
This means the installed snapshots are not the problem; the problem is in how the snapshot is applied.
Cosmos SDK Version
0.50
IAVL Version (using replace)
v1.2.4
How to reproduce?
Use [email protected]
Trying multiple times will eventually reproduce this error.