panic: leveldb: batch corrupted: invalid records length #1505

Closed
keo opened this issue Jul 21, 2015 · 32 comments

Comments

@keo

keo commented Jul 21, 2015

I upgraded geth yesterday from PPA (to 0.9.39+682SNAPSHOT20150719122250trusty-0ubuntu1)

Then a few hours later I got a panic: leveldb: batch corrupted: invalid records length error.

Here's the log: https://gist.github.com/keo/7a329bfd2ab455a0843a

@keo
Author

keo commented Jul 21, 2015

Here's the system info:

OS: Ubuntu 14.04.2
Kernel: Linux ethnode 3.13.0-52-generic #85-Ubuntu SMP Wed Apr 29 16:44:17 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
golang: 2:1.4.2+3trusty-0ubuntu1

@tcoulter

Came here to post this. I'm consistently getting this as well, on a daily basis. My full stack trace is here:

https://gist.github.com/tcoulter/811e78455eb5d23172a4

My system:

Debian Wheezy
On branch develop, commit 02c5022
go version go1.4.2 linux/amd64

@tcoulter

@LefterisJP this is consistently crashing geth for me. Any chance this can be looked at before release?

@tcoulter

Happened again, now with the latest commit on develop (d1d45aa)

https://gist.github.com/tcoulter/66dfb2dfe786b4bf776f

@LefterisJP
Contributor

@tcoulter I am not one of the Go developers; I am with the C++ team, but I am sure the Go guys are taking a look at this as we speak. I am running geth too but haven't gotten this yet. Will try to reproduce.

Were you mining or just syncing when this occurred?

@karalabe
Member

Could you build geth with the race detector enabled (godep go build --race ./cmd/geth), run that build for a while (be sure to run ./geth, the binary you just built), and report any big DATA RACE logs you see?

@tcoulter

@LefterisJP Whoops, sorry about that.

Mining and syncing, actually. My miner is consistently pinging the geth box, and geth was trying to catch up with the network since it had crashed previously. This happened when geth was caught up as well.

@tcoulter

@karalabe Will do this now and run it. Sometimes it takes hours to crash, so will report back once I have anything.

@tcoulter

@karalabe This just happened, right after starting it. Is this useful? https://gist.github.com/tcoulter/257e4d00f14b7b193798

@tcoulter

For reference, I'm starting geth like this:

#!/bin/bash
geth --unlock=0 --etherbase="0xa94792e09954f15e8867eb544c17af1855726296" --rpc --maxpeers 100 --datadir /eth console

@tcoulter

And eth is running via:

./eth.exe -G -F "http://192.168.1.12:8545" -t 3 --farm-recheck 100

@tcoulter

Alright, I'm getting a ton of data races (more than 6, maybe - didn't count). Here's the full logs so far: https://gist.github.com/tcoulter/04b5afe44585dfcf245d

@karalabe
Member

The miner data race is new to me. It shouldn't be anything too serious, but I will fix it tomorrow. The other one, though, seems serious enough. Are you on the latest develop? I fixed two races yesterday or the day before, I believe.

@tcoulter

Ya, git log says I'm on d1d45aa, which is the latest commit on develop.

@tcoulter

I haven't run into the crash yet, but will post the full log when I do.

@karalabe
Member

So, the fix for the last three races is #1511, though it shouldn't affect you or geth in any way. The data that could have been corrupted is never used on that code path. I'll prep the other fix tomorrow. IMHO that one shouldn't be an issue either, but who knows.

@tcoulter

So my geth process was eventually killed. I'm not sure by what -- I for sure didn't do it. But here's the full output. Gist wasn't very happy with me pasting in 6.5Mb, so added to mega: http://www.megafileupload.com/hqaW/geth_output.txt

Lots of races in there. Not sure if they're different from any of the others.

@tcoulter

A few more races. Can't seem to get geth to synchronize anymore (see logs). Will turn off the miners and try again: https://gist.github.com/tcoulter/c81458cfe25fb38586b5

http://stats.ethdev.net confirms no peers:
[Screenshot, 2015-07-22 2:15 pm]

@tcoulter

When compiled with --race, does geth use a lot of memory? It's currently using 53% of 24 GB, and it's continuing to grow.

[Screenshot, 2015-07-22 2:26 pm]

@fjl
Contributor

fjl commented Jul 22, 2015

@tcoulter don't production-mine with -race. The race detector, as nice as it is, has very high resource overhead.

@tcoulter

Ahh. Thanks.

@tcoulter

Looks like the full crash happened again. Log here:

https://gist.github.com/tcoulter/cd80e5ca6d928c0e86a7

@karalabe
Member

Yeah, the race detector is really expensive. I was hoping that the crash might be related to a data race, which is why I suggested running with it. We pushed a small race fix 1-2 days ago, and I'll try to address the other one now. I don't think it's the reason for the crashes you are seeing, but it might be worth a shot.

The downside is that if the database was corrupted some time ago due to a bug/crash, it may already be too late to fix it. It would probably help to see the offending database, but at its current size that's probably not possible.

@tcoulter

I'm happy to zip it up and make it available via my personal server, if you'd like to download it and take a look. If you can advise me what directory you'd need (and how to not send you my private keys) I'd appreciate it.

@karalabe
Member

That would be really helpful. I'm not sure exactly what is faulting, so pack up the blockchain, state and extra folders from your datadir. These are all public datasets, whilst your keys are in the keystore folder. Make sure not to include that :)
@tcoulter

@karalabe Backup is 84Gb. I don't actually have that much free space on my personal server. Any thoughts on where I could put it?

@tcoulter

Scratch that. Uploading to s3 now.

@fjl
Contributor

fjl commented Aug 9, 2015

Any news on that upload? ;)

@tcoulter

I still have the backup. Problem is I couldn't find a place to store it.
Got anywhere that'll accept 80Gb?

@keo
Author

keo commented Aug 10, 2015

@tcoulter have you tried BitTorrent Sync? (https://www.getsync.com/) It worked great for me when I needed to share huge files device to device without having to upload them anywhere.

@fjl
Contributor

fjl commented Aug 10, 2015

@tcoulter I think we can solve it without this dump. Thank you for going through such hoops to help us debug. Did this issue ever happen again?

@tcoulter

It definitely happened again after I first reported it (it happened a
handful of times actually). However, after switching to Frontier I've had
no issues.

Thanks!

fjl closed this as completed Sep 17, 2015
maoueh pushed a commit to streamingfast/go-ethereum that referenced this issue May 8, 2023