Memory leak on full nodes and validators #532

Closed
tkporter opened this issue Oct 16, 2019 · 4 comments
tkporter commented Oct 16, 2019

Expected Behavior

Memory usage on tx-nodes and validators remains roughly constant over time.

Current Behavior

Memory usage grows steadily over time on validators and, to a lesser extent, on tx-nodes, which suggests a memory leak.

Example validator:

(screenshot: validator memory usage over time)

Example tx-node:

(screenshot: tx-node memory usage over time)

Here's a pprof rendering from a tx-node:

(image: pprof graph from a tx-node)

I wasn't able to find anything conclusive, other than that LevelDB looked suspicious. I'm planning on running my own testnet with --pprof to see if anything noticeable shows up.
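A minimal sketch of how heap snapshots could be collected from a node started with --pprof (assuming the standard Go net/http/pprof endpoint on 127.0.0.1:6060, geth's usual default; adjust the address if it's configured differently):

```go
// heapsnap.go: fetch a heap profile from a running node and save it with a
// timestamp, so snapshots taken hours apart can be compared later.
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

func main() {
	// Standard net/http/pprof heap endpoint; exposed by the node when
	// started with --pprof (host/port assumed here).
	resp, err := http.Get("http://127.0.0.1:6060/debug/pprof/heap")
	if err != nil {
		fmt.Fprintln(os.Stderr, "fetching heap profile:", err)
		os.Exit(1)
	}
	defer resp.Body.Close()

	name := fmt.Sprintf("heap-%s.pb.gz", time.Now().Format("20060102-150405"))
	out, err := os.Create(name)
	if err != nil {
		fmt.Fprintln(os.Stderr, "creating file:", err)
		os.Exit(1)
	}
	defer out.Close()

	if _, err := io.Copy(out, resp.Body); err != nil {
		fmt.Fprintln(os.Stderr, "writing profile:", err)
		os.Exit(1)
	}
	fmt.Println("wrote", name)
}
```

Two snapshots taken a few hours apart can then be diffed with `go tool pprof -base heap-<old>.pb.gz heap-<new>.pb.gz` to see which allocation sites are actually growing.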

tkporter added the type:bug (Something isn't working) and investigate labels Oct 16, 2019
tkporter self-assigned this Oct 16, 2019
@mattharrop

My validator node consistently crashes after a few hours of running on an instance with 8 GB of RAM. Logs from the crash are attached.
celocrash.log

@tkporter

Looks like discv5 is the one to blame:

(screenshot: pprof output showing discv5 memory usage)

(screenshot: further pprof output)

(Note that the individual entries don't appear to sum to 631 MB, so there must be some inaccuracy here.)

@daithi-coombes

This issue is very similar to ethereum/go-ethereum#16859, which should have been fixed by ethereum/go-ethereum#16880.

From @mattharrop's logs above, there are plenty of goroutines that have been hanging for over two hours:

 goroutine 1 [chan receive, 160 minutes]:

The network logs also show false positives about bad enode URLs:

Error in upserting a valenode entry      AnnounceMsg="{Address: 0x132EA61AaE41A3beE6a2faf45aAE68e2A7f9457F, View: {Round: 0, Sequence: 496530}, IncompleteEnodeURL: enode://c8f8bc50da1780163761d738a1cbed328df7b82900e9d7ada38fb505fe589af60982b91e9b950a75bf677fabaafdf409fe9d2c0f9ead52394670ed5f1da7f5eb, Signature: f3d205bd4a570b95671c9afba818d73b27cb62fb21376c7a184354b7f9e4745f3d65d4b8aabee3621abe115ac099539037949771e4d6497dd72ac6ffe351d83001}" error="old announce message"

(I'm basing the "false positive" assessment on `error="old announce message"`.)

I'm thinking dangling requests in discovery are the cause, for now. I'll post a heap dump per the instructions in #573 (comment).
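One way to quantify those hung goroutines against a live node is sketched below (again assuming the node was started with --pprof and exposes the standard net/http/pprof endpoint on 127.0.0.1:6060; the `?debug=2` dump includes how long each goroutine has been blocked):

```go
// goroutine-age.go: pull a full goroutine dump from the node and count
// goroutines that have been blocked for a long time, matching headers like
// "goroutine 1 [chan receive, 160 minutes]:" seen in the crash log above.
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"os"
	"regexp"
	"strconv"
)

func main() {
	resp, err := http.Get("http://127.0.0.1:6060/debug/pprof/goroutine?debug=2")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer resp.Body.Close()

	// Goroutine headers look like: "goroutine 123 [chan receive, 160 minutes]:"
	header := regexp.MustCompile(`^goroutine \d+ \[([^,\]]+)(?:, (\d+) minutes)?\]:`)

	stuck := map[string]int{} // wait reason -> goroutines blocked for >= 30 minutes
	scanner := bufio.NewScanner(resp.Body)
	scanner.Buffer(make([]byte, 1<<20), 1<<20)
	for scanner.Scan() {
		m := header.FindStringSubmatch(scanner.Text())
		if m == nil || m[2] == "" {
			continue
		}
		if mins, _ := strconv.Atoi(m[2]); mins >= 30 {
			stuck[m[1]]++
		}
	}
	if err := scanner.Err(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for reason, n := range stuck {
		fmt.Printf("%4d goroutines blocked >= 30 min in %q\n", n, reason)
	}
}
```

If most of the long-blocked goroutines turn out to sit in discovery code paths, that would back up the dangling-request theory.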

@trianglesphere

Closing because it seems like issue #1197 is the same, except with newer logs.
