node, p2p/simulations: fix node.Node AccountsManager leak #19004

Merged (8 commits) on Feb 7, 2019

@janos (Contributor) commented Feb 6, 2019

This PR is a second take on fixing the AccountsManager leak on node.Node. The first PR, #18505, was rejected.

Fixes: ethersphere/swarm#1142.

Swarm PR: ethersphere/swarm#1194.

While performing the test described in issue ethersphere/swarm#1142, a large number of goroutines were created and never terminated. The stack trace shows many goroutines like these:

goroutine 3370 [select]:
github.com/ethereum/go-ethereum/accounts.(*Manager).update(0xc0000b7520)
	/Users/janos/go/src/github.com/ethereum/go-ethereum/accounts/manager.go:95 +0x22a
created by github.com/ethereum/go-ethereum/accounts.NewManager
	/Users/janos/go/src/github.com/ethereum/go-ethereum/accounts/manager.go:68 +0x6f2

goroutine 4481 [select]:
github.com/ethereum/go-ethereum/accounts/keystore.(*watcher).loop(0xc0035ac9a0)
	/Users/janos/go/src/github.com/ethereum/go-ethereum/accounts/keystore/watch.go:94 +0x65b
created by github.com/ethereum/go-ethereum/accounts/keystore.(*watcher).start
	/Users/janos/go/src/github.com/ethereum/go-ethereum/accounts/keystore/watch.go:52 +0xb6

The accounts manager is created but never closed for node.Node, which leaves a large number of goroutines alive. This change closes the accounts manager in a new method, Node.Close, moves the cleanup of the ephemeral directory from Stop to Close, and handles closing the node in p2p/simulations.Network through p2p/simulations/adapters.SimNode.
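To illustrate the kind of leak described here, the following is a minimal, self-contained sketch and not the go-ethereum code. The `manager` type, the `running` counter, and the `leaked` helper are hypothetical stand-ins: the constructor starts a background goroutine, and that goroutine only exits once `Close` is called.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// running counts background goroutines that are currently alive.
// It is a hypothetical instrument for this sketch only.
var running int64

// manager is a hypothetical stand-in for a type whose constructor
// starts a goroutine that must be stopped explicitly.
type manager struct {
	quit chan struct{}
	done chan struct{}
	once sync.Once
}

// newManager starts the background goroutine, as a constructor would.
func newManager() *manager {
	m := &manager{quit: make(chan struct{}), done: make(chan struct{})}
	atomic.AddInt64(&running, 1)
	go m.update()
	return m
}

// update blocks until quit is closed, like the parked goroutines
// in the stack trace.
func (m *manager) update() {
	defer close(m.done)                 // runs second: signals Close
	defer atomic.AddInt64(&running, -1) // runs first: updates the counter
	<-m.quit
}

// Close stops the background goroutine and waits for it to finish.
// It is safe to call more than once.
func (m *manager) Close() error {
	m.once.Do(func() { close(m.quit) })
	<-m.done
	return nil
}

// leaked creates n managers and reports how many goroutines are
// still alive afterwards.
func leaked(n int, closeEach bool) int64 {
	before := atomic.LoadInt64(&running)
	for i := 0; i < n; i++ {
		m := newManager()
		if closeEach {
			m.Close()
		}
	}
	return atomic.LoadInt64(&running) - before
}

func main() {
	fmt.Println("left running without Close:", leaked(100, false))
	fmt.Println("left running with Close:", leaked(100, true))
}
```

Each manager that is never closed leaves one goroutine parked forever, which matches the pattern in the trace when many nodes are created in a simulation.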

// Close releases resources acquired in Node constructor New.
func (n *Node) Close() error {
	return n.accman.Close()
}
Member

I would also add a shutdown attempt to make sure everything is cleaned up if someone only calls Close, but not Stop.

func (n *Node) Close() error {
	if err := n.Stop(); err != nil && err != ErrNodeStopped {
		return err
	}
	return n.accman.Close()
}

If you want to be truly correct, you could close both first and only then return the errors, possibly returning a combined error if both failed.

func (n *Node) Close() error {
	var errs []error

	// Terminate all subsystems and collect any errors
	if err := n.Stop(); err != nil && err != ErrNodeStopped {
		errs = append(errs, err)
	}
	if err := n.accman.Close(); err != nil {
		errs = append(errs, err)
	}
	// Report any errors that might have occurred
	switch len(errs) {
	case 0:
		return nil
	case 1:
		return errs[0]
	default:
		return fmt.Errorf("%v", errs)
	}
}

@karalabe (Member) commented Feb 6, 2019

We also have quite a few node := makeFullNode(ctx) invocations in cmd/geth that create nodes. To make everything correct, let's also add a defer node.Close() after all those calls.
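
The defer pattern suggested here can be sketched as follows. This is only an illustration: `node`, `makeNode`, and `runCommand` are hypothetical names and do not reproduce the cmd/geth code. The point is that a deferred Close runs on every return path, including early error returns.

```go
package main

import (
	"errors"
	"fmt"
)

// node is a hypothetical stand-in for a node with a Close method.
type node struct{ closed bool }

// Close marks the node as closed; a real implementation would
// release resources here.
func (n *node) Close() error {
	n.closed = true
	return nil
}

// makeNode is a hypothetical constructor playing the role of makeFullNode.
func makeNode() *node { return &node{} }

// runCommand mimics a CLI action: it creates a node, defers Close
// right away, and may return early with an error.
func runCommand(fail bool) (*node, error) {
	n := makeNode()
	defer n.Close()

	if fail {
		return n, errors.New("command failed")
	}
	return n, nil
}

func main() {
	n, err := runCommand(true)
	fmt.Println("error:", err)
	fmt.Println("closed:", n.closed)
}
```

Placing the defer directly after construction keeps the cleanup next to the allocation, so later edits that add new return paths do not reintroduce the leak.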

@janos (Contributor, Author) commented Feb 6, 2019

Thanks, @karalabe, for the review. I've updated this PR based on your comments.

@karalabe (Member) left a comment

LGTM

@karalabe (Member) left a comment

Ah wait, could you please go over the tests and add the defer Close-es in the node package too?

@janos (Contributor, Author) commented Feb 6, 2019

Yes, Nodes in node tests are now closed.

@karalabe (Member) left a comment

LGTM (I made a few minor code tweaks)

@janos (Contributor, Author) commented Feb 6, 2019

Thanks, @karalabe.

@frncmx (Contributor) left a comment

lgtm

@karalabe karalabe merged commit 26aea73 into ethereum:master Feb 7, 2019
@karalabe karalabe added this to the 1.9.0 milestone Feb 7, 2019
@frncmx frncmx deleted the fix-node-accman-leak branch February 8, 2019 12:19
nonsense referenced this pull request in ethersphere/swarm May 8, 2019
* node: close AccountsManager in new Close method

* p2p/simulations, p2p/simulations/adapters: handle node close on shutdown

* node: move node ephemeralKeystore cleanup to stop method

* node: call Stop in Node.Close method

* cmd/geth: close node.Node created with makeFullNode in cli commands

* node: close Node instances in tests

* cmd/geth, node: minor code style fixes

* cmd, console, miner, mobile: proper node Close() termination
JukLee0ira added a commit to JukLee0ira/XDPoSChain that referenced this pull request Nov 25, 2024
JukLee0ira added a commit to JukLee0ira/XDPoSChain that referenced this pull request Nov 25, 2024
JukLee0ira added a commit to JukLee0ira/XDPoSChain that referenced this pull request Nov 27, 2024
Successfully merging this pull request may close these issues.

network/simulation: Goroutine leak (simultaneously alive, limit exceeded)