-
Notifications
You must be signed in to change notification settings - Fork 291
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[p2p] Reduce memory and CPU usage of libp2p #3808
Conversation
|
@@ -33,6 +33,7 @@ func (opt DHTConfig) getLibp2pRawOptions() ([]libp2p_dht.Option, error) { | |||
opts = append(opts, dsOption) | |||
} | |||
|
|||
opts = append(opts, libp2p_dht.Concurrency(1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why set concurrency to 1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the size of this parameter directly affects the frequency of calls to queryPeer in Discovery's functional features, and queryPeer is the main person responsible for taking up memory after I read kad-dht implementation code. The default value of this parameter is 10, and changing it to 1 is a relatively safe value for the 4G memory size of the Validator node, which I think is 1. Of course, I can also try to increase this value and test the memory performance again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the downside of setting it to 1 v.s. larger number like 10?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the implementation code of libp2p-kad-dht, it is to affect the number of concurrency of queries on the same path, it reduces the performance of queries on the same path, on the same path, it will go to the number of concurrency of N = 1 Peer to find information. In a nutshell: it affects the performance of Discovery, but it is better than directly Disable Discovery.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, you can test the performance impact of setting it to 1 vs 10. If there is no problem with 1, we can go with 1 then.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There shall not be performance issue with the param change.
Thanks. Why ? could you explain more detail ? I have not test the performance , but I readed the code, it might affect DHT Discovery performance , the code is here https://github.com/libp2p/go-libp2p-kad-dht/blob/7cf0c1e05fe3a69cba5d3747358a25b937fcad89/query.go#L282-L300 var // calculate the maximum number of queries we could be spawning.
// Note: NumWaiting will be updated in spawnQuery
maxNumQueriesToSpawn := alpha - q.queryPeers.NumWaiting()
// termination is triggered on end-of-lookup conditions or starvation of unused peers
// it also returns the peers we should query next for a maximum of `maxNumQueriesToSpawn` peers.
ready, reason, qPeers := q.isReadyToTerminate(pathCtx, maxNumQueriesToSpawn)
if ready {
q.terminate(pathCtx, cancelPath, reason)
}
if q.terminated {
return
}
// try spawning the queries, if there are no available peers to query then we won't spawn them
for _, p := range qPeers {
q.spawnQuery(pathCtx, cause, p, ch)
} |
I mean, the slow discovery will not cause trouble at our usage of discovery. |
sounds good. |
Issue
According to the test and verification in issue https://github.com/harmony-one/harmony-ops-priv/issues/36, the OOM is indeed caused by the Discovery function of libp2p's pubsub, but this feature cannot be simply disable. Otherwise, it will cause missing feature. After reading the code implementation of libp2p's kad-dht(https://github.com/libp2p/go-libp2p-kad-dht/blob/7cf0c1e05fe3a69cba5d3747358a25b937fcad89/query.go#L282-L300 https://github.com/libp2p/go-libp2p-kad-dht/blob/7cf0c1e05fe3a69cba5d3747358a25b937fcad89/query.go#L344-L349), I found that some Option parameters can reduce the frequency of queryPeer calls and reduce the calls related to
go-yamux's Stream
function. After testing, the memory is currently performing very well.Why upgrade kad DHT?
cause 0.12.2 reduced CPU usage by the FullRT DHT client during bulk providing libp2p/go-libp2p-kad-dht#720 maybe this fix can solve hold high CPU usage for stream protocol made by @JackyWYX ?
Test
the node 105.44 Test Without Discovery, running it for a few days, the memory behaved very well:
the node 105.44 Test With concurrency is 1 and upgrade kad-dht, running for 7 hours the memory is smooth: