peer_store: Warn only on reputation crossing the ban threshold #4000
Signed-off-by: Alexandru Vasile <[email protected]>
It's strange that we receive so many warnings for banned peers. They should have been disconnected in the first place, shouldn't they?
Indeed, it seems that the peer is still kept around after being "banned" and disconnected here: polkadot-sdk/substrate/client/network/src/protocol_controller.rs Lines 209 to 213 in bf424fd
From this comment, it seems like we should remove the peer from the peerstore first. Although, the ongoing peer candidates are filtered by their reputation here: polkadot-sdk/substrate/client/network/src/peer_store.rs Lines 298 to 300 in bf424fd
Thinking out loud, it may be possible that |
One reason is that there might be multiple messages in the queues of different consumers, which leads to calling |
It probably makes sense to double-check that we disconnect peers as stated before merging this PR, which might hide the issue.
@lexnv How bad is the situation without this fix? Are something like a hundred messages printed once with the same peer id, or are they being printed constantly? (This might give a clue about the root cause.)
Coming back to this, I've placed more data in the linked PR:
I think we can safely merge this and rely on #4031 to increase the time a peer remains banned (from 10 seconds to around 69 seconds).
Maybe merging #4031 is enough then? I worry that the change from this PR can mask some reputation/misbehaving/banning issues if merged.
Yep, that makes sense! I've hit merge on #4031, and I also think it will be sufficient here. We could still get a few more warnings from batching the reputation propagation, but generally, I think increasing the ban time should reduce the noise. In the meantime, I'll close this PR and re-open it if necessary, thanks!
This is a tiny PR to increase the time a peer remains banned.

A peer is banned when the reputation drops below a threshold. With every second, the peer reputation is exponentially decayed towards zero.

For the previous setup:
- decaying to zero from (i32::MAX or i32::MIN) would take 948 seconds (15 minutes 48 seconds)
- from i32::MIN to escaping the banned threshold would take 10 seconds

This means we are decaying reputation a bit too aggressively, and misbehaving peers can misbehave again in 10 seconds. Another side effect of this is that we have encountered multiple warnings caused by a few misbehaving peers.

In the new setup:
- decaying to zero from (i32::MAX or i32::MIN) would take 3544 seconds (59 minutes)
- from i32::MIN to escaping the banned threshold would take ~69 seconds

This is a followup of:
- #4000.

### Testing Done
- Created a misbehaving client with [subp2p-explorer](https://github.com/lexnv/subp2p-explorer); the client is banned for approximately 69 seconds until it is allowed to connect again.

cc @paritytech/networking

---------

Signed-off-by: Alexandru Vasile <[email protected]>
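To make the 10 second and ~69 second ban durations concrete, here is a minimal sketch of a per-second decay loop. It is not the `peer_store` code: the function names are hypothetical, and the parameters (a divisor of 50 with a threshold at 82% of `i32::MIN` for the previous setup, a divisor of 200 with a threshold at 71% for the new one) are assumptions picked because they reproduce the figures quoted above.

```rust
/// Hypothetical decay step: once per second, move the reputation toward
/// zero by `reputation / inverse_decrement`, and by at least 1 in magnitude
/// while the reputation is non-zero.
fn decay_once(reputation: i32, inverse_decrement: i32) -> i32 {
    let mut diff = reputation / inverse_decrement;
    if diff == 0 {
        // signum() is -1, 0 or 1, so a zero reputation stays at zero.
        diff = reputation.signum();
    }
    reputation.saturating_sub(diff)
}

/// Number of one-second decay steps until a peer that starts at `i32::MIN`
/// is no longer below the ban threshold.
/// `threshold_percent` is an assumed parameter: the threshold is taken as
/// that percentage of `i32::MIN` (valid for values from 1 to 100).
fn seconds_banned(inverse_decrement: i32, threshold_percent: i32) -> u32 {
    let threshold = threshold_percent * (i32::MIN / 100);
    let mut reputation = i32::MIN;
    let mut seconds = 0;
    while reputation < threshold {
        reputation = decay_once(reputation, inverse_decrement);
        seconds += 1;
    }
    seconds
}
```

Under these assumptions, `seconds_banned(50, 82)` gives 10 and `seconds_banned(200, 71)` gives 69, in line with the numbers in the description.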
This PR changes the warning behavior of the `peer_store` component.

We have encountered a large number of warnings coming from peer reputation reports.
Previously, warnings were generated on all reports that were below the threshold.
That behavior also generates warnings for positive reputation changes (reports of correct behavior) whenever the reputation is still below the threshold after the change is applied.
In this PR, the warning is printed only when crossing the threshold from being not banned to being banned.
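The crossing check can be sketched as follows. This is an illustration rather than the actual diff: `BANNED_THRESHOLD` is given an arbitrary illustrative value, and `is_banned` and `apply_report` are hypothetical names.

```rust
/// Illustrative threshold only; the real constant lives in `peer_store`.
const BANNED_THRESHOLD: i32 = -1_000;

/// A peer counts as banned while its reputation is below the threshold.
fn is_banned(reputation: i32) -> bool {
    reputation < BANNED_THRESHOLD
}

/// Apply a reputation change and report whether a warning should be printed.
/// The flag is true only when this report moves the peer from
/// "not banned" to "banned".
fn apply_report(reputation: i32, change: i32) -> (i32, bool) {
    let updated = reputation.saturating_add(change);
    let newly_banned = !is_banned(reputation) && is_banned(updated);
    (updated, newly_banned)
}
```

With this check, a positive report for an already banned peer, or a further negative report for it, no longer produces a warning; only the report that pushes the peer below the threshold does.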
cc @paritytech/networking