
bandwidth limiting #172

Closed
yutongp opened this issue Apr 11, 2019 · 8 comments

@yutongp commented Apr 11, 2019

Hi libp2p engineers,

In our project, we use libp2p-pubsub for our p2p layer. Today we ran a DoS test: we sent 3000 TPS of requests into the network, and the whole network hung (all 30 nodes were stuck).
We profiled our program and found that about 45% of CPU time was spent on pubsub message verification, which essentially consumed all 4 CPU cores on every node in our network. Note that we launched the attack from a single machine (also 4 cores).
[Screenshot: CPU profile taken 2019-04-10]

I posted a similar issue in go-libp2p-connmgr: libp2p/go-libp2p-connmgr#37.

Since pubsub already has a Blacklist feature, and the issue is caused by message verification in pubsub, should pubsub provide the ability to block high-bandwidth nodes from flooding the network?

Thanks.

@vyzo (Collaborator) commented Apr 11, 2019

I am a little reluctant to automatically blacklist high-bandwidth nodes in the library; this should be done by user decree.

You can also try to tune down the validation throttle, which will start dropping messages that can't be validated because the queue is full.
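
Something along these lines, as an untested sketch (import paths depend on which libp2p version you are on):

    package example

    import (
        "context"

        host "github.com/libp2p/go-libp2p-core/host"
        pubsub "github.com/libp2p/go-libp2p-pubsub"
    )

    // newThrottledPubSub builds a gossipsub router with a small validation
    // throttle: at most 64 validations run concurrently, and messages that
    // arrive while all slots are busy are dropped with a
    // "message validation throttled" warning instead of piling up.
    func newThrottledPubSub(ctx context.Context, h host.Host) (*pubsub.PubSub, error) {
        return pubsub.NewGossipSub(ctx, h,
            pubsub.WithValidateThrottle(64),
        )
    }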

@vyzo (Collaborator) commented Apr 11, 2019

But in general I am in agreement that we should detect the situation and at least notify the user, who can then take action and potentially blacklist peers.
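
The blacklist hook is already available to applications today; a rough sketch of what user-driven blacklisting could look like (the rate accounting, threshold, and names below are made up for illustration):

    package example

    import (
        peer "github.com/libp2p/go-libp2p-core/peer"
        pubsub "github.com/libp2p/go-libp2p-pubsub"
    )

    // maybeBlacklist is application logic, not part of the library: if a peer
    // exceeds whatever message-rate budget the application has chosen,
    // blacklist it so pubsub ignores its messages from then on.
    func maybeBlacklist(ps *pubsub.PubSub, pid peer.ID, msgsPerSec, limit float64) {
        if msgsPerSec > limit {
            ps.BlacklistPeer(pid)
        }
    }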

@raulk (Member) commented Apr 11, 2019

I believe you’re seeing two different but interrelated issues.

  1. Pubsub validators are assumed to be lightweight.
  2. We don’t do traffic shaping, yet.

For 1, there’s #169 (if you really need to do heavy validation). For 2, we are evolving the connection manager and eventually it’ll become a traffic shaper.
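
To illustrate 1, a sketch of the kind of validator the current design assumes: cheap structural checks only, with the per-topic validator options as a safety net (topic name and limits are placeholders; the validator signature shown is the one in recent go-libp2p-pubsub releases and may differ in older ones):

    package example

    import (
        "context"
        "time"

        peer "github.com/libp2p/go-libp2p-core/peer"
        pubsub "github.com/libp2p/go-libp2p-pubsub"
    )

    func registerCheapValidator(ps *pubsub.PubSub) error {
        return ps.RegisterTopicValidator("my-topic",
            func(ctx context.Context, from peer.ID, msg *pubsub.Message) bool {
                // cheap structural check only: no signatures, no state lookups
                return len(msg.GetData()) <= 1<<16
            },
            pubsub.WithValidatorTimeout(100*time.Millisecond),
            pubsub.WithValidatorConcurrency(32),
        )
    }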

Would you or your team be willing to help us out with any of those? :-)

@yutongp (Author) commented Apr 12, 2019

> You can also try to tune down the validation throttle, which will start dropping messages that can't be validated because the queue is full.

We investigated the validation throttle configuration. It didn't help in our experiment. We tried values of 500 and 10; in both cases our CPUs quickly climbed to 100% under the load test. I think the reason is that the code below always runs at full speed:

		select {
		case p.validateThrottle <- struct{}{}:
			// acquired a throttle slot: validate in a fresh goroutine
			go func() {
				p.validate(vals, src, msg)
				<-p.validateThrottle // release the slot
			}()
		default:
			// all throttle slots are busy: drop the message
			log.Warningf("message validation throttled; dropping message from %s", src)
		}

No matter how long the throttle queue is, as long as there are tasks remaining in the queue, the code above still dequeues and validates them as fast as it can, which consumes all the CPUs. The throttle queue basically cuts the ocean down to a lake, but if water keeps flowing into the lake, it will still overflow.

In our case, we want to control the rate of validation events (incoming requests) to reduce the burden on the CPUs. We ended up using https://godoc.org/golang.org/x/time/rate as a rate limiter in our code. It is an acceptable solution for us against a DoS attack, but maybe not against a DDoS.
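
Roughly what we ended up with, simplified (the numbers are made up for illustration):

    package example

    import "golang.org/x/time/rate"

    // allow at most 200 validations per second, with bursts of up to 50
    var validateLimiter = rate.NewLimiter(rate.Limit(200), 50)

    // shouldValidate is checked before a message is handed to the validators;
    // when it returns false we simply drop the message instead of queueing it.
    func shouldValidate() bool {
        return validateLimiter.Allow()
    }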

@yutongp (Author) commented Apr 13, 2019

We are also curious whether there is any design, plan, or discussion around DDoS attack mitigation in the libp2p/IPFS community.

@vyzo (Collaborator) commented Apr 13, 2019

We have an open issue for optimizing the validator pipeline to pre-spawn goroutine workers -- see #103. We should probably prioritize it now, as it would likely help in this case.
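
The gist of that change, as a rough sketch (types simplified; not the actual implementation):

    package example

    import pubsub "github.com/libp2p/go-libp2p-pubsub"

    // A fixed pool of long-lived workers draining a bounded queue, instead of
    // spawning a goroutine per message. Concurrency (and therefore CPU use) is
    // capped by the pool size, and the enqueue side can drop messages whenever
    // the queue is full.
    func startValidateWorkers(n int, queue <-chan *pubsub.Message, validate func(*pubsub.Message)) {
        for i := 0; i < n; i++ {
            go func() {
                for msg := range queue {
                    validate(msg)
                }
            }()
        }
    }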

@vyzo mentioned this issue Apr 25, 2019
@vyzo (Collaborator) commented Apr 25, 2019

Can you try #176? The validator pipeline has been reworked to avoid running the aforementioned hot code at full speed.

@daviddias (Member) commented

@yutongp the latest version on master already includes #176. Can you let us know when you have a chance to try it out? Closing this issue in the meantime.
