
bandwidth limiting #172

Closed
yutongp opened this issue Apr 11, 2019 · 8 comments

@yutongp commented Apr 11, 2019

Hi libp2p engineers,

In our project, we use libp2p-pubsub for our p2p layer. Today we ran a DoS test: we sent 3000 TPS of requests into the network, and the whole network hung (all 30 nodes were stuck).
We profiled our program and found that about 45% of CPU time was spent on pubsub message verification, which essentially consumed all 4 CPU cores on every node in our network. Note that we launched the attack from a single machine (also 4 cores).
[Screenshot: CPU profile taken 2019-04-10]

I posted a similar issue in go-libp2p-connmgr: libp2p/go-libp2p-connmgr#37.

Since pubsub already has a Blacklist feature, and the issue is caused by message verification in pubsub, should pubsub provide the ability to block high-bandwidth nodes from flooding the network?

Thanks.

@vyzo (Collaborator) commented Apr 11, 2019

I am a little reluctant to automatically blacklist high-bandwidth nodes in the library; this should be done by user decree.

You can also try to tune down the validation throttle, which will start dropping messages that can't be validated because the queue is full.
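
Something along these lines, as an untested sketch (import paths depend on which libp2p version you are on):

    package example

    import (
        "context"

        host "github.com/libp2p/go-libp2p-core/host"
        pubsub "github.com/libp2p/go-libp2p-pubsub"
    )

    // newThrottledPubSub builds a gossipsub router with a small validation
    // throttle: at most 64 validations run concurrently, and messages that
    // arrive while all slots are busy are dropped with a
    // "message validation throttled" warning instead of piling up.
    func newThrottledPubSub(ctx context.Context, h host.Host) (*pubsub.PubSub, error) {
        return pubsub.NewGossipSub(ctx, h,
            pubsub.WithValidateThrottle(64),
        )
    }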

@vyzo (Collaborator) commented Apr 11, 2019

But in general I am in agreement that we should detect the situation and at least notify the user, who can then take action and potentially blacklist peers.
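
The blacklist hook is already available to applications today; a rough sketch of what user-driven blacklisting could look like (the rate accounting, threshold, and names below are made up for illustration):

    package example

    import (
        peer "github.com/libp2p/go-libp2p-core/peer"
        pubsub "github.com/libp2p/go-libp2p-pubsub"
    )

    // maybeBlacklist is application logic, not part of the library: if a peer
    // exceeds whatever message-rate budget the application has chosen,
    // blacklist it so pubsub ignores its messages from then on.
    func maybeBlacklist(ps *pubsub.PubSub, pid peer.ID, msgsPerSec, limit float64) {
        if msgsPerSec > limit {
            ps.BlacklistPeer(pid)
        }
    }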

@raulk (Member) commented Apr 11, 2019

I believe you’re seeing two different but interrelated issues.

  1. Pubsub validators are assumed to be lightweight.
  2. We don’t do traffic shaping, yet.

For 1, there’s #169 (if you really need to do heavy validation). For 2, we are evolving the connection manager and eventually it’ll become a traffic shaper.
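
To illustrate 1, a sketch of the kind of validator the current design assumes: cheap structural checks only, with the per-topic validator options as a safety net (topic name and limits are placeholders; the validator signature shown is the one in recent go-libp2p-pubsub releases and may differ in older ones):

    package example

    import (
        "context"
        "time"

        peer "github.com/libp2p/go-libp2p-core/peer"
        pubsub "github.com/libp2p/go-libp2p-pubsub"
    )

    func registerCheapValidator(ps *pubsub.PubSub) error {
        return ps.RegisterTopicValidator("my-topic",
            func(ctx context.Context, from peer.ID, msg *pubsub.Message) bool {
                // cheap structural check only: no signatures, no state lookups
                return len(msg.GetData()) <= 1<<16
            },
            pubsub.WithValidatorTimeout(100*time.Millisecond),
            pubsub.WithValidatorConcurrency(32),
        )
    }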

Would you or your team be willing to help us out with any of those? :-)

@yutongp (Author) commented Apr 12, 2019

> You can also try to tune down the validation throttle, which will start dropping messages that can't be validated because the queue is full.

We investigated the validation throttle configuration. It didn't help in our experiment. We tried values of 500 and 10; in both cases our CPUs quickly climbed to 100% under the load test. I think the reason is that the code below always runs at full speed:

		select {
		case p.validateThrottle <- struct{}{}:
			// acquired a throttle slot: validate in a fresh goroutine
			go func() {
				p.validate(vals, src, msg)
				<-p.validateThrottle // release the slot
			}()
		default:
			// all throttle slots are busy: drop the message
			log.Warningf("message validation throttled; dropping message from %s", src)
		}

No matter how long the throttle queue is, as long as there are tasks remaining in the queue, the code above still dequeues and validates them as fast as it can, which consumes all the CPUs. The throttle queue basically cuts the ocean down to a lake, but if water keeps flowing into the lake, it will still overflow.

In our case, we want to control the rate of validation events (incoming requests) to reduce the burden on the CPUs. We ended up using https://godoc.org/golang.org/x/time/rate as a rate limiter in our code. It is an acceptable solution for us against a DoS attack, but maybe not against a DDoS.
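
Roughly what we ended up with, simplified (the numbers are made up for illustration):

    package example

    import "golang.org/x/time/rate"

    // allow at most 200 validations per second, with bursts of up to 50
    var validateLimiter = rate.NewLimiter(rate.Limit(200), 50)

    // shouldValidate is checked before a message is handed to the validators;
    // when it returns false we simply drop the message instead of queueing it.
    func shouldValidate() bool {
        return validateLimiter.Allow()
    }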

@yutongp (Author) commented Apr 13, 2019

We are also curious whether there is any design, plan, or discussion around DDoS attack mitigation in the libp2p/IPFS community.

@vyzo (Collaborator) commented Apr 13, 2019

We have an open issue for optimizing the validator pipeline to pre-spawn goroutine workers -- see #103. We should probably prioritize it now, as it would likely help in this case.
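
The gist of that change, as a rough sketch (types simplified; not the actual implementation):

    package example

    import pubsub "github.com/libp2p/go-libp2p-pubsub"

    // A fixed pool of long-lived workers draining a bounded queue, instead of
    // spawning a goroutine per message. Concurrency (and therefore CPU use) is
    // capped by the pool size, and the enqueue side can drop messages whenever
    // the queue is full.
    func startValidateWorkers(n int, queue <-chan *pubsub.Message, validate func(*pubsub.Message)) {
        for i := 0; i < n; i++ {
            go func() {
                for msg := range queue {
                    validate(msg)
                }
            }()
        }
    }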

@vyzo mentioned this issue Apr 25, 2019
@vyzo (Collaborator) commented Apr 25, 2019

Can you try #176? The validator pipeline has been reworked to avoid running the aforementioned hot code at full speed.

@daviddias (Member) commented

@yutongp the latest version on master already includes #176. Can you let us know when you have a chance to try it out? Closing this issue in the meantime.
