Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mplex salvage operations, part II #102

Merged
merged 14 commits into from
Mar 3, 2022
Merged

Mplex salvage operations, part II #102

merged 14 commits into from
Mar 3, 2022

Conversation

vyzo
Copy link
Contributor

@vyzo vyzo commented Mar 2, 2022

This is part two of the salvage operations in #99.

We limit the buffers used by mplex to 4k, so that we can cut down our memory precommit to just 36Kb per connection.

@vyzo vyzo requested review from Stebalien and marten-seemann March 2, 2022 09:21
Copy link
Contributor

@marten-seemann marten-seemann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partial review.

multiplex.go Show resolved Hide resolved
multiplex.go Outdated Show resolved Hide resolved
multiplex.go Outdated Show resolved Hide resolved
multiplex.go Show resolved Hide resolved
multiplex.go Outdated Show resolved Hide resolved
multiplex.go Outdated Show resolved Hide resolved
horrible api; what can you do?
@vyzo
Copy link
Contributor Author

vyzo commented Mar 2, 2022

I pushed an additional commit, make timer channel draining robust.

Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned that this is going to be really slow.

  1. Can we benchmark it?
  2. Maybe we can lazily try to reserve more memory (on demand)? That is, if we're under pressure, we use the pre-reserved buffers. If we're not under pressure, we temporarily reserve more, then free the larger buffer.

multiplex.go Outdated

if !recvTimeout.Stop() {
select {
case <-recvTimeout.C:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't do what you think it does.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically, the timeout can fire immediately after this check and then we'll timeout on the read immediately.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, it has been stopped.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. If we can guarantee that the timer was started before we called stop, we should read and block.
  2. If we can't guarantee that, we should find a way to guarantee that.

Copy link
Contributor Author

@vyzo vyzo Mar 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. We cant gurantee that the timer channel has been drained.
  2. We could use a boolean variable indicating whether it drained and decide on that.

side note: i really hate timers, they are a .45 footgun.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed, using a variable to track whether the channel has been drained.

multiplex.go Show resolved Hide resolved
@vyzo
Copy link
Contributor Author

vyzo commented Mar 2, 2022

the benchmark test says about 10% from raw performance.
Not necessarily representative, but not discouraging.
We should run a data transfer test over the network as well, but i dont think there will be any difference. We tested the previous iteration with 4 buffers and it was fine.

We could try to get more buffers and return them if unused i guess, we can do that in get/put buffer.

multiplex.go Outdated Show resolved Hide resolved
multiplex.go Show resolved Hide resolved
multiplex.go Outdated
var err error
for i := 0; i < MaxBuffers; i++ {
// up-front reserve memory for the essential buffers (1 input, 1 output + the reader buffer)
if err := mp.memoryManager.ReserveMemory(3*BufferSize, 255); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where's the input/output buffer? I see the reader buffer, but that's it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we get it from the pool and return it.

See getBuffer*/putBuffer*.

multiplex.go Outdated Show resolved Hide resolved
multiplex.go Outdated

if !recvTimeout.Stop() {
select {
case <-recvTimeout.C:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically, the timeout can fire immediately after this check and then we'll timeout on the read immediately.

@Stebalien
Copy link
Member

the benchmark test says about 10% from raw performance.

10% single stream? I guess we do have double-buffering so maybe that's helping. But I would consider allocating the memory if we have it.

I expect we'll want to use the same technique for yamux, etc. as well. None of the issues we're running into here are mplex specific.

@Stebalien
Copy link
Member

I want to take a look tomorrow morning, but don't block on my review. It looks mostly correct (modulo the timer issue), but I want to re-assure myself that we're correctly counting buffers.

I'm fine merging this, then testing on the gateways, then maybe consider reserving more memory on-demand if there are any performance issues.

@vyzo
Copy link
Contributor Author

vyzo commented Mar 2, 2022

fixed the timer issue in 152fa80

Let's sleep on it, and take another look tomorrow.
I am mostly feeling fine about it, so let's test it in the real network.

@marten-seemann
Copy link
Contributor

No significant performance impact on a cross-atlantic single-stream data transfer. As we know, this is a totally insufficient way to measure muxer performance, but at least this one metric is not getting worse.

@marten-seemann marten-seemann merged commit b646332 into master Mar 3, 2022
@Stebalien
Copy link
Member

No significant performance impact on a cross-atlantic single-stream data transfer. As we know, this is a totally insufficient way to measure muxer performance, but at least this one metric is not getting worse.

Eh, that's probably good enough for mplex. I was mostly concerned that it would slow to a crawl on any non-loopback connection.

@BigLep BigLep mentioned this pull request Mar 4, 2022
69 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants