fix: reduce memory held by deferred objects #96

Merged
merged 3 commits into from
Apr 20, 2024
22 changes: 17 additions & 5 deletions deferred.go
@@ -5,8 +5,15 @@ import (
 	"errors"
 	"fmt"
 	"io"
+	"sync"
 )

+var deferredBufferPool = sync.Pool{
+	New: func() any {
+		return bytes.NewBuffer(nil)
+	},
+}
+
 type Deferred struct {
 	Raw []byte
 }
@@ -24,10 +31,12 @@ func (d *Deferred) MarshalCBOR(w io.Writer) error {
 }

 func (d *Deferred) UnmarshalCBOR(br io.Reader) (err error) {
-	// Reuse any existing buffers.
-	reusedBuf := d.Raw[:0]
-	d.Raw = nil
-	buf := bytes.NewBuffer(reusedBuf)
Collaborator

Checking for understanding:

In the old code, reusedBuf was overallocated relative to the data in d.Raw, i.e. its capacity was greater than its length. When we assign d.Raw[:0] to reusedBuf, that excess capacity is carried along with it. Then, when we pass reusedBuf to bytes.NewBuffer, the buffer allocates even more capacity than reusedBuf needs, amplifying the allocation further.

Is this correct?

Collaborator Author

Mostly the second part. In the case we care about, reusedBuf was always empty. But bytes.Buffer generally over-allocates for better performance.
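
To make the over-allocation point concrete, here is a minimal sketch (not code from this PR; fillBuffer is a hypothetical helper) that writes a known number of bytes into a fresh bytes.Buffer and then compares the length and capacity of the slice returned by Bytes(). The exact capacity depends on the Go version, so no particular number is claimed here.

```go
package main

import (
	"bytes"
	"fmt"
)

// fillBuffer is a hypothetical helper for illustration only. It writes n
// bytes into a fresh bytes.Buffer one byte at a time, roughly the way a
// decoder appends small pieces, and returns the slice from Bytes().
func fillBuffer(n int) []byte {
	var buf bytes.Buffer
	chunk := []byte{0xAB}
	for i := 0; i < n; i++ {
		buf.Write(chunk)
	}
	// The returned slice aliases the buffer's backing array, so its
	// capacity is whatever the buffer grew to, not just n.
	return buf.Bytes()
}

func main() {
	raw := fillBuffer(1000)
	fmt.Printf("len=%d cap=%d\n", len(raw), cap(raw))
}
```

Any gap between len and cap in the output is memory that stays reachable for as long as the returned slice is held.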

+	buf := deferredBufferPool.Get().(*bytes.Buffer)
+
+	defer func() {
+		buf.Reset()
+		deferredBufferPool.Put(buf)
+	}()
Contributor

WriteMajorTypeHeaderBuf, which takes buf, seems to know precisely how many bytes it is going to write. Ideally, we should grow the buffer before writing, given that the function is called in a loop.

But doing that needs a bit more refactoring which seems to be beyond the scope of this PR. Please feel free to address this in a separate PR.

Collaborator Author

bytes.Buffer will expand by, IIRC, at least 512 bytes each time, so it shouldn't really matter.


 	// Allocate some scratch space.
 	scratch := make([]byte, maxHeaderSize)
@@ -90,6 +99,9 @@ func (d *Deferred) UnmarshalCBOR(br io.Reader) (err error) {
 			return fmt.Errorf("unhandled deferred cbor type: %d", maj)
 		}
 	}
-	d.Raw = buf.Bytes()
+	// Reuse existing buffer. Also, copy to "give back" the allocation in the byte buffer (which
+	// is likely significant).
+	d.Raw = d.Raw[:0]
+	d.Raw = append(d.Raw, buf.Bytes()...)
Collaborator

Checking for understanding:

In the old code, assigning buf.Bytes() to d.Raw brought along the buffer's overallocated capacity. Now, by assigning the result of append to d.Raw, this no longer happens. d.Raw keeps its old capacity, and append will only increase that capacity if the length of buf.Bytes() exceeds it.

Is this correct?

Collaborator Author

Ah, no. I'm keeping any overallocation from Raw (reusing the buffer) but removing any overallocation from the bytes.Buffer by copying.

Note: in most cases, d.Raw will be nil. It won't be nil in some special cases where, e.g., we're re-using the same one in a loop. In that case, we don't care about a bit of overallocation as we're re-using space.

The issue with bytes.Buffer is that we're over-allocating entirely new space.

 	return nil
 }