Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove custom binary-conversion functions #5542

Merged
merged 1 commit into from
Feb 18, 2016
Merged

Conversation

joelegasse
Copy link
Contributor

I didn't like the look of one of the utility functions I came across, so I went on a mission to purge its evil ways from our codebase. It was hiding allocations...

During the purge, I came across a few places where we were encoding things in a non-optimal manner, so I cleaned them up some in an attempt to reduce allocations and copying of data.

One of the cleanups was in the TSM WAL code, which probably caused the corrupt file seen in #5455. The old code can be perused to see more detail, but the short version is that a WAL entry with lots of long strings would get truncated, but on a per-key basis. That meant that one key would be shorted space, and the next key would be written in space that should be occupied by the length-prefixed strings... nasty edge case. The fix there was to traverse the entire entry, and all of it's values, and do one large allocation of the exact size required.

I was concerned about the performance impact, although I was optimistic it would improve. Here are my local benchmarks of the changes (only the interesting/significant ones):

benchmark                                    old ns/op      new ns/op      delta
BenchmarkWALSegmentWriter                    1528724        786889         -48.53%
BenchmarkIndirectIndex_UnmarshalBinary       3719743        2847992        -23.44%

benchmark                           old allocs     new allocs     delta
BenchmarkWALSegmentWriter           15019          5010           -66.64%
BenchmarkQueueAppend                11             9              -18.18%

benchmark                           old bytes     new bytes     delta
BenchmarkWALSegmentWriter           496923        224383        -54.85%
BenchmarkQueueAppend                109           94            -13.76%

// we'll be marshaling time objects a lot
panic(err)
}
timeLen = len(binTime)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks odd? What's this for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Further down, in (*point).MarshalBinary(), we binary-encode the time.Time objects. And rather than guessing, or making assumptions that the encoded size will never change, I added this to figure out the encoded size on startup, then use that to allocate the required space later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could marshal the time first and use the len() of the resulting data to allocate the final slice, though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would be better since the time has to be marshal anyways.

@@ -489,13 +489,17 @@ func (e *Engine) writeBlocks(bkt *bolt.Bucket, a [][]byte) error {
// Encode block in the following format:
// tmax int64
// data []byte (snappy compressed)
value := append(u64tob(uint64(tmax)), snappy.Encode(nil, block)...)
buf := make([]byte, 16+snappy.MaxEncodedLen(len(block)))
binary.BigEndian.PutUint64(buf[0:8], uint64(tmin))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this swapped tmin and tmax from how they were previously encoded.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope. tmin is in buf[0:8] and the old value (tmax followed by block) is in buf[8:]. Then at line 499 (new number), you can see that the key is buf[:8] (the old tmin), and the value is buf[8:] (the old value)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see.

if err != nil {
return err
}
func (a *indexEntries) WriteTo(w io.Writer) (total int64, err error) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jwilder Is this acceptable? Each entry is encoded into the same buffer, and then written one-at-a-time to the passed-in io.Writer. The method also now adheres to io.WriterTo instead of clashing method names with io.Writer

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would work. AppendTo doesn't feel like the right name, but I can't think of a better one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've seen AppendX used in both math/big and strconv, but for formatted string output. I'm open to changing the name to something more appropriate, though

Also cleaned up some excess allocations, and other cruft from the code
@jwilder
Copy link
Contributor

jwilder commented Feb 18, 2016

👍

joelegasse added a commit that referenced this pull request Feb 18, 2016
Remove custom binary-conversion functions
@joelegasse joelegasse merged commit b709f32 into master Feb 18, 2016
@joelegasse joelegasse deleted the jl-binary-cleanup branch February 18, 2016 20:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants