Remove custom binary-conversion functions #5542
// we'll be marshaling time objects a lot
panic(err)
}
timeLen = len(binTime)
This looks odd? What's this for?
Further down, in (*point).MarshalBinary(), we binary-encode the time.Time objects. And rather than guessing, or making assumptions that the encoded size will never change, I added this to figure out the encoded size on startup, then use that to allocate the required space later.
I could marshal the time first and use the len() of the resulting data to allocate the final slice, though.
I think that would be better, since the time has to be marshaled anyway.
3bbcf62 to 53561bd
@@ -489,13 +489,17 @@ func (e *Engine) writeBlocks(bkt *bolt.Bucket, a [][]byte) error {
// Encode block in the following format:
// tmax int64
// data []byte (snappy compressed)
value := append(u64tob(uint64(tmax)), snappy.Encode(nil, block)...)
buf := make([]byte, 16+snappy.MaxEncodedLen(len(block)))
binary.BigEndian.PutUint64(buf[0:8], uint64(tmin))
I think this swapped tmin and tmax from how they were previously encoded.
Nope. tmin is in buf[0:8] and the old value (tmax followed by block) is in buf[8:]. Then at line 499 (new number), you can see that the key is buf[:8] (the old tmin), and the value is buf[8:] (the old value).
I see.
53561bd to 183782f
if err != nil {
return err
}
func (a *indexEntries) WriteTo(w io.Writer) (total int64, err error) {
@jwilder Is this acceptable? Each entry is encoded into the same buffer, and then written one-at-a-time to the passed-in io.Writer. The method also now adheres to io.WriterTo instead of clashing method names with io.Writer.
I think that would work. AppendTo doesn't feel like the right name, but I can't think of a better one.
I've seen AppendX used in both math/big and strconv, but for formatted string output. I'm open to changing the name to something more appropriate, though.
183782f to c360bea
c360bea to 929cc8a
Also cleaned up some excess allocations, and other cruft from the code
929cc8a to dc8ed79
👍
Remove custom binary-conversion functions
I didn't like the look of one of the utility functions I came across, so I went on a mission to purge its evil ways from our codebase. It was hiding allocations...
During the purge, I came across a few places where we were encoding things in a non-optimal manner, so I cleaned them up some in an attempt to reduce allocations and copying of data.
One of the cleanups was in the TSM WAL code, where the old encoding probably caused the corrupt file seen in #5455. The old code can be perused for more detail, but the short version is that a WAL entry with lots of long strings would get truncated, on a per-key basis. That meant one key would be shorted on space, and the next key would be written into space that should have been occupied by the length-prefixed strings... nasty edge case. The fix there was to traverse the entire entry and all of its values, then do one large allocation of the exact size required.
I was concerned about the performance impact, although I was optimistic it would improve. Here are my local benchmarks of the changes (only the interesting/significant ones):