Remove custom binary-conversion functions #5542

joelegasse · 2016-02-04T21:16:20Z

I didn't like the look of one of the utility functions I came across, so I went on a mission to purge its evil ways from our codebase. It was hiding allocations...

During the purge, I came across a few places where we were encoding things in a non-optimal manner, so I cleaned them up some in an attempt to reduce allocations and copying of data.

One of the cleanups was in the TSM WAL code, which probably caused the corrupt file seen in #5455. The old code can be perused to see more detail, but the short version is that a WAL entry with lots of long strings would get truncated, but on a per-key basis. That meant that one key would be shorted space, and the next key would be written in space that should be occupied by the length-prefixed strings... nasty edge case. The fix there was to traverse the entire entry, and all of it's values, and do one large allocation of the exact size required.

I was concerned about the performance impact, although I was optimistic it would improve. Here are my local benchmarks of the changes (only the interesting/significant ones):

benchmark                                    old ns/op      new ns/op      delta
BenchmarkWALSegmentWriter                    1528724        786889         -48.53%
BenchmarkIndirectIndex_UnmarshalBinary       3719743        2847992        -23.44%

benchmark                           old allocs     new allocs     delta
BenchmarkWALSegmentWriter           15019          5010           -66.64%
BenchmarkQueueAppend                11             9              -18.18%

benchmark                           old bytes     new bytes     delta
BenchmarkWALSegmentWriter           496923        224383        -54.85%
BenchmarkQueueAppend                109           94            -13.76%

jwilder · 2016-02-04T21:27:44Z

models/points.go

+		// we'll be marshaling time objects a lot
+		panic(err)
+	}
+	timeLen = len(binTime)


This looks odd? What's this for?

Further down, in (*point).MarshalBinary(), we binary-encode the time.Time objects. And rather than guessing, or making assumptions that the encoded size will never change, I added this to figure out the encoded size on startup, then use that to allocate the required space later.

I could marshal the time first and use the len() of the resulting data to allocate the final slice, though.

I think that would be better since the time has to be marshal anyways.

jwilder · 2016-02-04T21:45:34Z

tsdb/engine/bz1/bz1.go

@@ -489,13 +489,17 @@ func (e *Engine) writeBlocks(bkt *bolt.Bucket, a [][]byte) error {
 			// Encode block in the following format:
 			//   tmax int64
 			//   data []byte (snappy compressed)
-			value := append(u64tob(uint64(tmax)), snappy.Encode(nil, block)...)
+			buf := make([]byte, 16+snappy.MaxEncodedLen(len(block)))
+			binary.BigEndian.PutUint64(buf[0:8], uint64(tmin))


I think this swapped tmin and tmax from how they were previously encoded.

Nope. tmin is in buf[0:8] and the old value (tmax followed by block) is in buf[8:]. Then at line 499 (new number), you can see that the key is buf[:8] (the old tmin), and the value is buf[8:] (the old value)

joelegasse · 2016-02-04T23:06:13Z

tsdb/engine/tsm1/reader.go

-		if err != nil {
-			return err
-		}
+func (a *indexEntries) WriteTo(w io.Writer) (total int64, err error) {


@jwilder Is this acceptable? Each entry is encoded into the same buffer, and then written one-at-a-time to the passed-in io.Writer. The method also now adheres to io.WriterTo instead of clashing method names with io.Writer

I think that would work. AppendTo doesn't feel like the right name, but I can't think of a better one.

I've seen AppendX used in both math/big and strconv, but for formatted string output. I'm open to changing the name to something more appropriate, though

Also cleaned up some excess allocations, and other cruft from the code

jwilder · 2016-02-18T20:43:14Z

👍

Remove custom binary-conversion functions

jwilder reviewed Feb 4, 2016
View reviewed changes

joelegasse force-pushed the jl-binary-cleanup branch from 3bbcf62 to 53561bd Compare February 4, 2016 21:40

jwilder reviewed Feb 4, 2016
View reviewed changes

joelegasse force-pushed the jl-binary-cleanup branch from 53561bd to 183782f Compare February 4, 2016 22:49

joelegasse reviewed Feb 4, 2016
View reviewed changes

joelegasse force-pushed the jl-binary-cleanup branch from 183782f to c360bea Compare February 11, 2016 17:24

jwilder force-pushed the jl-binary-cleanup branch from c360bea to 929cc8a Compare February 18, 2016 00:52

Remove custom binary-conversion functions

dc8ed79

Also cleaned up some excess allocations, and other cruft from the code

joelegasse force-pushed the jl-binary-cleanup branch from 929cc8a to dc8ed79 Compare February 18, 2016 19:02

joelegasse added a commit that referenced this pull request Feb 18, 2016

Merge pull request #5542 from influxdata/jl-binary-cleanup

b709f32

Remove custom binary-conversion functions

joelegasse merged commit b709f32 into master Feb 18, 2016

joelegasse deleted the jl-binary-cleanup branch February 18, 2016 20:57

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Remove custom binary-conversion functions #5542

Remove custom binary-conversion functions #5542

joelegasse commented Feb 4, 2016

jwilder Feb 4, 2016

joelegasse Feb 4, 2016

joelegasse Feb 4, 2016

jwilder Feb 4, 2016

jwilder Feb 4, 2016

joelegasse Feb 4, 2016

jwilder Feb 4, 2016

joelegasse Feb 4, 2016

jwilder Feb 4, 2016

joelegasse Feb 4, 2016

jwilder commented Feb 18, 2016

Remove custom binary-conversion functions #5542

Remove custom binary-conversion functions #5542

Conversation

joelegasse commented Feb 4, 2016

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jwilder commented Feb 18, 2016