-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Correct the coefficient value of timestamp nanoseconds when writing to Ion binary #177
Correct the coefficient value of timestamp nanoseconds when writing to Ion binary #177
Conversation
Writing a timestamp nanosecond into Ion binary will write bytes of two values: Coefficient and exponent. These two values are used to calculate the nanoseconds using the equation: coefficient * 10 ^ -exponent. For example a timestamp: `2021-05-27 00:19:05.000050000` has coefficient: `50000` and exponent: `9`. Inside marshal binary function, we get the correct coefficient by calling `time.Nanosecond` which returns `50000` however the exponent is calculated by converting the coefficient to string and getting the length which returns `5` and is incorrect. We can set the exponent to `9` if there are nanoseconds because it will always be `9` precision units. The next step is to read the binary ion timestamps correctly. Original implementation was to look at the cofficient value and remove trailing zeros but the binary ion already contains a byte containing the exponent so we can just use that.
* Update doc string comments to be more focused on the function at hand * Removed redundant logic in TruncateNanosecond()
* Update index to timeZoneIndex * Update numTrailingZeros to numZerosNeeded
Codecov Report
@@ Coverage Diff @@
## master #177 +/- ##
==========================================
- Coverage 74.67% 74.61% -0.06%
==========================================
Files 25 25
Lines 4928 4925 -3
==========================================
- Hits 3680 3675 -5
- Misses 743 744 +1
- Partials 505 506 +1
Continue to review full report at Codecov.
|
ion/timestamp.go
Outdated
// Specify trailing zeros if fractional precision is less than the nanoseconds. | ||
// e.g. A timestamp: 2021-05-25T13:41:31.00001234 with fractional precision: 2 will return "2021-05-25 13:41:31.00" | ||
ns := ts.dateTime.Nanosecond() | ||
if ts.dateTime.Nanosecond() == 0 || maxFractionalPrecision-len(strconv.Itoa(ns)) >= int(ts.numFractionalSeconds) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if ns == 0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, done.
ion/timestamp.go
Outdated
zeros.WriteByte('.') | ||
numZerosNeeded = int(ts.numFractionalSeconds) | ||
} else { | ||
dotIndex := strings.LastIndex(format, ".") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
decimalPlaceIndex
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
ion/timestamp.go
Outdated
} else { | ||
dotIndex := strings.LastIndex(format, ".") | ||
if dotIndex != -1 { | ||
numZerosNeeded = int(ts.numFractionalSeconds) - (timeZoneIndex - dotIndex) + 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
decimalPlaceIndex := strings.LastIndex(format, ".")
if decimalPlaceIndex != -1 {
decimalPlacesOccupied := timeZoneIndex - decimalPlaceIndex - 1
numZerosNeeded = int(ts.numFractionalSeconds) - decimalPlacesOccupied
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
ion/timestamp.go
Outdated
@@ -159,30 +161,16 @@ func NewTimestampFromStr(dateStr string, precision TimestampPrecision, kind Time | |||
if precision >= TimestampPrecisionNanosecond { | |||
pointIdx := strings.LastIndex(dateStr, ".") | |||
if pointIdx != -1 { | |||
nonZeroFraction := false | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: let's get rid of this blank line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed
* Remove blank line * Reuse a variable instead of defining it
*Rename dotIndex to decimalPlaceIndex *Create new variable: decimalPlacesOccupied to make logic more clear
dateTime: dateTime, | ||
precision: tt.fields.precision, | ||
kind: kind, | ||
numFractionalSeconds: tt.fields.numFractionalSeconds, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I missing something, or should this field have a name like numFractionalSecondsDecimalPlaces
? (i.e. It's a number of decimal places, not a number of fractional seconds.)
EDIT: To be clear: I'm not asking you to rename it as I think that would be a breaking change at this point. I'm asking you to confirm my understanding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is correct.
timestamp, err := ParseTimestamp("2006-01-02T15:04:05.123Z")
will have timestamp.numFractionalSeconds = 3timestamp, err := ParseTimestamp("2006-01-02T15:04:05.123000Z")
will have timestamp.numFractionalSeconds = 6
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for clarifying. I find this naming confusing, but it's part of the public API so we can't change it without a major version bump.
ion/marshal.go
Outdated
@@ -446,8 +445,9 @@ func (m *Encoder) encodeTimeDate(v reflect.Value) error { | |||
// Get number of fractional seconds precisions | |||
ns := t.Nanosecond() | |||
numFractionalSeconds := 0 | |||
// Go native time object will always have 9 precision units when nanoseconds > 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does the precision change when nanoseconds > 0
? Why isn't it a 0 of the same precision?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For time.time
objects, if the nanosecond is 0 or not specified, it is treated as no nanoseconds.
timeObject, err := time.Parse(time.RFC3339, "2000-01-01T01:01:01.00Z")
will have timeObject.nanoseconds = 0
- The string representation of this object is:
2000-01-01 01:01:01 +0000 UTC
timeObject, err := time.Parse(time.RFC3339, "2000-01-01T01:01:01.01Z")
will have timeObject.nanoseconds = 10000000
- The string representation of this object is:
2000-01-01 01:01:01.01 +0000 UTC
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Time
type documentation says:
A Time represents an instant in time with nanosecond precision.
Unlike Ion's Timestamp
type, which supports variable precision, it seems like time.Time
always has nanosecond precision. If that's true, I would expect a time.Time
with zero nanoseconds to be written out as a Timestamp
with a fractional seconds precision of 9 decimal places.
For time.time objects, if the nanosecond is 0 or not specified, it is treated as no nanoseconds.
Could you please point me to the relevant docs for this? How do we distinguish between 0
or not specified
in a time.Time
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are correct, I was basing my interpretation on the fact that time.nanoseconds = 0 and the string representation was not showing fractional seconds 🤦 .
I removed the nanoseconds > 0
check in encodeTimeDate()
.
I also added guards to NewDateTimestamp()
, NewTimestamp()
and set fractionalPrecision = 9
if the timestamp precision is TimestampPrecisionNanosecond
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good, one comment and one question below.
Previously we were collecting the timestamp precision (aka exponent) by peeking into a byte however the spec states timestamp precision is a VarInt. Updated the logic to retrieve the precision by parsing the VarInt within readNsecs() -> readDecimal().
…isionNanosecond When we create a timestamp object from Golang's time object, we were specifying the precision = 9 if nanoseconds > 0. This is incorrect because according to Go time docs, time object represents an instant in time with nanosecond precision. The change is to set fractional precision = 9 when creating a timestamp object from Go time object. Another change added was to add a guard to timestamp creation functions and set fractional precision = 9 if timestamp precision = TimestampPrecisionNanosecond. Otherwise set it to 0.
} | ||
|
||
exponent := uint8(d.scale) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add a comment explaining that this cast should be safe because the max precision of 9 can fit inside uint8.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is not true because the Ion timestamp spec mentions:
Fractional seconds are allowed, with at least one digit of precision and an unlimited maximum.
Other Ion implementations can support arbitrary precision so Ion binary generated from Ion Java can have precision reduced.
Currently Ion Go support timestamps up to 9 precision units.
I created a ticket to update timestamps to support unlimited precision.
@@ -186,7 +186,7 @@ func TestWriteBinaryTimestamp(t *testing.T) { | |||
nowish, _ := NewTimestampFromStr("2019-08-04T18:15:43.863494+10:00", TimestampPrecisionNanosecond, TimezoneLocal) | |||
|
|||
testBinaryWriter(t, eval, func(w Writer) { | |||
assert.NoError(t, w.WriteTimestamp(NewTimestamp(time.Time{}, TimestampPrecisionNanosecond, TimezoneUTC))) | |||
assert.NoError(t, w.WriteTimestamp(NewTimestamp(time.Time{}, TimestampPrecisionSecond, TimezoneUTC))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the test binary data doesn't have nanoseconds. It was failing because calling NewTimestamp with TimestampPrecisionNanosecond
will set fractional precision: 9.
Issue #, if available:
This PR is to address digest mismatch error when passing various timestamp objects into QLDB driver for Go.
Timestamps tested:
timestamp := time.Now()
timestamp := time.Now().UTC()
timestamp := ion.NewTimestampWithFractionalSeconds(time.Now().UTC(), ion.TimestampPrecisionNanosecond, ion.TimezoneUTC, 9)
timestamp := ion.NewTimestampWithFractionalSeconds(time.Now().UTC(), ion.TimestampPrecisionSecond, ion.TimezoneUTC, 9)
timestamp, _ := ion.ParseTimestamp(t.Format("2006-01-02T15:04:05.000000Z"))
timestamp := ion.NewTimestamp(t, ion.TimestampPrecisionNanosecond, ion.TimezoneUTC)
timestamp := time.Date(2021, time.May, 25, 13, 41, 31, 90000, time.UTC)
Description of changes:
Writing a timestamp nanosecond into Ion binary will write bytes of two values: Coefficient and exponent. These two values are used to calculate the nanoseconds using the equation:
coefficient * 10 ^ -exponent
.For example a timestamp:
2021-05-27 00:19:05.000050000
has coefficient:50000
and exponent:9
.Inside
encodeTimeDate
function, we get the correct coefficient by callingtime.Nanosecond
which returns50000
however the exponent is calculated by converting the coefficient to string and getting the length which returns5
and is incorrect. The fix is to set the exponent to9
if there are nanoseconds becausetime.time
will always contain 9 precision units.The next step is to read the binary ion timestamps correctly. Original implementation was to look at the coefficient value and remove trailing zeros but the binary ion already contains a byte containing the exponent so we can just use that.
Has gone through internal BQ review with comments in a PR to my fork.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.