Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correct the coefficient value of timestamp nanoseconds when writing to Ion binary #177

Merged
merged 10 commits into from
Jun 3, 2021

Conversation

byronlin13
Copy link
Contributor

@byronlin13 byronlin13 commented May 28, 2021

Issue #, if available:

This PR is to address digest mismatch error when passing various timestamp objects into QLDB driver for Go.

Timestamps tested:

  • timestamp := time.Now()
  • timestamp := time.Now().UTC()
  •   millis := int64(1608218715340)
      t := time.Unix(0, millis*1000000)
      timeStamp := ion.NewTimestampWithFractionalSeconds(t, ion.TimestampPrecisionNanosecond, ion.TimezoneUTC, 3)
    
  • timestamp := ion.NewTimestampWithFractionalSeconds(time.Now().UTC(), ion.TimestampPrecisionNanosecond, ion.TimezoneUTC, 9)
  • timestamp := ion.NewTimestampWithFractionalSeconds(time.Now().UTC(), ion.TimestampPrecisionSecond, ion.TimezoneUTC, 9)
  • timestamp, _ := ion.ParseTimestamp(t.Format("2006-01-02T15:04:05.000000Z"))
  • timestamp := ion.NewTimestamp(t, ion.TimestampPrecisionNanosecond, ion.TimezoneUTC)
  • timestamp := time.Date(2021, time.May, 25, 13, 41, 31, 90000, time.UTC)

Description of changes:
Writing a timestamp nanosecond into Ion binary will write bytes of two values: Coefficient and exponent. These two values are used to calculate the nanoseconds using the equation: coefficient * 10 ^ -exponent.

For example a timestamp: 2021-05-27 00:19:05.000050000 has coefficient: 50000 and exponent: 9.

Inside encodeTimeDate function, we get the correct coefficient by calling time.Nanosecond which returns 50000 however the exponent is calculated by converting the coefficient to string and getting the length which returns 5 and is incorrect. The fix is to set the exponent to 9 if there are nanoseconds because time.time will always contain 9 precision units.

The next step is to read the binary ion timestamps correctly. Original implementation was to look at the coefficient value and remove trailing zeros but the binary ion already contains a byte containing the exponent so we can just use that.

Has gone through internal BQ review with comments in a PR to my fork.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Byron Lin added 6 commits May 27, 2021 00:38
Writing a timestamp nanosecond into Ion binary will write bytes of two values: Coefficient and exponent. These two values are used to calculate the nanoseconds using the equation: coefficient * 10 ^ -exponent.

For example a timestamp:  `2021-05-27 00:19:05.000050000` has coefficient: `50000` and exponent: `9`.

Inside marshal binary function, we get the correct coefficient by calling `time.Nanosecond` which returns `50000` however the exponent is calculated by converting the coefficient to string and getting the length which returns `5` and is incorrect.

We can set the exponent to `9` if there are nanoseconds because it will always be `9` precision units.

The next step is to read the binary ion timestamps correctly. Original implementation was to look at the cofficient value and remove trailing zeros but the binary ion already contains a byte containing the exponent so we can just use that.
* Update doc string comments to be more focused on the function at hand
* Removed redundant logic in TruncateNanosecond()
* Update index to timeZoneIndex
* Update numTrailingZeros to numZerosNeeded
@codecov-commenter
Copy link

codecov-commenter commented May 28, 2021

Codecov Report

Merging #177 (b3422b0) into master (4e5ade8) will decrease coverage by 0.05%.
The diff coverage is 87.17%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #177      +/-   ##
==========================================
- Coverage   74.67%   74.61%   -0.06%     
==========================================
  Files          25       25              
  Lines        4928     4925       -3     
==========================================
- Hits         3680     3675       -5     
- Misses        743      744       +1     
- Partials      505      506       +1     
Impacted Files Coverage Δ
ion/timestamp.go 67.09% <86.66%> (-0.32%) ⬇️
ion/bitstream.go 79.49% <87.50%> (+0.02%) ⬆️
ion/marshal.go 70.40% <100.00%> (-0.60%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4e5ade8...b3422b0. Read the comment docs.

ion/timestamp.go Outdated
// Specify trailing zeros if fractional precision is less than the nanoseconds.
// e.g. A timestamp: 2021-05-25T13:41:31.00001234 with fractional precision: 2 will return "2021-05-25 13:41:31.00"
ns := ts.dateTime.Nanosecond()
if ts.dateTime.Nanosecond() == 0 || maxFractionalPrecision-len(strconv.Itoa(ns)) >= int(ts.numFractionalSeconds) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if ns == 0

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, done.

ion/timestamp.go Outdated
zeros.WriteByte('.')
numZerosNeeded = int(ts.numFractionalSeconds)
} else {
dotIndex := strings.LastIndex(format, ".")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

decimalPlaceIndex

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

ion/timestamp.go Outdated
} else {
dotIndex := strings.LastIndex(format, ".")
if dotIndex != -1 {
numZerosNeeded = int(ts.numFractionalSeconds) - (timeZoneIndex - dotIndex) + 1
Copy link
Contributor

@justing-bq justing-bq May 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

decimalPlaceIndex := strings.LastIndex(format, ".")
if decimalPlaceIndex != -1 {
	decimalPlacesOccupied := timeZoneIndex - decimalPlaceIndex - 1
	numZerosNeeded = int(ts.numFractionalSeconds) - decimalPlacesOccupied
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

ion/timestamp.go Outdated
@@ -159,30 +161,16 @@ func NewTimestampFromStr(dateStr string, precision TimestampPrecision, kind Time
if precision >= TimestampPrecisionNanosecond {
pointIdx := strings.LastIndex(dateStr, ".")
if pointIdx != -1 {
nonZeroFraction := false

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: let's get rid of this blank line.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

Byron Lin added 2 commits May 31, 2021 11:42
* Remove blank line
* Reuse a variable instead of defining it
*Rename dotIndex to decimalPlaceIndex
*Create new variable: decimalPlacesOccupied to make logic more clear
dateTime: dateTime,
precision: tt.fields.precision,
kind: kind,
numFractionalSeconds: tt.fields.numFractionalSeconds,
Copy link
Contributor

@zslayton zslayton Jun 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I missing something, or should this field have a name like numFractionalSecondsDecimalPlaces? (i.e. It's a number of decimal places, not a number of fractional seconds.)

EDIT: To be clear: I'm not asking you to rename it as I think that would be a breaking change at this point. I'm asking you to confirm my understanding.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is correct.

  • timestamp, err := ParseTimestamp("2006-01-02T15:04:05.123Z") will have timestamp.numFractionalSeconds = 3
  • timestamp, err := ParseTimestamp("2006-01-02T15:04:05.123000Z") will have timestamp.numFractionalSeconds = 6

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying. I find this naming confusing, but it's part of the public API so we can't change it without a major version bump.

ion/marshal.go Outdated
@@ -446,8 +445,9 @@ func (m *Encoder) encodeTimeDate(v reflect.Value) error {
// Get number of fractional seconds precisions
ns := t.Nanosecond()
numFractionalSeconds := 0
// Go native time object will always have 9 precision units when nanoseconds > 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does the precision change when nanoseconds > 0? Why isn't it a 0 of the same precision?

Copy link
Contributor Author

@byronlin13 byronlin13 Jun 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For time.time objects, if the nanosecond is 0 or not specified, it is treated as no nanoseconds.

timeObject, err := time.Parse(time.RFC3339, "2000-01-01T01:01:01.00Z") will have timeObject.nanoseconds = 0

  • The string representation of this object is: 2000-01-01 01:01:01 +0000 UTC

timeObject, err := time.Parse(time.RFC3339, "2000-01-01T01:01:01.01Z") will have timeObject.nanoseconds = 10000000

  • The string representation of this object is: 2000-01-01 01:01:01.01 +0000 UTC

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Time type documentation says:

A Time represents an instant in time with nanosecond precision.

Unlike Ion's Timestamp type, which supports variable precision, it seems like time.Time always has nanosecond precision. If that's true, I would expect a time.Time with zero nanoseconds to be written out as a Timestamp with a fractional seconds precision of 9 decimal places.

For time.time objects, if the nanosecond is 0 or not specified, it is treated as no nanoseconds.

Could you please point me to the relevant docs for this? How do we distinguish between 0 or not specified in a time.Time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct, I was basing my interpretation on the fact that time.nanoseconds = 0 and the string representation was not showing fractional seconds 🤦 .

I removed the nanoseconds > 0 check in encodeTimeDate().

I also added guards to NewDateTimestamp(), NewTimestamp() and set fractionalPrecision = 9 if the timestamp precision is TimestampPrecisionNanosecond

Copy link
Contributor

@tgregg tgregg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good, one comment and one question below.

Byron Lin added 2 commits June 2, 2021 16:23
Previously we were collecting the timestamp precision (aka exponent) by peeking into a byte however the spec states timestamp precision is a VarInt.  Updated the logic to retrieve the precision by parsing the VarInt within readNsecs() -> readDecimal().
…isionNanosecond

When we create a timestamp object from Golang's time object, we were specifying the precision = 9 if nanoseconds > 0. This is incorrect because according to Go time docs, time object represents an instant in time with nanosecond precision.

The change is to set fractional precision = 9 when creating a timestamp object from Go time object.

Another change added was to add a guard to timestamp creation functions and set fractional precision = 9 if timestamp precision =  TimestampPrecisionNanosecond. Otherwise set it to 0.
@byronlin13 byronlin13 requested review from zslayton and tgregg June 3, 2021 17:07
}

exponent := uint8(d.scale)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a comment explaining that this cast should be safe because the max precision of 9 can fit inside uint8.

Copy link
Contributor Author

@byronlin13 byronlin13 Jun 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is not true because the Ion timestamp spec mentions:

Fractional seconds are allowed, with at least one digit of precision and an unlimited maximum.

Other Ion implementations can support arbitrary precision so Ion binary generated from Ion Java can have precision reduced.

Currently Ion Go support timestamps up to 9 precision units.

I created a ticket to update timestamps to support unlimited precision.

@@ -186,7 +186,7 @@ func TestWriteBinaryTimestamp(t *testing.T) {
nowish, _ := NewTimestampFromStr("2019-08-04T18:15:43.863494+10:00", TimestampPrecisionNanosecond, TimezoneLocal)

testBinaryWriter(t, eval, func(w Writer) {
assert.NoError(t, w.WriteTimestamp(NewTimestamp(time.Time{}, TimestampPrecisionNanosecond, TimezoneUTC)))
assert.NoError(t, w.WriteTimestamp(NewTimestamp(time.Time{}, TimestampPrecisionSecond, TimezoneUTC)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change?

Copy link
Contributor Author

@byronlin13 byronlin13 Jun 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the test binary data doesn't have nanoseconds. It was failing because calling NewTimestamp with TimestampPrecisionNanosecond will set fractional precision: 9.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants