Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BigQuery Storage: Add more in-depth system tests #8992

Merged
merged 14 commits into from
Aug 21, 2019

Conversation

plamut
Copy link
Contributor

@plamut plamut commented Aug 8, 2019

Closes #8983.

This PR adds additional system tests for BigQuery Storage. It does not include the last item from the issue description, i.e. the long-running test, at least not until decided otherwise.

Things to mind

  • One of the tests depends on the fix in BigQuery: Fix schema recognition of struct field types #9001 to pass. Tests need to be re-run once that is merged.
  • (see the comment below) There is currently an inconsistent type returned for datetime columns (string instead of a datetime instance), affecting the same test.
    Created #9066 for that.

@plamut plamut added the api: bigquerystorage Issues related to the BigQuery Storage API. label Aug 8, 2019
@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Aug 8, 2019
# Compare datetime separately, AVRO and PYARROW return different object types,
# although they should both represent the same value.
expected_pattern = re.compile(r"2005-10-26( |T)19:49:41")
assert expected_pattern.match(str(rows[0]["datetime_field"]))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sanity check - is this expected, or should both AVRO and PYARROW return the exact same type?

Right now one returns a plain ISO datetime string, while the other returns a Timestamp object..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Answer:
"Expected" is a datetime.datetime object with no timezone. Actual is a string. That's an open issue.

@plamut
Copy link
Contributor Author

plamut commented Aug 9, 2019

FWIW, a system test failure with STRUCT records will be fixed in #9001.

@plamut plamut force-pushed the iss-8983 branch 2 times, most recently from e7b1ed2 to c5f5867 Compare August 12, 2019 12:59
@plamut plamut marked this pull request as ready for review August 12, 2019 15:34
@plamut plamut requested a review from a team August 12, 2019 15:34
@tseaver
Copy link
Contributor

tseaver commented Aug 12, 2019

@plamut System tests are failing in setup with the error, Field person_struct_field is type RECORD but has no schema.

@plamut
Copy link
Contributor Author

plamut commented Aug 12, 2019

@tseaver Indeed, that's the bug that will be fixed with #9001

Copy link
Contributor

@tswast tswast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! A few questions.

@plamut plamut added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Aug 14, 2019
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Aug 14, 2019
@plamut plamut added the needs work This is a pull request that needs a little love. label Aug 14, 2019
plamut added 3 commits August 15, 2019 11:26
A similar method is planned to be added to the library itself, and when
done, the _add_rows() will not be needed anymore.
Creating a client once per system tests session avoids the overhead
of authenticating before each test case.
Creating a client just once avoids the auth overhead on every system
test case.
@plamut plamut requested a review from tswast August 15, 2019 11:06
@plamut plamut removed the needs work This is a pull request that needs a little love. label Aug 15, 2019
@tswast
Copy link
Contributor

tswast commented Aug 21, 2019

I've filed https://github.com/googleapis/google-cloud-python/issues/9066 for the Avro DATETIME issue.

@plamut plamut merged commit dec9f69 into googleapis:master Aug 21, 2019
@plamut plamut deleted the iss-8983 branch August 21, 2019 07:42
HemangChothani pushed a commit to HemangChothani/google-cloud-python that referenced this pull request Aug 29, 2019
* Add additional BQ storage system test fixtures

* Add reader column selection system test

* Add basic reader system test

* Add reader with row filter system test

* Add reading data with snapshot system test

* Add reading column partitioned table system test

* Add system test for column types data conversions

* Add ingestion time partitioned table system test

* Add system test for resuming a read at an offset

* Remove unnecessary protobuf install in noxfile

* Add TODO comment to replace a test helper method

A similar method is planned to be added to the library itself, and when
done, the _add_rows() will not be needed anymore.

* Extract BQ client to session fixture in tests

Creating a client once per system tests session avoids the overhead
of authenticating before each test case.

* Only create BQ storage client once per test run

Creating a client just once avoids the auth overhead on every system
test case.

* Add common credentials fixture for system tests
emar-kar pushed a commit to MaxxleLLC/google-cloud-python that referenced this pull request Sep 18, 2019
* Add additional BQ storage system test fixtures

* Add reader column selection system test

* Add basic reader system test

* Add reader with row filter system test

* Add reading data with snapshot system test

* Add reading column partitioned table system test

* Add system test for column types data conversions

* Add ingestion time partitioned table system test

* Add system test for resuming a read at an offset

* Remove unnecessary protobuf install in noxfile

* Add TODO comment to replace a test helper method

A similar method is planned to be added to the library itself, and when
done, the _add_rows() will not be needed anymore.

* Extract BQ client to session fixture in tests

Creating a client once per system tests session avoids the overhead
of authenticating before each test case.

* Only create BQ storage client once per test run

Creating a client just once avoids the auth overhead on every system
test case.

* Add common credentials fixture for system tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigquerystorage Issues related to the BigQuery Storage API. cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BigQuery Storage: Add more in-depth system tests covering all data formats and field data types
5 participants