SAT: Implement assert that all fields in all streams contained at least one data point #8272


htrueman
Contributor

What

Closes #7967

How

Updated `TestBasicRead.test_read()` to run a new `_validate_field_appears_at_least_once` check.
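To illustrate the idea behind the check, here is a minimal sketch and not the code from this PR. It assumes fields are identified by slash-separated paths, that a `null` value does not count as a data point, and that array items share the path of the array field. The helper names `schema_field_paths`, `record_field_paths` and `fields_never_seen` are hypothetical.

```python
from typing import Any, Dict, Iterable, List, Set


def schema_field_paths(schema: Dict[str, Any], prefix: str = "") -> Set[str]:
    """Collect the path of every property declared in a JSON schema."""
    paths: Set[str] = set()
    for name, sub_schema in schema.get("properties", {}).items():
        path = f"{prefix}/{name}"
        paths.add(path)
        paths |= schema_field_paths(sub_schema, path)
    # Assumption: properties of array items live under the array field's path.
    items = schema.get("items")
    if isinstance(items, dict):
        paths |= schema_field_paths(items, prefix)
    return paths


def record_field_paths(value: Any, prefix: str = "") -> Set[str]:
    """Collect the path of every key that holds a non-null value."""
    paths: Set[str] = set()
    if isinstance(value, dict):
        for key, sub_value in value.items():
            if sub_value is None:  # assumption: null is not a data point
                continue
            path = f"{prefix}/{key}"
            paths.add(path)
            paths |= record_field_paths(sub_value, path)
    elif isinstance(value, list):
        for element in value:
            paths |= record_field_paths(element, prefix)
    return paths


def fields_never_seen(schema: Dict[str, Any], records: Iterable[Dict[str, Any]]) -> List[str]:
    """Return schema fields that no record populated, sorted for stable output."""
    seen: Set[str] = set()
    for record in records:
        seen |= record_field_paths(record)
    return sorted(schema_field_paths(schema) - seen)


example_schema = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "address": {
            "type": "object",
            "properties": {"city": {"type": "string"}, "zip": {"type": "string"}},
        },
        "tags": {
            "type": "array",
            "items": {"type": "object", "properties": {"name": {"type": "string"}}},
        },
    },
}
example_records = [
    {"id": 1, "address": {"city": "Kyiv"}},
    {"id": 2, "tags": [{"name": "a"}]},
]

print(fields_never_seen(example_schema, example_records))  # ['/address/zip']
```

In this sketch the stream would fail the check because no record ever carried `address.zip`.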

Recommended reading order

  1. airbyte-integrations/bases/source-acceptance-test/source_acceptance_test/tests/test_core.py

🚨 User Impact 🚨

This change would require us to fix SAT in all (or almost all) connectors. I think that can be done in a follow-up PR.

@htrueman htrueman temporarily deployed to more-secrets November 29, 2021 07:07 Inactive
@sherifnada
Contributor

tactical suggestion: to avoid breaking all connector builds, we should (temporarily) make this an opt-in flag rather than turned on by default, then create an epic to make it available for all connectors. once a critical mass of connectors is leveraging this option, we should make it turned on by default. WDYT?
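A rough sketch of what such an opt-in gate could look like follows. Only the option name `validate_data_points` appears in this PR's commit log; the `BasicReadOptions` container, the `run_data_point_check` helper, and the default of `False` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class BasicReadOptions:
    # Hypothetical options holder. Assumed default: the check is off (opt-in).
    validate_data_points: bool = False


def run_data_point_check(
    options: BasicReadOptions,
    find_missing_fields: Callable[[], List[str]],
) -> None:
    """Run the field-coverage check only when the connector opted in."""
    if not options.validate_data_points:
        return  # connectors that have not opted in keep passing
    missing = find_missing_fields()
    if missing:
        raise AssertionError(f"Fields with no data points: {missing}")
```

With a gate like this, existing connector builds are unaffected until each one enables the option, and flipping the default later is a one-line change.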

Contributor

@avida avida left a comment


Please reuse the existing codebase and change the test cases to use easy-to-understand data that represents each case. Also, please add cases for schemas with the oneOf keyword and for schemas with nested fields and arrays.

Update unit tests for _validate_field_appears_at_least_once.
@htrueman htrueman requested a review from avida November 29, 2021 16:02
Contributor

@avida avida left a comment


Good job, please add some oneOf test cases

…neOf/anyOf choices.

Add oneOf/anyOf unit tests.
…ssert-that-all-fields-in-all-streams-contained-at-least-one-data-point
Update CHANGELOG.md
@htrueman
Contributor Author

htrueman commented Nov 30, 2021

/publish connector=bases/source-acceptance-test

🕑 bases/source-acceptance-test https://github.com/airbytehq/airbyte/actions/runs/1522297022
✅ bases/source-acceptance-test https://github.com/airbytehq/airbyte/actions/runs/1522297022

@htrueman htrueman merged commit b265b6c into master Nov 30, 2021
@htrueman htrueman deleted the htrueman/7967-sat-assert-that-all-fields-in-all-streams-contained-at-least-one-data-point branch November 30, 2021 18:22
schlattk pushed a commit to schlattk/airbyte that referenced this pull request Jan 4, 2022
…st one data point (airbytehq#8272)

* Implement validation that each field in a stream has appeared at least once in some record.

* Add unit tests for `_validate_empty_streams` TestBasicRead method.

* Add validate_data_points basic read input option.

* Update `_validate_field_appears_at_least_once_in_stream` to support oneOf/anyOf choices.
Add oneOf/anyOf unit tests.

* Bump docker version.
Update CHANGELOG.md

* Fix test_core.py imports.
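
The commit list mentions support for oneOf/anyOf choices without spelling out the rule. One possible interpretation, sketched below, is that a field with several schema alternatives counts as covered when at least one alternative has all of its properties populated. That rule is an assumption and is not confirmed by this PR. The helpers `alternatives` and `uncovered` are hypothetical, `seen` is assumed to be a precomputed set of slash-separated paths observed in records, and array handling is left out for brevity.

```python
from typing import Any, Dict, List, Optional, Set


def alternatives(schema: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the oneOf/anyOf branches of a schema, or the schema itself."""
    for keyword in ("oneOf", "anyOf"):
        if keyword in schema:
            return schema[keyword]
    return [schema]


def uncovered(schema: Dict[str, Any], seen: Set[str], prefix: str = "") -> List[str]:
    """Return paths lacking data, judged against the best-covered alternative."""
    best: Optional[List[str]] = None
    for branch in alternatives(schema):
        missing: List[str] = []
        for name, sub_schema in branch.get("properties", {}).items():
            path = f"{prefix}/{name}"
            if path not in seen:
                missing.append(path)
            else:
                missing.extend(uncovered(sub_schema, seen, path))
        # Keep the branch with the fewest gaps; an empty list means covered.
        if best is None or len(missing) < len(best):
            best = missing
    return best or []


payment_schema = {
    "type": "object",
    "properties": {
        "payment": {
            "oneOf": [
                {"type": "object", "properties": {"card_number": {"type": "string"}}},
                {"type": "object", "properties": {"iban": {"type": "string"}}},
            ]
        }
    },
}

print(uncovered(payment_schema, {"/payment", "/payment/iban"}))  # []
print(uncovered(payment_schema, {"/payment"}))  # ['/payment/card_number']
```

Under this rule a stream whose records only ever use the `iban` branch still passes, while a stream that never populates any branch is reported against the first alternative.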