vdk-trino: collect lineage for select/insert and rename table only #756
Why:
To make lineage collection more production ready, some improvements are needed.

What:
To reduce the load on the query engine, query plans are calculated only for insert/select queries.
For rename table queries the plan gives no useful information, so the query is parsed and the table names are extracted from it.
Counting the number of rows in the output table before and after the query is removed, again to reduce the burden on the query engine.

How has this been tested:
Tweaked the test_vdk_trino_lineage.py test to be more comprehensive and cover all scenarios.

What type of change are you making?
Bug fix (non-breaking change which fixes an issue) or a cosmetic change/minor improvement

Signed-off-by: Philip Alexiev ([email protected])
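The approach described above can be sketched roughly as follows. This is a minimal illustration only, not the code from this PR: the helper names `needs_plan` and `parse_rename_table` and the regular expression are assumptions made for the example, and the statement-type check is deliberately simplistic (for instance, it does not recognise a select that starts with a `WITH` clause).

```python
import re
from typing import Optional, Tuple

# Hypothetical helpers for illustration; not the vdk-trino implementation.

# Matches statements of the form: ALTER TABLE [IF EXISTS] <name> RENAME TO <new_name>
_RENAME_RE = re.compile(
    r"^\s*alter\s+table\s+(?:if\s+exists\s+)?(?P<src>\S+)"
    r"\s+rename\s+to\s+(?P<dst>[^\s;]+)\s*;?\s*$",
    re.IGNORECASE,
)


def needs_plan(query: str) -> bool:
    """Return True only for select/insert statements.

    Simplification: only the first whitespace-delimited word is inspected.
    """
    stripped = query.lstrip()
    first_word = stripped.split(None, 1)[0].lower() if stripped else ""
    return first_word in ("select", "insert")


def parse_rename_table(query: str) -> Optional[Tuple[str, str]]:
    """Return (source_table, destination_table) for a rename statement, else None."""
    match = _RENAME_RE.match(query)
    if match is None:
        return None
    return match.group("src"), match.group("dst")
```

With helpers like these, a caller would request a plan from the engine only when `needs_plan` returns True, fall back to `parse_rename_table` for rename statements, and skip lineage collection for everything else.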
I am not sure if you noticed, but the CI tests failed (ci/gitlab/gitlab.com -> Click Details): https://gitlab.com/vmware-analytics/versatile-data-kit/-/jobs/2176218096
Looks good to me. You can add a few more tests for some corner cases.
@tozka Thank you for the review and valuable comments.