-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[multistage][test] add table expression tests #9817
Conversation
948c8b6
to
ef54003
Compare
ef54003
to
f24ae62
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, suggest adding two tests
}, | ||
"queries": [ | ||
{ | ||
"sql": "SELECT strCol FROM {tbl1} GROUP BY strCol" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- can we group by multiple columns?
- can we group by a function call?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- there's already a multi column group by but i added more
- also added group by without agg with single/multi columns (distinct basically)
- added function call on group by too. (alias will not be covered as they will be in the alias test cases)
Codecov Report
@@ Coverage Diff @@
## master #9817 +/- ##
=============================================
- Coverage 63.44% 34.88% -28.57%
+ Complexity 5326 199 -5127
=============================================
Files 1960 1972 +12
Lines 105350 105703 +353
Branches 15960 15998 +38
=============================================
- Hits 66840 36872 -29968
- Misses 33589 65707 +32118
+ Partials 4921 3124 -1797
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
] | ||
} | ||
}, | ||
"queries": [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also add the following ?
Non-correlated sub-query with HAVING
Without the subquery, we will first have to execute a query to find overall average and hardcode that value in HAVING clause of the second query.
SELECT group1, group2, AVG(col) AS groupAverage
FROM FOO
HAVING groupAverage > (
SELECT AVG(col) FROM FOO)
Similarly correlated sub-query with HAVING
SELECT group1, group2, AVG(col) AS groupAverage
FROM FOO as OUTER
HAVING groupAverage > (
SELECT AVG(col)
FROM FOO
WHERE group2 < OUTER.group2 )
I don't think correlated are supported yet though ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
correct correlate is not supported.
@@ -0,0 +1,93 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
High level question on table expression tests?
By table expressions do you mean CTEs ? CTEs create named queries using WITH
afaik.
May be we can call these Subqueries.json
to distinguish from CTEs ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
table expression as defined in postgresql 7.2 section. (e.g. all clauses that alters the behavior of a SQL after choosing the tables)
some of the sections we don't support is not included such as:
{ "sql": "SELECT strCol, intCol FROM {tbl1} GROUP BY strCol, intCol" }, | ||
{ "sql": "SELECT strCol, SUM(intCol) FROM {tbl1} GROUP BY strCol" }, | ||
{ "sql": "SELECT strCol, b.strCol2, (sum(a.intCol) * b.intCol) AS colAlias FROM {tbl1} a INNER JOIN {tbl2} b ON a.strCol = b.strCol1 GROUP BY strCol, b.strCol2, b.intCol" }, | ||
{ "sql": "SELECT MOD(b.intCol, 2), sum(a.intCol) AS colAlias FROM {tbl1} a INNER JOIN {tbl2} b ON a.strCol = b.strCol1 GROUP BY MOD(b.intCol, 2)" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we also ensure that logically equivalent SQL queries return same results ?
For example INNER JOIN query and a corresponding subquery with IN clause should return same results.
SELECT p.product_id, p.product_name
FROM
products p
INNER JOIN
categories c
ON p.category_id = c.category_id
WHERE c.category_id > 25
should return the same results as
SELECT p.product_id, p.product_name
FROM products p
WHERE p.category_id IN (
SELECT c.category_id
FROM categories c
WHERE c.category_id > 25
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
both are tested against a reference database. so yeah indirectly this is implied.
{ | ||
"ignored": true, | ||
"comments": "Relation Decorrelator not supported", | ||
"sql": "SELECT * FROM {tbl} AS a WHERE a.strCol IN (SELECT b.strCol FROM {tbl} AS b WHERE b.intCol = a.intCol + 1)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For subquery in WHERE
with IN
, can we also add NOT IN
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For IN
clause subquery, let's also add test that the exact same query that uses ANY
(supported by postgres) instead of IN
returns identical results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same goes for SOME
I guess
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any and some will be tested in the function test instead of the table expression test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added. none are supported but added with ignore flag
{ | ||
"ignored": true, | ||
"comments": "EXISTS not supported", | ||
"sql": "SELECT * FROM {tbl} AS a WHERE EXISTS (SELECT strCol FROM {tbl} AS b WHERE b.intCol = a.intCol + 1)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do the same for NOT EXISTS
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
{ "sql": "SELECT MOD(b.intCol, 2), sum(a.intCol) AS colAlias FROM {tbl1} a INNER JOIN {tbl2} b ON a.strCol = b.strCol1 GROUP BY MOD(b.intCol, 2)" }, | ||
{ "sql": "SELECT strCol, SUM(intCol) FROM {tbl1} GROUP BY strCol HAVING SUM(intCol) > 3" }, | ||
{ "sql": "SELECT strCol, SUM(intCol) FROM {tbl1} GROUP BY strCol HAVING AVG(intCol) > 1 AND MIN(intCol) < 10" }, | ||
{ "sql": "SELECT strCol, b.strCol2, (sum(a.intCol) * b.intCol) AS colAlias FROM {tbl1} a INNER JOIN {tbl2} b ON a.strCol = b.strCol1 GROUP BY strCol, b.strCol2, b.intCol HAVING avg(a.intCol) < 100" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's also add tests for subquery in the SELECT clause as a scalar operand ?
Something like ....
SELECT
empID, lastName,
firstName, (
CASE WHEN deptID =
(
SELECT deptID
FROM departments
WHERE ......
)
THEN 'blah1'
ELSE 'blah2'
END)
FROM emp
3be0b46
to
73ed469
Compare
} | ||
] | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add select foo from (select 1) as foo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i am not sure this is a valid query in most other SQL systems. did you meant
select foo.* from (select 1) as foo
foo
is a table not a column name in this context, and postgres returns a table type which is not supported in Pinot
Thanks for addressing comment. LGTM |
from PostgresQL doc section 7.2.2