Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ORDER BY expressions containing aggregation functions are not handled appropriately by the analyzer. #4485

Closed
reltuk opened this issue Sep 8, 2021 · 3 comments
Labels
analyzer bug Something isn't working sql Issue with SQL

Comments

@reltuk
Copy link
Contributor

reltuk commented Sep 8, 2021

The following works because the SUM gets mapped directly to the projected SUM expression (by resolve_columns):

SELECT category, SUM(price) FROM products GROUP BY category ORDER BY SUM(price) ASC

The following does not work, because the SUM(price) (sub)expression currently sticks around in the SortFields of the plan.Sort node, but it cannot evaluate correctly in this context:

SELECT category, SUM(price) FROM products GROUP BY category ORDER BY SUM(price) + 1 ASC

Other things that should generally work but do not:

SELECT category, SUM(price) FROM products GROUP BY category ORDER BY AVG(price) ASC

SELECT category, SUM(price) FROM products GROUP BY category ORDER BY COUNT(*) ASC

SELECT category, SUM(price) FROM products GROUP BY category ORDER BY SUM(price) % 2, SUM(price), AVG(price) ASC

In general, a correct way to handle an aggregation in a sort expression is to push the expression down to the group by node, replace the expression with an appropriately indexed GetField in the sort node itself, and project the expression away in a projection above the Sort node. The logic to do that does not currently exist in the analyzer.

This issue also might apply to window functions, but I have not investigated there yet.

For now, we are going to add a validation step that looks for aggregations outside of GroupBy expressions. If they exist, the query is unsupported and we will return an error.

@zachmu zachmu transferred this issue from dolthub/go-mysql-server Oct 6, 2022
@timsehn timsehn added bug Something isn't working sql Issue with SQL labels Oct 6, 2022
@max-hoffman
Copy link
Contributor

this is still broken as of dolt 0.50.8

@max-hoffman
Copy link
Contributor

repro:

> create table xy (x int primary key, y int);

> select x, sum(y) from xy group by x order by avg(y);
column "AVG(xy.y)" could not be found in any table in scope

> select x, sum(y) from xy group by x order by sum(y)+1;
an aggregation remained in the expression '(SUM(xy.y) + 1)' after analysis, outside of a node capable of evaluating it; this query is currently unsupported.

@timsehn
Copy link
Contributor

timsehn commented Apr 15, 2024

These work now :-)

test_subra/main*> create table xy (x int primary key, y int);
test_subra/main*> select x, sum(y) from xy group by x order by avg(y);
Empty set (0.00 sec)

test_subra/main*> select x, sum(y) from xy group by x order by sum(y)+1;
Empty set (0.00 sec)

test_subra/main*>

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
analyzer bug Something isn't working sql Issue with SQL
Projects
None yet
Development

No branches or pull requests

4 participants