-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Address performance/execution plan of TPCH query 9 #77
Comments
So I took a look at this and I have two solutions one which I believe always finds all possible inner joins but runs in N^2 and one that works for this case and should work for most others that runs in N LogN where N is the number of plans in the select statement. Is the N^2 runtime and the added petgraph dependency alright? I don't want datafusion to show up on Accidentally quadratic because of me haha. I've attached a brief description of each method incase there are any obvious improvements that can be made. Thorough version
Less thorough version
The less thorough version was able to find the inner join that was missed for query 9 but there is still the possibility to miss inner joins. For example if the tables have the join relations as seen below, the order of the plans after sorting would be (B,F,D,A,C,E,G). Which means that the plan would generate a cross join between B and F as they share no join key pairs, missing the inner join through D.
|
I believe this is now resolved so will close this. Feel free to reopen if I am mistaken. Here is the logical plan for q9 in master:
|
Describe the bug
TPC-H Query 9 consumes a lot of memory and takes ~8s in memory on a 8 core machine on SF=1 in memory (with 100%cpu usage across cores) to execute. That likely has to do with the plan misses a equi-join and turns it into a cross join.
To Reproduce
Run query 9, check plan and execution time.
Expected behavior
The query is expected to finish faster (<1s) and shouldn't need a cross join.
Additional context
N/A
The text was updated successfully, but these errors were encountered: