Is your feature request related to a problem or challenge?
To measure the performance improvement of #11827, we need some extended queries that combine more complex UDAFs (like median, approx_median) with a high-cardinality group by (#12438).
But I found that such queries can't run to completion on my local machine. After debugging, I found this is due to their large intermediate results, which fill up memory rapidly, leading to swapping or OOM...
However, when I run them on a subset with only 15% of the whole ClickBench dataset, they finish successfully and reflect the improvement #11827 (comment)
I think we may need a ClickBench with a smaller dataset (like tpch 1, tpch 10...) in some situations.
Describe the solution you'd like
Support generating a smaller subset of the whole ClickBench dataset, so that we can run queries on it.
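As a rough illustration of what "generate a smaller dataset" could mean, here is a minimal sketch of seeded Bernoulli row sampling. It is not how the 15% subset mentioned above was produced (the issue doesn't say), and `sample_rows` is a hypothetical helper, not an existing DataFusion or ClickBench tool. It works on any iterable of rows; reading and writing the actual dataset files is left out.

```python
import random
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def sample_rows(rows: Iterable[T], fraction: float, seed: int = 0) -> Iterator[T]:
    """Yield each row independently with probability `fraction`.

    Hypothetical helper for illustration only. The fixed seed makes the
    subset reproducible, so benchmark runs can be compared across commits.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    rng = random.Random(seed)
    for row in rows:
        # random() returns a float in [0.0, 1.0), so fraction=1.0 keeps
        # every row and fraction=0.0 keeps none.
        if rng.random() < fraction:
            yield row


if __name__ == "__main__":
    # Stand-in for the real rows: keep roughly 15% of 100,000 integers.
    subset = list(sample_rows(range(100_000), fraction=0.15))
    print(len(subset))
```

Note that row sampling also lowers the number of distinct groups, so a subset like this makes the high-cardinality queries easier rather than just smaller.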
Describe alternatives you've considered
No response
Additional context
No response
I would like to troll / 🐟 for improvements: instead of making the benchmark easier, let's spend our time reducing the size of the intermediate state for those aggregates :)
🤔 Makes sense, solving the real problem may be more valuable.