Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

datafusion-benchmarks --bin tpch -- ci::serde_q7 fails #5639

Closed
r4ntix opened this issue Mar 19, 2023 · 2 comments
Closed

datafusion-benchmarks --bin tpch -- ci::serde_q7 fails #5639

r4ntix opened this issue Mar 19, 2023 · 2 comments
Labels
bug Something isn't working

Comments

@r4ntix
Copy link
Contributor

r4ntix commented Mar 19, 2023

Describe the bug

run cargo test --package datafusion-benchmarks --bin tpch -- ci::serde_q7 --exact --nocapture failed:

running 1 test

thread 'ci::serde_q7' has overflowed its stack
fatal runtime error: stack overflow
error: test failed, to rerun pass `-p datafusion-benchmarks --bin tpch`

This problem was found in arrow-ballista's benchmarks test: apache/datafusion-ballista#711

To Reproduce

👆

Expected behavior

Additional context

@r4ntix r4ntix added the bug Something isn't working label Mar 19, 2023
@r4ntix
Copy link
Contributor Author

r4ntix commented Mar 19, 2023

the q7 sql of TPC-H:

select
    supp_nation,
    cust_nation,
    l_year,
    sum(volume) as revenue
from
    (
        select
            n1.n_name as supp_nation,
            n2.n_name as cust_nation,
            extract(year from l_shipdate) as l_year,
            l_extendedprice * (1 - l_discount) as volume
        from
            supplier,
            lineitem,
            orders,
            customer,
            nation n1,
            nation n2
        where
                s_suppkey = l_suppkey
          and o_orderkey = l_orderkey
          and c_custkey = o_custkey
          and s_nationkey = n1.n_nationkey
          and c_nationkey = n2.n_nationkey
          and (
                (n1.n_name = 'FRANCE' and n2.n_name = 'GERMANY')
                or (n1.n_name = 'GERMANY' and n2.n_name = 'FRANCE')
            )
          and l_shipdate between date '1995-01-01' and date '1996-12-31'
    ) as shipping
group by
    supp_nation,
    cust_nation,
    l_year
order by
    supp_nation,
    cust_nation,
    l_year;

unoptimized plan:

Sort: shipping.supp_nation ASC NULLS LAST, shipping.cust_nation ASC NULLS LAST, shipping.l_year ASC NULLS LAST
  Projection: shipping.supp_nation, shipping.cust_nation, shipping.l_year, SUM(shipping.volume) AS revenue
    Aggregate: groupBy=[[shipping.supp_nation, shipping.cust_nation, shipping.l_year]], aggr=[[SUM(shipping.volume)]]
      SubqueryAlias: shipping
        Projection: n1.n_name AS supp_nation, n2.n_name AS cust_nation, datepart(Utf8("YEAR"), lineitem.l_shipdate) AS l_year, lineitem.l_extendedprice * (Int64(1) - lineitem.l_discount) AS volume
          Filter: supplier.s_suppkey = lineitem.l_suppkey AND orders.o_orderkey = lineitem.l_orderkey AND customer.c_custkey = orders.o_custkey AND supplier.s_nationkey = n1.n_nationkey AND customer.c_nationkey = n2.n_nationkey AND (n1.n_name = Utf8("FRANCE") AND n2.n_name = Utf8("GERMANY") OR n1.n_name = Utf8("GERMANY") AND n2.n_name = Utf8("FRANCE")) AND lineitem.l_shipdate BETWEEN CAST(Utf8("1995-01-01") AS Date32) AND CAST(Utf8("1996-12-31") AS Date32)
            CrossJoin:
              CrossJoin:
                CrossJoin:
                  CrossJoin:
                    CrossJoin:
                      TableScan: supplier
                      TableScan: lineitem
                    TableScan: orders
                  TableScan: customer
                SubqueryAlias: n1
                  TableScan: nation
              SubqueryAlias: n2
                TableScan: nation

optimized plan:

Sort: shipping.supp_nation ASC NULLS LAST, shipping.cust_nation ASC NULLS LAST, shipping.l_year ASC NULLS LAST
  Projection: shipping.supp_nation, shipping.cust_nation, shipping.l_year, SUM(shipping.volume) AS revenue
    Aggregate: groupBy=[[shipping.supp_nation, shipping.cust_nation, shipping.l_year]], aggr=[[SUM(shipping.volume)]]
      SubqueryAlias: shipping
        Projection: n1.n_name AS supp_nation, n2.n_name AS cust_nation, datepart(Utf8("YEAR"), lineitem.l_shipdate) AS l_year, CAST(lineitem.l_extendedprice AS Decimal128(38, 4)) * CAST(Decimal128(Some(100),23,2) - CAST(lineitem.l_discount AS Decimal128(23, 2)) AS Decimal128(38, 4)) AS volume
          Filter: n1.n_name = Utf8("FRANCE") AND n2.n_name = Utf8("GERMANY") OR n1.n_name = Utf8("GERMANY") AND n2.n_name = Utf8("FRANCE")
            Projection: lineitem.l_extendedprice, lineitem.l_discount, lineitem.l_shipdate, n1.n_name, n2.n_name
              Inner Join: customer.c_nationkey = n2.n_nationkey
                Projection: lineitem.l_extendedprice, lineitem.l_discount, lineitem.l_shipdate, customer.c_nationkey, n1.n_name
                  Inner Join: supplier.s_nationkey = n1.n_nationkey
                    Projection: supplier.s_nationkey, lineitem.l_extendedprice, lineitem.l_discount, lineitem.l_shipdate, customer.c_nationkey
                      Inner Join: orders.o_custkey = customer.c_custkey
                        Projection: supplier.s_nationkey, lineitem.l_extendedprice, lineitem.l_discount, lineitem.l_shipdate, orders.o_custkey
                          Inner Join: lineitem.l_orderkey = orders.o_orderkey
                            Projection: supplier.s_nationkey, lineitem.l_orderkey, lineitem.l_extendedprice, lineitem.l_discount, lineitem.l_shipdate
                              Inner Join: supplier.s_suppkey = lineitem.l_suppkey
                                TableScan: supplier projection=[s_suppkey, s_nationkey]
                                Filter: lineitem.l_shipdate >= Date32("9131") AND lineitem.l_shipdate <= Date32("9861")
                                  TableScan: lineitem projection=[l_orderkey, l_suppkey, l_extendedprice, l_discount, l_shipdate], partial_filters=[lineitem.l_shipdate >= Date32("9131"), lineitem.l_shipdate <= Date32("9861")]
                            TableScan: orders projection=[o_orderkey, o_custkey]
                        TableScan: customer projection=[c_custkey, c_nationkey]
                    SubqueryAlias: n1
                      Filter: nation.n_name = Utf8("FRANCE") OR nation.n_name = Utf8("GERMANY")
                        TableScan: nation projection=[n_nationkey, n_name], partial_filters=[nation.n_name = Utf8("FRANCE") OR nation.n_name = Utf8("GERMANY")]
                SubqueryAlias: n2
                  Filter: nation.n_name = Utf8("GERMANY") OR nation.n_name = Utf8("FRANCE")
                    TableScan: nation projection=[n_nationkey, n_name], partial_filters=[nation.n_name = Utf8("GERMANY") OR nation.n_name = Utf8("FRANCE")]

@r4ntix
Copy link
Contributor Author

r4ntix commented Mar 31, 2023

This issue has been solved for Ballista: apache/datafusion-ballista#711

@r4ntix r4ntix closed this as completed Mar 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant