Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimization rule filter_push_down causes FieldNotFound error #4401

Closed
ZuoTiJia opened this issue Nov 28, 2022 · 8 comments · Fixed by #4447
Closed

Optimization rule filter_push_down causes FieldNotFound error #4401

ZuoTiJia opened this issue Nov 28, 2022 · 8 comments · Fixed by #4447
Labels
bug Something isn't working

Comments

@ZuoTiJia
Copy link
Contributor

ZuoTiJia commented Nov 28, 2022

Describe the bug
Optimization rule filter_push_down causes FieldNotFound error

To Reproduce

CREATE TABLE t AS VALUES (TIMESTAMP '2022-11-28T18:00:00');
SELECT date_trunc('day', column1) AS day FROM t GROUP BY day HAVING day IS NOT NULL;
SELECT date_trunc('day', column1) AS day FROM t GROUP BY day HAVING date_trunc('day', column1) IS NOT NULL;

Error displayed

SchemaError(FieldNotFound { field: Column { relation: None, name: "datetrunc(Utf8(\"day\"),t.column1)" }, valid_fields: Some([Column { relation: Some("t"), name: "column1" }]) })

Expected behavior
When I delete the optimization rule filter_push_down, the statement executes normally.

+---------------------+
| day                 |
+---------------------+
| 2022-11-28T00:00:00 |
+---------------------+

Here is the log about filter_push_down. I added some descriptions about the Expr type.

[2022-11-28T12:37:49Z DEBUG datafusion_optimizer::optimizer] eliminate_outer_join:
    Projection: Column(datetrunc(Utf8("day"),t.column1)) AS day
      Filter: Column(datetrunc(Utf8("day"),t.column1)) IS NOT NULL
        Aggregate: groupBy=[[ScalarFunction(datetrunc(Utf8("day"), Column(t.column1)))]], aggr=[[]]
          TableScan: t
    
[2022-11-28T12:37:49Z DEBUG datafusion_optimizer::optimizer] filter_push_down:
    Projection: Column(datetrunc(Utf8("day"),t.column1)) AS day
      Aggregate: groupBy=[[ScalarFunction(datetrunc(Utf8("day"), Column(t.column1)))]], aggr=[[]]
        Filter: Column(datetrunc(Utf8("day"),t.column1)) IS NOT NULL
          TableScan: t
    
@ZuoTiJia ZuoTiJia added the bug Something isn't working label Nov 28, 2022
@alamb
Copy link
Contributor

alamb commented Nov 28, 2022

Thank you for the report @ZuoTiJia

cc @jackwener

I wonder if #4365 will fix this issue

@jackwener
Copy link
Member

Look like the rule is right

after 
"Projection: datetrunc(Utf8("day"),t.column1) AS day
  Filter: datetrunc(Utf8("day"),t.column1) IS NOT NULL
    Aggregate: groupBy=[[datetrunc(Utf8("day"), t.column1)]], aggr=[[]]
      TableScan: t"

before
"Projection: datetrunc(Utf8("day"),t.column1) AS day
  Aggregate: groupBy=[[datetrunc(Utf8("day"), t.column1)]], aggr=[[]]
    Filter: datetrunc(Utf8("day"),t.column1) IS NOT NULL
      TableScan: t projection=[column1]"

I think it's caused by that filter can't eval expression

pg:
-- create
CREATE TABLE t (
  column1 TIMESTAMP PRIMARY KEY
);

-- insert
INSERT INTO t VALUES ('2022-11-28T18:00:00');

-- fetch 
select * from t where date_trunc('day', column1) is not null;

is ok

CREATE TABLE t AS VALUES (TIMESTAMP '2022-11-28T18:00:00');
select * from t where date_trunc('day', column1) is not null;

failed.

@jackwener
Copy link
Member

I think I can make a temporary solution that only allows pushing down when it is a column.

When we support expression eval in filter, we can rollback.

@alamb
Copy link
Contributor

alamb commented Nov 28, 2022

I wonder if there is something wrong with our HAVING clause support

Specifically, it look like

SELECT date_trunc('day', column1) AS day FROM t GROUP BY day HAVING date_trunc('day', column1) IS NOT NULL;

Is treating date_trunc('day', column1) as a column named "date_trunc('day', column1)" which clearly would not be correct 🤔

@mingmwang
Copy link
Contributor

I think it is bug of filter_push_down. The Filter should not become a Column after pushdown.
I think #4365 should fix the issue.

@jackwener Could you please confirm?

@jackwener
Copy link
Member

I think it is bug of filter_push_down. The Filter should not become a Column after pushdown.
I think #4365 should fix the issue.

😂 My mistake, it actually is a bug, push_down_filter push a column in stead of Expr

@jackwener
Copy link
Member

I have fix it, I don't prepare to mix it into #4365.
Mix more thing into a big PR isn't a good thing, I will new PR after merge it.

@alamb
Copy link
Contributor

alamb commented Nov 30, 2022

I have fix it, I don't prepare to mix it into #4365.
Mix more thing into a big PR isn't a good thing, I will new PR after merge it.

I 100% agree -- thank you @jackwener

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
4 participants