Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug with intervals and logical and/or #3944

Closed
sarahyurick opened this issue Oct 24, 2022 · 7 comments · Fixed by #4569
Closed

Bug with intervals and logical and/or #3944

sarahyurick opened this issue Oct 24, 2022 · 7 comments · Fixed by #4569
Labels
bug Something isn't working
Milestone

Comments

@sarahyurick
Copy link
Contributor

sarahyurick commented Oct 24, 2022

Describe the bug
While reviewing #3408, I found an additional error when trying to evaluate select i_item_desc from test where d3_date > d1_date + INTERVAL '5 days' and d2_date > d1_date + INTERVAL '3 days':

Error: NotImplemented("Unsupported interval argument. Expected string literal, got: BinaryOp { left: Value(SingleQuotedString(\"5 days\")), op: And, right: BinaryOp { left: Identifier(Ident { value: \"d2_date\", quote_style: None }), op: Gt, right: BinaryOp { left: Identifier(Ident { value: \"d1_date\", quote_style: None }), op: Plus, right: Interval { value: Value(SingleQuotedString(\"3 days\")), leading_field: None, leading_precision: None, last_field: None, fractional_seconds_precision: None } } } }")

To Reproduce
Here is the full code which should reproduce the error:

use datafusion::prelude::*;

#[tokio::main]
async fn main() -> datafusion::error::Result<()> {
  let ctx = SessionContext::new();

  let schema = datafusion::arrow::datatypes::Schema::new(vec![
    datafusion::arrow::datatypes::Field::new("i_item_desc", datafusion::arrow::datatypes::DataType::Utf8, true),
    datafusion::arrow::datatypes::Field::new("d3_date", datafusion::arrow::datatypes::DataType::Date64, true),
    datafusion::arrow::datatypes::Field::new("d1_date", datafusion::arrow::datatypes::DataType::Date64, true),
    datafusion::arrow::datatypes::Field::new("d2_date", datafusion::arrow::datatypes::DataType::Date64, true),
  ]);

  ctx.register_csv(
    "test", "data.csv", CsvReadOptions::default().schema(&schema)
  ).await?;

  let df = ctx.sql("select i_item_desc \
        from test \
        where d3_date > d1_date + INTERVAL '5 days' \
        and d2_date > d1_date + INTERVAL '3 days'").await?;

  df.show().await?;
  Ok(())
}

Expected behavior
Return the expected output with any errors.

Additional context
In my example, data.csv contains:

i_item_desc,d3_date,d1_date,d2_date
a,2022-12-12T7:7:7,2022-12-12T7:7:7,2022-12-12T7:7:7
b,2022-12-12T7:7:7,2022-12-11T7:7:7,2022-12-12T7:7:7
c,2022-12-12T7:7:7,2022-12-10T7:7:7,2022-12-12T7:7:7
d,2022-12-12T7:7:7,2022-12-9T7:7:7,2022-12-12T7:7:7
e,2022-12-12T7:7:7,2022-12-8T7:7:7,2022-12-12T7:7:7
f,2022-12-12T7:7:7,2022-12-7T7:7:7,2022-12-12T7:7:7
g,2022-12-12T7:7:7,2022-12-6T7:7:7,2022-12-12T7:7:7
h,2022-12-12T7:7:7,2022-12-5T7:7:7,2022-12-12T7:7:7
@sarahyurick sarahyurick added the bug Something isn't working label Oct 24, 2022
@andygrove andygrove added this to the 14.0.0 milestone Oct 30, 2022
@HaoYang670
Copy link
Contributor

I am not sure whether this is a bug in sql-parser, cc @andygrove . This is what I get

select interval '5 days' > interval '3 days';
NotImplemented("Unsupported interval argument. Expected string literal, got: BinaryOp { left: Value(SingleQuotedString(\"5 days\")), op: Gt, right: Interval { value: Value(SingleQuotedString(\"3 days\")), leading_field: None, leading_precision: None, last_field: None, fractional_seconds_precision: None } }")
❯ select (interval '5 days') > interval '3 days';
+-----------------------------------------------------------------+
| IntervalDayTime("21474836480") > IntervalDayTime("12884901888") |
+-----------------------------------------------------------------+
| true                                                            |
+-----------------------------------------------------------------+
1 row in set. Query took 0.006 seconds.

@Dandandan
Copy link
Contributor

Looks very similar to

#3408

@sarahyurick
Copy link
Contributor Author

sarahyurick commented Nov 28, 2022

apache/datafusion-sqlparser-rs#705 has been merged, but now it seems like the second condition of something like where d3_date > d1_date + INTERVAL '5 days' and d2_date > d1_date + INTERVAL '3 days' isn't being checked by DataFusion. Currently looking into this.

@sarahyurick
Copy link
Contributor Author

sqlparser-rs/sqlparser-rs#705 has been merged, but now it seems like the second condition of something like where d3_date > d1_date + INTERVAL '5 days' and d2_date > d1_date + INTERVAL '3 days' isn't being checked by DataFusion. Currently looking into this.

This turned out to just be an error with the data file I was working with. Everything seems to work as expected now.

@alamb
Copy link
Contributor

alamb commented Dec 9, 2022

I actually think this issue is not yet actually "fixed" in datafusion as we haven't updated the sqlparser version to 0.28 which includes the change. I will do so later today so reopening this issue

@alamb
Copy link
Contributor

alamb commented Dec 9, 2022

#4568

@alamb
Copy link
Contributor

alamb commented Dec 12, 2022

This is fixed -- I have a PR open to add some regression test coverage: #4569

@alamb alamb closed this as completed Dec 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants