Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Broken logical plan serialization for aggregation queries #3555

Closed
thinkharderdev opened this issue Sep 21, 2022 · 2 comments · Fixed by #3574
Closed

Broken logical plan serialization for aggregation queries #3555

thinkharderdev opened this issue Sep 21, 2022 · 2 comments · Fixed by #3574
Labels
bug Something isn't working

Comments

@thinkharderdev
Copy link
Contributor

Describe the bug
A clear and concise description of what the bug is.

Some aggregation plans will fail to deserialize. This is caused by a rewrite done by the TypeCoercion optimizer.

To Reproduce
Steps to reproduce the behavior:

    #[tokio::test]
    async fn roundtrip_logical_plan_aggregation() -> Result<(), DataFusionError> {
        let ctx = SessionContext::new();

        let schema = Schema::new(vec![
            Field::new("a", DataType::Int64, true),
            Field::new("b", DataType::Decimal128(15,2), true)
        ]);

        ctx.register_csv("t1", "testdata/test.csv", CsvReadOptions::default().schema(&schema)).await?;

        let query = "SELECT a, SUM(b + 1) FROM t1 GROUP BY a";
        let plan = ctx.sql(query).await?.to_logical_plan()?;

        let bytes =
            logical_plan_to_bytes(&plan)?;
        let logical_round_trip =
            logical_plan_from_bytes(&bytes, &ctx)?;
        assert_eq!(
            format!("{:?}", plan),
            format!("{:?}", logical_round_trip)
        );
        Ok(())

This fails with

Error: SchemaError(FieldNotFound { qualifier: None, name: "SUM(t1.b + Int64(1))", valid_fields: Some(["t1.a", "SUM(t1.b + Decimal128(Some(100),23,2))", "t1.a", "t1.b"]) })

The problem is there is a projection after this which projects the original expression name. Typically this doesn't matter since the schema is unchanged and everything still aligns. But when we deserialize we have to build the plan again and it will fail to validate.

Expected behavior
A clear and concise description of what you expected to happen.

Additional context
Add any other context about the problem here.

@thinkharderdev thinkharderdev added the bug Something isn't working label Sep 21, 2022
@thinkharderdev
Copy link
Contributor Author

@andygrove

@alamb
Copy link
Contributor

alamb commented Sep 21, 2022

cc @liukun4515

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants