Upcast types during union schema creation. #5212

Closed
mustafasrepo opened this issue Feb 7, 2023 · 0 comments · Fixed by #5342
Labels
bug Something isn't working

mustafasrepo commented Feb 7, 2023

Describe the bug
When I run the query below on PostgreSQL:

SELECT c1, c9 FROM aggregate_test_100 
UNION ALL 
SELECT c1, c3 FROM aggregate_test_100

where c9 has type bigint and c3 has type smallint, it produces a valid result. However, when I run the same query on DataFusion, where c9 has type UInt32 and c3 has type Int8,
it fails with the error ArrowError(CastError("Can't cast value 1491205016 to type Int8")).
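The overflow behind this error can be reproduced with plain Rust integers. This is only a minimal sketch using standard-library conversions, not Arrow's cast kernel; the helper names `narrow_to_i8` and `widen_to_i64` are hypothetical and exist only for illustration.

```rust
/// Hypothetical helper: checked narrowing, analogous to casting a UInt32 value to Int8.
fn narrow_to_i8(value: u32) -> Option<i8> {
    // try_from fails for any value outside -128..=127.
    i8::try_from(value).ok()
}

/// Hypothetical helper: lossless widening, analogous to casting a UInt32 value to Int64.
fn widen_to_i64(value: u32) -> i64 {
    // Every u32 fits in an i64, so this conversion cannot fail.
    i64::from(value)
}

fn main() {
    // The value reported in the error message above.
    let c9_value: u32 = 1_491_205_016;
    println!("as i8:  {:?}", narrow_to_i8(c9_value));
    println!("as i64: {}", widen_to_i64(c9_value));
}
```

The narrowing conversion has no valid result for this value, while the widening one always succeeds.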
The physical plan of the query above in DataFusion is as follows:

"UnionExec",
"  ProjectionExec: expr=[c1@0 as c1, c3@1 as c3]",
"    CsvExec: files={1 group: [[Users/akurmustafa/projects/synnada/arrow-datafusion-tmp/testing/data/csv/aggregate_test_100.csv]]}, has_header=true, limit=None, projection=[c1, c3]",
"  ProjectionExec: expr=[c1@0 as c1, CAST(c9@1 AS Int8) as c3]",
"    CsvExec: files={1 group: [[Users/akurmustafa/projects/synnada/arrow-datafusion-tmp/testing/data/csv/aggregate_test_100.csv]]}, has_header=true, limit=None, projection=[c1, c9]",

DataFusion coerces DataType::UInt32 and DataType::Int8 to DataType::Int8, which cannot hold every UInt32 value. Instead, we could upcast both types to a wider common type, for instance DataType::Int64 in this specific case.
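One possible upcasting rule is sketched below. It is an assumption for illustration, not DataFusion's actual coercion logic: the `IntType` enum is a hypothetical stand-in for the integer variants of `DataType`, and `common_int_type` is a hypothetical function name. The rule picks the narrowest integer type that can represent every value of both inputs, and returns `None` when no 64-bit integer type can.

```rust
/// Hypothetical stand-in for the integer variants of Arrow's DataType.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum IntType { Int8, Int16, Int32, Int64, UInt8, UInt16, UInt32, UInt64 }

impl IntType {
    fn is_signed(self) -> bool {
        matches!(self, IntType::Int8 | IntType::Int16 | IntType::Int32 | IntType::Int64)
    }

    fn bits(self) -> u32 {
        match self {
            IntType::Int8 | IntType::UInt8 => 8,
            IntType::Int16 | IntType::UInt16 => 16,
            IntType::Int32 | IntType::UInt32 => 32,
            IntType::Int64 | IntType::UInt64 => 64,
        }
    }
}

fn signed_with_bits(bits: u32) -> Option<IntType> {
    match bits {
        8 => Some(IntType::Int8),
        16 => Some(IntType::Int16),
        32 => Some(IntType::Int32),
        64 => Some(IntType::Int64),
        _ => None,
    }
}

fn unsigned_with_bits(bits: u32) -> Option<IntType> {
    match bits {
        8 => Some(IntType::UInt8),
        16 => Some(IntType::UInt16),
        32 => Some(IntType::UInt32),
        64 => Some(IntType::UInt64),
        _ => None,
    }
}

/// Narrowest integer type that holds every value of both inputs, if one exists.
fn common_int_type(lhs: IntType, rhs: IntType) -> Option<IntType> {
    match (lhs.is_signed(), rhs.is_signed()) {
        // Same signedness: the wider of the two is enough.
        (true, true) => signed_with_bits(lhs.bits().max(rhs.bits())),
        (false, false) => unsigned_with_bits(lhs.bits().max(rhs.bits())),
        // Mixed signedness: a signed type needs more bits than the unsigned
        // input, so step up to the next width (double the unsigned width).
        _ => {
            let (signed, unsigned) = if lhs.is_signed() { (lhs, rhs) } else { (rhs, lhs) };
            signed_with_bits(signed.bits().max(unsigned.bits() * 2))
        }
    }
}

fn main() {
    // The pair from this issue.
    println!("{:?}", common_int_type(IntType::UInt32, IntType::Int8));
}
```

Under this rule the pair from the query, UInt32 and Int8, resolves to Int64. A pair such as UInt64 and Int8 has no lossless 64-bit integer target, so a real implementation would need a separate policy for that case.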

To Reproduce
Run the query above in DataFusion against the aggregate_test_100 table.

Expected behavior
The query above should run successfully, as it does in PostgreSQL.

mustafasrepo added the bug label Feb 7, 2023
mustafasrepo changed the title from "Do not coerce types during union schema creation." to "Upcast types during union schema creation." Feb 7, 2023