[native_datafusion] Add support for reading arrays #1322

andygrove · 2025-01-22T19:52:51Z

What is the problem the feature request solves?

In org.apache.spark.sql.comet.CometNativeScanExec#isAdditionallySupported we currently return false for Array types, so we fall back to Spark's scan whenever the Parquet file contains arrays.

I tried modifying this method to return true for Arrays as long as the element type is supported and saw this error:

Cannot cast file schema field c13 of type List(Field { name: "element", data_type: Boolean, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }) to required schema field of type List(Field { name: "item", data_type: Boolean, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} })

For readability, the from and to types are:

from: List(Field { name: "element", data_type: Boolean, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} })
  to: List(Field { name: "item", data_type: Boolean, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} })

The list element's field name differs ("element" in the file schema, "item" in the required schema) but the data type is the same, so the cast should be supported (and be a no-op).
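To illustrate the comparison being asked for, here is a minimal sketch of a type check that ignores the name of the list element field. The `DataType` and `Field` types below are simplified, hypothetical stand-ins written for this example, not the arrow-rs definitions, and `equal_ignoring_field_names` is a hypothetical helper, not an existing Comet or DataFusion function. Requiring matching nullability is also an assumption; the real cast logic may be more lenient.

```rust
// Simplified stand-ins for Arrow's `DataType` and `Field`. These are
// hypothetical types for illustration only, not the arrow-rs definitions.
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Boolean,
    Int32,
    Utf8,
    List(Box<Field>),
}

#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub data_type: DataType,
    pub nullable: bool,
}

impl Field {
    pub fn new(name: &str, data_type: DataType, nullable: bool) -> Self {
        Field {
            name: name.to_string(),
            data_type,
            nullable,
        }
    }
}

/// Returns true when two types have the same shape, ignoring the names of
/// list element fields (for example "element" vs. "item").
/// Assumption: nullability of the element field must still match.
pub fn equal_ignoring_field_names(from: &DataType, to: &DataType) -> bool {
    match (from, to) {
        // For lists, skip the field name and recurse into the element type.
        (DataType::List(a), DataType::List(b)) => {
            a.nullable == b.nullable
                && equal_ignoring_field_names(&a.data_type, &b.data_type)
        }
        // For every other combination, fall back to strict equality.
        _ => from == to,
    }
}
```

With these definitions, a list whose element field is named "element" and a list whose element field is named "item" compare as unequal under the derived `PartialEq`, but `equal_ignoring_field_names` treats them as the same type when both hold Boolean elements. A cast between two such types could then be short-circuited as a no-op.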

Describe the potential solution

No response

Additional context

No response

andygrove added the enhancement New feature or request label Jan 22, 2025

andygrove self-assigned this Jan 22, 2025

andygrove linked a pull request Jan 22, 2025 that will close this issue

chore: Improve array_contains support and add array support to native_datafusion scan #1324

Draft
