[REGRESSION] Parquet row group pruning incorrectly prunes out row groups when column names have "." in them
#5708
Labels
bug
Describe the bug
Parquet row group pruning incorrectly prunes out row groups when column names have "." in them.
To Reproduce
Use this file: spans.zip
Run using datafusion-cli:
Expected behavior
However, if I disable row group pruning, the same query works as expected and returns a single row.
Additional context
@jacobmarble found this in IOx: https://github.com/influxdata/influxdb_iox/issues/7225
And has identified that it was a regression introduced in #5419 (see https://github.com/influxdata/influxdb_iox/issues/7225#issuecomment-1472546654) ❤️
Note this will only generate wrong results because there are columns named "name" and "service.name" in the same file (the pruning logic incorrectly uses the statistics for "name" for the predicate on "service.name").
If there were no column named "name", the predicate would fail to resolve the statistics and they would be ignored.
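The failure mode described above can be modeled with a small, self-contained sketch. This is not DataFusion's code: the `Stats` struct, the `lookup_buggy`, `lookup_exact`, `can_prune`, and `sample_stats` helpers, and the sample min/max values are all hypothetical. The sketch also assumes the mis-resolution behaves as if the text before the last "." were treated as a qualifier, which is consistent with the symptoms here but not confirmed by this report.

```rust
use std::collections::HashMap;

/// Hypothetical per-column row group statistics (min/max of a string column).
#[derive(Debug, Clone, PartialEq)]
struct Stats {
    min: String,
    max: String,
}

/// Assumed buggy behavior: everything before the last '.' is dropped as if it
/// were a qualifier, so "service.name" is looked up as "name".
fn lookup_buggy<'a>(stats: &'a HashMap<String, Stats>, column: &str) -> Option<&'a Stats> {
    let unqualified = column.rsplit('.').next().unwrap_or(column);
    stats.get(unqualified)
}

/// Exact lookup: the full column name is used as it appears in the file schema.
fn lookup_exact<'a>(stats: &'a HashMap<String, Stats>, column: &str) -> Option<&'a Stats> {
    stats.get(column)
}

/// For a predicate `column = value`, a row group can be skipped only when
/// `value` lies outside [min, max]. Missing statistics mean "cannot prune".
fn can_prune(stats: Option<&Stats>, value: &str) -> bool {
    match stats {
        Some(s) => value < s.min.as_str() || value > s.max.as_str(),
        None => false,
    }
}

/// Builds made-up statistics for one row group. The values are illustrative
/// and are not taken from spans.zip.
fn sample_stats(include_name: bool) -> HashMap<String, Stats> {
    let mut stats = HashMap::new();
    stats.insert(
        "service.name".to_string(),
        Stats { min: "web".to_string(), max: "web".to_string() },
    );
    if include_name {
        stats.insert(
            "name".to_string(),
            Stats { min: "alpha".to_string(), max: "beta".to_string() },
        );
    }
    stats
}
```

With both columns present, `can_prune(lookup_buggy(&stats, "service.name"), "web")` evaluates the predicate against the range of "name" and wrongly skips the row group, while `lookup_exact` keeps it. Without a "name" column, the buggy lookup finds no statistics, so nothing is pruned and the result is correct.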