Commit
colexec: optimize distinct hash aggregation
This commit optimizes the handling of DISTINCT clauses by the hash aggregator. Previously, we would encode the combination of grouping and aggregation columns for each tuple to check whether it had already been seen. Now the grouping columns are encoded once for the whole bucket, and that encoded value is reused for every tuple in the bucket when performing the check.

Additionally, this commit splits the custom hash aggregator helper into two implementations: one that handles only DISTINCT clauses with no FILTERs, and another that handles any combination of DISTINCT and FILTER clauses. The former can be optimized by converting the aggregation columns of all aggregate functions at once, before the new batch is aggregated. For the latter, however, we need to perform the conversion to datums after the filter has been applied.

Release note: None
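To illustrate the first optimization, here is a minimal sketch of encoding the grouping columns once per bucket and reusing that prefix for every tuple. It is not the colexec code: `encodeInts` and `distinctInBucket` are hypothetical names, columns are assumed to be plain `int64` values, and a fixed-width big-endian encoding stands in for the real datum encoding.

```go
package main

import "encoding/binary"

// encodeInts appends a fixed-width big-endian encoding of each value to buf.
// This stands in for the real key encoding, which is an assumption here.
func encodeInts(buf []byte, vals ...int64) []byte {
	var tmp [8]byte
	for _, v := range vals {
		binary.BigEndian.PutUint64(tmp[:], uint64(v))
		buf = append(buf, tmp[:]...)
	}
	return buf
}

// distinctInBucket reports, for each aggregation value in a bucket, whether
// the (grouping key, value) combination has not been seen before. All tuples
// in a bucket share the same grouping key, so the key is encoded only once.
func distinctInBucket(groupKey, aggVals []int64, seen map[string]struct{}) []bool {
	// Encode the grouping columns once for the whole bucket.
	scratch := encodeInts(nil, groupKey...)
	prefixLen := len(scratch)

	isNew := make([]bool, len(aggVals))
	for i, v := range aggVals {
		// Truncate back to the shared prefix and append only this tuple's
		// aggregation value, overwriting the previous tuple's suffix.
		scratch = encodeInts(scratch[:prefixLen], v)
		if _, ok := seen[string(scratch)]; !ok {
			seen[string(scratch)] = struct{}{}
			isNew[i] = true
		}
	}
	return isNew
}
```

With a grouping key of (1, 2) and aggregation values 7, 7, 8, 7, the sketch marks the first and third tuples as distinct and the other two as duplicates. The per-tuple cost is the encoding of the aggregation value alone, rather than the grouping columns plus the aggregation value.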