colexec: implement vectorized table statistics collection #54803

yuzefovich · 2020-09-25T16:08:41Z

Table statistics are currently collected by a combination of the row-execution sampler and sampleAggregator processors, which were introduced before we began implementing the vectorized engine. I believe stats collection will benefit noticeably from the vectorized approach, with the same benefits as in the other use cases: faster execution and better memory management. In particular, I'm hoping that it will alleviate issues like #54670.

asubiotto · 2020-09-30T14:33:16Z

@RaduBerinde I think the optimizer team needs to prioritize this for 21.1 given that #54670 is an easily reproducible OOM and we've seen a lot of OOMs related to wide rows lately. We're happy to offer guidance here.

RaduBerinde · 2020-09-30T14:59:35Z

@rytaft I thought we had hit OOMs with large rows before and had a workaround in place. Do you remember the details? I think it involved dynamically adjusting the batch size or something along those lines.

rytaft · 2020-09-30T16:29:12Z

Maybe you're thinking of #40850? I didn't end up merging that PR since there were some performance issues identified. Seems like we should just switch to vectorized to fix this.

RaduBerinde · 2020-09-30T16:43:05Z

I see. Let's use the existing #41203 to track this.

yuzefovich added the C-enhancement label (Solution expected to add code/behavior + preserve backward-compat; pg compat issues are exception) Sep 25, 2020

yuzefovich mentioned this issue Sep 25, 2020

sql: a crash when inserting large rows (probably due to stats collection) #54670

Closed

RaduBerinde closed this as completed Sep 30, 2020

RaduBerinde mentioned this issue Sep 30, 2020

sql: implement vectorized operator for table statistics collection #41203

Open
