Reorder filters by selectivity #2021

Open: wants to merge 8 commits into base branch develop

Contributor

@gatesn gatesn commented Jan 20, 2025

Reordering would have more effect with selection masks. For now, it only helps when selectivity is low enough to trigger a filter before evaluation.
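
As a rough illustration of the idea (not the code in this PR), the sketch below orders the conjuncts of an AND filter by estimated selectivity and compacts the batch once the surviving fraction drops below a threshold. The `Predicate` type, the `estimated_selectivity` field, the `evaluate_conjunction` helper, and the `0.2` threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Predicate:
    """Hypothetical conjunct: a row-level test plus an estimated selectivity."""
    name: str
    test: Callable[[dict], bool]
    estimated_selectivity: float  # estimated fraction of rows that pass (0.0 to 1.0)


def reorder_by_selectivity(predicates: Sequence[Predicate]) -> List[Predicate]:
    """Put the predicate expected to keep the fewest rows first (stable for ties)."""
    return sorted(predicates, key=lambda p: p.estimated_selectivity)


def evaluate_conjunction(
    rows: Sequence[dict],
    predicates: Sequence[Predicate],
    filter_threshold: float = 0.2,  # assumed value, for illustration only
) -> Tuple[List[dict], int]:
    """Evaluate an AND of predicates; return (surviving rows, row tests performed)."""
    current = list(rows)
    mask = [True] * len(current)
    evaluations = 0
    for pred in predicates:
        # Without selection masks, each predicate runs over every row in the
        # current batch, including rows an earlier predicate already rejected.
        results = [pred.test(row) for row in current]
        evaluations += len(current)
        mask = [m and r for m, r in zip(mask, results)]
        # Only when few enough rows survive do we physically filter the batch,
        # which shrinks the input seen by the remaining predicates.
        if current and sum(mask) / len(current) < filter_threshold:
            current = [row for row, m in zip(current, mask) if m]
            mask = [True] * len(current)
    return [row for row, m in zip(current, mask) if m], evaluations


rows = [{"a": i, "b": i % 2} for i in range(100)]
wide = Predicate("b == 0", lambda r: r["b"] == 0, 0.5)
narrow = Predicate("a < 10", lambda r: r["a"] < 10, 0.1)

as_written = evaluate_conjunction(rows, [wide, narrow])
reordered = evaluate_conjunction(rows, reorder_by_selectivity([wide, narrow]))
```

In this toy input both orders return the same rows, but the reordered conjunction performs 110 row tests instead of 200, because the first predicate is selective enough to trigger the compaction. If no predicate falls below the threshold, the order does not change the count, which is the limitation described above.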

@gatesn gatesn added the benchmark Run benchmarks on this branch label Jan 20, 2025
@github-actions github-actions bot removed the benchmark Run benchmarks on this branch label Jan 20, 2025
Contributor

github-actions bot commented Jan 20, 2025

Benchmarks: TPC-H

Table of Results
name PR 522dad3 base 1d36132 ratio (PR/base) unit
tpch_q01/arrow 552021442 5.49082e+08 1.00535 ns
tpch_q01/parquet 760076296 7.72487e+08 0.983934 ns
tpch_q01/vortex-file-compressed 521730971 5.3619e+08 0.973034 ns
tpch_q02/arrow 142855487 1.42769e+08 1.00061 ns
tpch_q02/parquet 173013075 1.77336e+08 0.975623 ns
tpch_q02/vortex-file-compressed 145618826 1.50431e+08 0.968014 ns
tpch_q03/arrow 173410415 1.75393e+08 0.988694 ns
tpch_q03/parquet 370990103 3.76328e+08 0.985816 ns
tpch_q03/vortex-file-compressed 236322367 2.36388e+08 0.999721 ns
tpch_q04/arrow 177725578 1.76527e+08 1.00679 ns
tpch_q04/parquet 220166544 2.24674e+08 0.979938 ns
tpch_q04/vortex-file-compressed 175437891 1.75805e+08 0.997913 ns
tpch_q05/arrow 325558756 3.2567e+08 0.999657 ns
tpch_q05/parquet 504624940 5.05141e+08 0.998979 ns
tpch_q05/vortex-file-compressed 379551843 3.78238e+08 1.00347 ns
tpch_q06/arrow 25690362 2.59607e+07 0.989587 ns
tpch_q06/parquet 151334797 1.50801e+08 1.00354 ns
tpch_q06/vortex-file-compressed 71190232 6.52746e+07 1.09063 ns
tpch_q07/arrow 623868485 6.31055e+08 0.988611 ns
tpch_q07/parquet 768793181 7.70841e+08 0.997344 ns
tpch_q07/vortex-file-compressed 646228521 6.50893e+08 0.992833 ns
tpch_q08/arrow 269707954 2.73412e+08 0.986453 ns
tpch_q08/parquet 551298277 5.57444e+08 0.988974 ns
tpch_q08/vortex-file-compressed 359162877 3.61058e+08 0.994753 ns
tpch_q09/arrow 488088470 4.82854e+08 1.01084 ns
tpch_q09/parquet 786040941 7.98464e+08 0.984441 ns
tpch_q09/vortex-file-compressed 602616435 6.0579e+08 0.994761 ns
tpch_q10/arrow 265180735 2.60901e+08 1.0164 ns
tpch_q10/parquet 510160423 5.09425e+08 1.00144 ns
tpch_q10/vortex-file-compressed 281571898 2.83051e+08 0.994773 ns
tpch_q11/arrow 136725079 1.38797e+08 0.985073 ns
tpch_q11/parquet 147990537 1.47443e+08 1.00371 ns
tpch_q11/vortex-file-compressed 136374515 1.34036e+08 1.01745 ns
tpch_q12/arrow 184759679 1.82964e+08 1.00982 ns
tpch_q12/parquet 326727056 3.25584e+08 1.00351 ns
tpch_q12/vortex-file-compressed 252645786 2.60808e+08 0.968702 ns
tpch_q13/arrow 166385743 1.68035e+08 0.990183 ns
tpch_q13/parquet 309432622 3.1406e+08 0.985265 ns
tpch_q13/vortex-file-compressed 182780656 1.79007e+08 1.02108 ns
tpch_q14/arrow 38825078 3.75003e+07 1.03533 ns
tpch_q14/parquet 237491549 2.36181e+08 1.00555 ns
tpch_q14/vortex-file-compressed 80999731 7.9851e+07 1.01439 ns
tpch_q15/arrow 69869511 6.82557e+07 1.02364 ns
tpch_q15/parquet 323009420 3.27624e+08 0.985914 ns
tpch_q15/vortex-file-compressed 139155110 1.34481e+08 1.03476 ns
tpch_q16/arrow 99445463 9.88606e+07 1.00592 ns
tpch_q16/parquet 113253693 1.11792e+08 1.01307 ns
tpch_q16/vortex-file-compressed 103926715 1.03176e+08 1.00728 ns
tpch_q17/arrow 627229307 6.20673e+08 1.01056 ns
tpch_q17/parquet 666843247 7.0725e+08 0.942868 ns
tpch_q17/vortex-file-compressed 608305121 6.19487e+08 0.98195 ns
tpch_q18/arrow 1113229393 1.13064e+09 0.984605 ns
tpch_q18/parquet 1345343730 1.33451e+09 1.00812 ns
tpch_q18/vortex-file-compressed 1156332273 1.17145e+09 0.987097 ns
tpch_q19/arrow 152815042 1.50214e+08 1.01731 ns
tpch_q19/parquet 414270625 4.18875e+08 0.989008 ns
tpch_q19/vortex-file-compressed 265546579 1.50768e+08 1.76129 ns
tpch_q20/arrow 176514713 1.80828e+08 0.976145 ns
tpch_q20/parquet 315552811 3.19009e+08 0.989166 ns
tpch_q20/vortex-file-compressed 214198294 2.16685e+08 0.988525 ns
tpch_q21/arrow 989586540 9.96145e+08 0.993417 ns
tpch_q21/parquet 1090573361 1.12573e+09 0.968771 ns
tpch_q21/vortex-file-compressed 991606414 9.90732e+08 1.00088 ns
tpch_q22/arrow 78924879 7.7492e+07 1.01849 ns
tpch_q22/parquet 111475425 1.08957e+08 1.02311 ns
tpch_q22/vortex-file-compressed 84803854 8.50379e+07 0.997248 ns

Contributor

github-actions bot commented Jan 20, 2025

Benchmarks: Clickbench

Table of Results
name PR 522dad3 base 1d36132 ratio (PR/base) unit
clickbench_q00/parquet 1912022 1.84216e+06 1.03792 ns
clickbench_q01/parquet 60978045 6.10587e+07 0.99868 ns
clickbench_q02/parquet 118816758 1.17718e+08 1.00934 ns
clickbench_q03/parquet 84519169 8.19706e+07 1.03109 ns
clickbench_q04/parquet 657055433 6.64052e+08 0.989463 ns
clickbench_q05/parquet 831610591 8.30202e+08 1.0017 ns
clickbench_q06/parquet 1936931 1.94516e+06 0.995772 ns
clickbench_q07/parquet 64571778 6.26823e+07 1.03014 ns
clickbench_q08/parquet 740705150 7.45145e+08 0.994042 ns
clickbench_q09/parquet 1047912594 1.04093e+09 1.00671 ns
clickbench_q10/parquet 258793040 2.53184e+08 1.02215 ns
clickbench_q11/parquet 304203559 3.05789e+08 0.994816 ns
clickbench_q12/parquet 846633136 8.15683e+08 1.03794 ns
clickbench_q13/parquet 1069566172 1.06036e+09 1.00868 ns
clickbench_q14/parquet 833977152 8.40992e+08 0.991659 ns
clickbench_q15/parquet 770181431 7.73472e+08 0.995745 ns
clickbench_q16/parquet 1664399395 1.65755e+09 1.00413 ns
clickbench_q17/parquet 1451371064 1.43566e+09 1.01094 ns
clickbench_q18/parquet 3012062144 3.00494e+09 1.00237 ns
clickbench_q19/parquet 65914564 6.41428e+07 1.02762 ns
clickbench_q20/parquet 1196538859 1.19311e+09 1.00287 ns
clickbench_q21/parquet 1455288527 1.42357e+09 1.02228 ns
clickbench_q22/parquet 2441351204 2.44051e+09 1.00034 ns
clickbench_q23/parquet 8249033336 8.32308e+09 0.991104 ns
clickbench_q24/parquet 532386050 5.30821e+08 1.00295 ns
clickbench_q25/parquet 512752946 5.12806e+08 0.999897 ns
clickbench_q26/parquet 584810362 5.90394e+08 0.990543 ns
clickbench_q27/parquet 1596071229 1.61425e+09 0.988738 ns
clickbench_q28/parquet 11535402568 1.15588e+10 0.997972 ns
clickbench_q29/parquet 419045441 4.37618e+08 0.957559 ns
clickbench_q30/parquet 769127040 7.82011e+08 0.983524 ns
clickbench_q31/parquet 801690605 8.33563e+08 0.961764 ns
clickbench_q32/parquet 2729585730 2.8165e+09 0.969141 ns
clickbench_q33/parquet 2823201931 2.88288e+09 0.979301 ns
clickbench_q34/parquet 2801196003 2.81636e+09 0.994615 ns
clickbench_q35/parquet 857220473 8.61993e+08 0.994464 ns
clickbench_q36/parquet 182940205 1.75881e+08 1.04014 ns
clickbench_q37/parquet 88113580 8.66685e+07 1.01667 ns
clickbench_q38/parquet 114207269 1.14515e+08 0.997313 ns
clickbench_q39/parquet 324560353 3.23325e+08 1.00382 ns
clickbench_q40/parquet 50340133 5.106e+07 0.985901 ns
clickbench_q41/parquet 49522499 4.98389e+07 0.993651 ns
clickbench_q42/parquet 67667251 6.78308e+07 0.997588 ns
clickbench_q00/vortex-file-compressed 1994621 2.04523e+06 0.975257 ns
clickbench_q01/vortex-file-compressed 29223213 2.77991e+07 1.05123 ns
clickbench_q02/vortex-file-compressed 90011918 8.96519e+07 1.00402 ns
clickbench_q03/vortex-file-compressed 78403909 8.04188e+07 0.974945 ns
clickbench_q04/vortex-file-compressed 617063356 6.3389e+08 0.973455 ns
clickbench_q05/vortex-file-compressed 647916990 6.45317e+08 1.00403 ns
clickbench_q06/vortex-file-compressed 2114696 2.11042e+06 1.00203 ns
clickbench_q07/vortex-file-compressed 58778017 5.80498e+07 1.01254 ns
clickbench_q08/vortex-file-compressed 757557064 7.59196e+08 0.997841 ns
clickbench_q09/vortex-file-compressed 952185565 9.59783e+08 0.992084 ns
clickbench_q10/vortex-file-compressed 288348863 2.54635e+08 1.1324 ns
clickbench_q11/vortex-file-compressed 354726994 3.09823e+08 1.14494 ns
clickbench_q12/vortex-file-compressed 582086020 5.90131e+08 0.986367 ns
clickbench_q13/vortex-file-compressed 894569393 9.07128e+08 0.986156 ns
clickbench_q14/vortex-file-compressed 599352128 6.00085e+08 0.998779 ns
clickbench_q15/vortex-file-compressed 765409373 7.40349e+08 1.03385 ns
clickbench_q16/vortex-file-compressed 1408524533 1.40387e+09 1.00331 ns
clickbench_q17/vortex-file-compressed 1350401233 1.30415e+09 1.03547 ns
clickbench_q18/vortex-file-compressed 2888211113 2.93385e+09 0.984442 ns
clickbench_q19/vortex-file-compressed 44714826 4.3393e+07 1.03046 ns
clickbench_q20/vortex-file-compressed 492177798 5.0538e+08 0.973877 ns
clickbench_q21/vortex-file-compressed 907260442 7.71493e+08 1.17598 ns
clickbench_q22/vortex-file-compressed 2137212131 1.9305e+09 1.10708 ns
clickbench_q23/vortex-file-compressed 3970725353 4.00298e+09 0.991943 ns
clickbench_q24/vortex-file-compressed 363689668 3.59923e+08 1.01047 ns
clickbench_q25/vortex-file-compressed 316600252 3.22661e+08 0.981215 ns
clickbench_q26/vortex-file-compressed 417355985 4.18232e+08 0.997906 ns
clickbench_q27/vortex-file-compressed 1373286831 1.40692e+09 0.976093 ns
clickbench_q28/vortex-file-compressed 10617915647 1.07256e+10 0.989958 ns
clickbench_q29/vortex-file-compressed 724449835 6.78528e+08 1.06768 ns
clickbench_q30/vortex-file-compressed 589285765 5.9261e+08 0.99439 ns
clickbench_q31/vortex-file-compressed 621155309 6.20059e+08 1.00177 ns
clickbench_q32/vortex-file-compressed 2780437469 2.79847e+09 0.993556 ns
clickbench_q33/vortex-file-compressed 2186921570 2.22569e+09 0.982581 ns
clickbench_q34/vortex-file-compressed 2186183387 2.21627e+09 0.986423 ns
clickbench_q35/vortex-file-compressed 947108258 9.46139e+08 1.00102 ns
clickbench_q36/vortex-file-compressed 427856383 4.57112e+07 9.35999 ns
clickbench_q37/vortex-file-compressed 341417180 4.25824e+07 8.0178 ns
clickbench_q38/vortex-file-compressed 78170526 3.84227e+07 2.03449 ns
clickbench_q39/vortex-file-compressed 233999090 7.27625e+07 3.21593 ns
clickbench_q40/vortex-file-compressed 77588388 2.88393e+07 2.69037 ns
clickbench_q41/vortex-file-compressed 75195785 3.0271e+07 2.48409 ns
clickbench_q42/vortex-file-compressed 92343706 3.35341e+07 2.75372 ns

Contributor

github-actions bot commented Jan 20, 2025

Benchmarks: datafusion

Table of Results
name PR 522dad3 base 1d36132 ratio (PR/base) unit
arrow/planning 931953 955104 0.975761 ns
arrow/exec 1.97963e+06 2.01344e+06 0.983208 ns
vortex-compressed/planning 584665 586488 0.996892 ns
vortex-compressed/exec 2.71743e+06 2.71034e+06 1.00262 ns
vortex-uncompressed/planning 588475 585229 1.00555 ns
vortex-uncompressed/exec 1.56222e+06 1.55866e+06 1.00228 ns

Contributor

github-actions bot commented Jan 20, 2025

Benchmarks: random_access

Table of Results
name PR 522dad3 base 1d36132 ratio (PR/base) unit
random-access/vortex-tokio-local-disk 2.51063e+06 2.62011e+06 0.958214 ns
random-access/vortex-local-fs 3.15921e+06 3.30901e+06 0.954732 ns
random-access/parquet-tokio-local-disk 2.26959e+08 2.21792e+08 1.02329 ns

@gatesn
Contributor Author

gatesn commented Jan 20, 2025

Hmm, some queries behave really badly here. I think this needs more research. Will close this out until we have selection masks that can make more incremental use of selectivity.
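
One way to pick the badly behaving queries out of the benchmark tables is to parse each row and flag those whose PR/base ratio moves in the slow direction by more than some tolerance. The sketch below is a minimal example under stated assumptions: the `Row` type, the `parse_row` and `is_regression` helpers, and the 10% tolerance are hypothetical, and the sample lines are copied from the tables in this thread.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    name: str
    pr: float
    base: float
    ratio: float  # PR / base, as reported in the table
    unit: str


def parse_row(line: str) -> Row:
    # Benchmark names may contain spaces, so split the four trailing
    # fields (PR, base, ratio, unit) off from the right.
    name, pr, base, ratio, unit = line.rsplit(maxsplit=4)
    return Row(name, float(pr), float(base), float(ratio), unit)


def is_regression(row: Row, tolerance: float = 0.10) -> bool:
    # For times (ns) a larger PR value is slower; for throughput
    # (bytes/ns) a smaller PR value is slower.
    if row.unit == "ns":
        return row.ratio > 1.0 + tolerance
    return row.ratio < 1.0 - tolerance


sample_lines = [
    "tpch_q01/arrow 552021442 5.49082e+08 1.00535 ns",
    "tpch_q19/vortex-file-compressed 265546579 1.50768e+08 1.76129 ns",
    "clickbench_q36/vortex-file-compressed 427856383 4.57112e+07 9.35999 ns",
    "parquet_rs-zstd compress time/Bimbo throughput 0.310205 0.336422 0.922071 bytes/ns",
]

parsed: List[Row] = [parse_row(line) for line in sample_lines]
flagged = [row.name for row in parsed if is_regression(row)]
```

With the assumed 10% tolerance, `flagged` contains `tpch_q19/vortex-file-compressed` and `clickbench_q36/vortex-file-compressed` from these four sample rows; the other two stay within the tolerance.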

@gatesn gatesn closed this Jan 20, 2025
Contributor

Benchmarks: compress

Table of Results
name PR 000061e base 026610f ratio (PR/base) unit
compress time/taxi 1.58861e+09 1.58859e+09 1.00002 ns
compress time/taxi throughput 0.296366 0.296371 0.999984 bytes/ns
parquet_rs-zstd compress time/taxi 1.81272e+09 1.77489e+09 1.02132 ns
parquet_rs-zstd compress time/taxi throughput 0.259726 0.265262 0.979129 bytes/ns
decompress time/taxi 4.67904e+08 4.72647e+08 0.989965 ns
decompress time/taxi throughput 1.00621 0.996115 1.01014 bytes/ns
parquet_rs-zstd decompress time/taxi 3.11873e+08 3.11428e+08 1.00143 ns
parquet_rs-zstd decompress time/taxi throughput 1.50962 1.51178 0.998574 bytes/ns
compress time/AirlineSentiment 537933 542806 0.991023 ns
compress time/AirlineSentiment throughput 0.00383691 0.00380247 1.00906 bytes/ns
parquet_rs-zstd compress time/AirlineSentiment 56596.4 56509.6 1.00154 ns
parquet_rs-zstd compress time/AirlineSentiment throughput 0.0364687 0.0365248 0.998466 bytes/ns
decompress time/AirlineSentiment 144114 145344 0.991536 ns
decompress time/AirlineSentiment throughput 0.014322 0.0142008 1.00854 bytes/ns
parquet_rs-zstd decompress time/AirlineSentiment 33038.7 32243.6 1.02466 ns
parquet_rs-zstd decompress time/AirlineSentiment throughput 0.0624721 0.0640126 0.975935 bytes/ns
compress time/Arade 2.68996e+09 2.69264e+09 0.999007 ns
compress time/Arade throughput 0.292582 0.292292 1.00099 bytes/ns
parquet_rs-zstd compress time/Arade 3.09136e+09 2.97941e+09 1.03758 ns
parquet_rs-zstd compress time/Arade throughput 0.254592 0.264158 0.963784 bytes/ns
decompress time/Arade 6.8341e+08 6.99178e+08 0.977448 ns
decompress time/Arade throughput 1.15163 1.12566 1.02307 bytes/ns
parquet_rs-zstd decompress time/Arade 6.7617e+08 6.81113e+08 0.992743 ns
parquet_rs-zstd decompress time/Arade throughput 1.16396 1.15551 1.00731 bytes/ns
compress time/Bimbo 1.17877e+10 1.1819e+10 0.997353 ns
compress time/Bimbo throughput 0.604134 0.602535 1.00265 bytes/ns
parquet_rs-zstd compress time/Bimbo 2.29569e+10 2.11679e+10 1.08452 ns
parquet_rs-zstd compress time/Bimbo throughput 0.310205 0.336422 0.922071 bytes/ns
decompress time/Bimbo 4.43038e+09 4.36811e+09 1.01425 ns
decompress time/Bimbo throughput 1.60739 1.6303 0.985946 bytes/ns
parquet_rs-zstd decompress time/Bimbo 3.59554e+09 3.48228e+09 1.03252 ns
parquet_rs-zstd decompress time/Bimbo throughput 1.98061 2.04502 0.968501 bytes/ns
compress time/CMSprovider 1.28961e+10 1.28628e+10 1.00259 ns
compress time/CMSprovider throughput 0.399284 0.400317 0.99742 bytes/ns
parquet_rs-zstd compress time/CMSprovider 1.92395e+10 1.86607e+10 1.03102 ns
parquet_rs-zstd compress time/CMSprovider throughput 0.267636 0.275938 0.969916 bytes/ns
decompress time/CMSprovider 4.43735e+09 4.43332e+09 1.00091 ns
decompress time/CMSprovider throughput 1.16042 1.16148 0.999092 bytes/ns
parquet_rs-zstd decompress time/CMSprovider 5.60087e+09 5.59534e+09 1.00099 ns
parquet_rs-zstd decompress time/CMSprovider throughput 0.919356 0.920266 0.999012 bytes/ns
compress time/Euro2016 2.1607e+09 2.16403e+09 0.998462 ns
compress time/Euro2016 throughput 0.182004 0.181724 1.00154 bytes/ns
parquet_rs-zstd compress time/Euro2016 1.56215e+09 1.54828e+09 1.00896 ns
parquet_rs-zstd compress time/Euro2016 throughput 0.251741 0.253995 0.991124 bytes/ns
decompress time/Euro2016 2.47565e+08 2.50505e+08 0.988266 ns
decompress time/Euro2016 throughput 1.5885 1.56986 1.01187 bytes/ns
parquet_rs-zstd decompress time/Euro2016 4.87831e+08 4.87685e+08 1.0003 ns
parquet_rs-zstd decompress time/Euro2016 throughput 0.806132 0.806372 0.999702 bytes/ns
compress time/Food 1.08146e+09 1.09099e+09 0.991265 ns
compress time/Food throughput 0.30766 0.304972 1.00881 bytes/ns
parquet_rs-zstd compress time/Food 1.09875e+09 1.05425e+09 1.04221 ns
parquet_rs-zstd compress time/Food throughput 0.302818 0.315599 0.959502 bytes/ns
decompress time/Food 1.76064e+08 1.78855e+08 0.984396 ns
decompress time/Food throughput 1.88977 1.86028 1.01585 bytes/ns
parquet_rs-zstd decompress time/Food 2.24067e+08 2.25891e+08 0.991926 ns
parquet_rs-zstd decompress time/Food throughput 1.48492 1.47293 1.00814 bytes/ns
compress time/HashTags 2.51552e+09 2.52615e+09 0.995793 ns
compress time/HashTags throughput 0.319816 0.318471 1.00423 bytes/ns
parquet_rs-zstd compress time/HashTags 2.51101e+09 2.45795e+09 1.02159 ns
parquet_rs-zstd compress time/HashTags throughput 0.32039 0.327306 0.97887 bytes/ns
decompress time/HashTags 4.24011e+08 4.22009e+08 1.00474 ns
decompress time/HashTags throughput 1.89736 1.90636 0.995278 bytes/ns
parquet_rs-zstd decompress time/HashTags 7.82484e+08 7.77957e+08 1.00582 ns
parquet_rs-zstd decompress time/HashTags throughput 1.02814 1.03412 0.994215 bytes/ns
compress time/TPC-H l_comment chunked without fsst 3.00228e+09 2.98388e+09 1.00617 ns
compress time/TPC-H l_comment chunked without fsst throughput 0.0830106 0.0835226 0.993869 bytes/ns
parquet_rs-zstd compress time/TPC-H l_comment chunked without fsst 9.02867e+08 8.99631e+08 1.0036 ns
parquet_rs-zstd compress time/TPC-H l_comment chunked without fsst throughput 0.276034 0.277026 0.996416 bytes/ns
decompress time/TPC-H l_comment chunked without fsst 5.53748e+07 5.61888e+07 0.985514 ns
decompress time/TPC-H l_comment chunked without fsst throughput 4.50063 4.43543 1.0147 bytes/ns
parquet_rs-zstd decompress time/TPC-H l_comment chunked without fsst 2.51676e+08 2.51472e+08 1.00081 ns
parquet_rs-zstd decompress time/TPC-H l_comment chunked without fsst throughput 0.990247 0.991052 0.999188 bytes/ns
compress time/TPC-H l_comment chunked 9.98583e+08 9.97477e+08 1.00111 ns
compress time/TPC-H l_comment chunked throughput 0.249575 0.249852 0.998893 bytes/ns
parquet_rs-zstd compress time/TPC-H l_comment chunked 9.03167e+08 8.99409e+08 1.00418 ns
parquet_rs-zstd compress time/TPC-H l_comment chunked throughput 0.275942 0.277095 0.995839 bytes/ns
decompress time/TPC-H l_comment chunked 1.01799e+08 1.02271e+08 0.995379 ns
decompress time/TPC-H l_comment chunked throughput 2.44817 2.43686 1.00464 bytes/ns
parquet_rs-zstd decompress time/TPC-H l_comment chunked 2.51298e+08 2.52038e+08 0.997065 ns
parquet_rs-zstd decompress time/TPC-H l_comment chunked throughput 0.991738 0.988826 1.00294 bytes/ns
compress time/TPC-H l_comment canonical 9.97166e+08 1.00482e+09 0.99238 ns
compress time/TPC-H l_comment canonical throughput 0.249929 0.248025 1.00768 bytes/ns
parquet_rs-zstd compress time/TPC-H l_comment canonical 9.05327e+08 9.01385e+08 1.00437 ns
parquet_rs-zstd compress time/TPC-H l_comment canonical throughput 0.275283 0.276487 0.995645 bytes/ns
decompress time/TPC-H l_comment canonical 1.01494e+08 1.0226e+08 0.992511 ns
decompress time/TPC-H l_comment canonical throughput 2.45552 2.43713 1.00755 bytes/ns
parquet_rs-zstd decompress time/TPC-H l_comment canonical 2.50781e+08 2.51829e+08 0.995838 ns
parquet_rs-zstd decompress time/TPC-H l_comment canonical throughput 0.993779 0.989643 1.00418 bytes/ns

@gatesn gatesn reopened this Jan 23, 2025
@gatesn gatesn added the benchmark Run benchmarks on this branch label Jan 23, 2025
@github-actions github-actions bot removed the benchmark Run benchmarks on this branch label Jan 23, 2025