Commit

Improve merge selecting algorithm for big partitions (#515)
yokofly authored Jan 22, 2024
1 parent 6110219 commit 00d0411
Showing 1 changed file with 14 additions and 5 deletions.
19 changes: 14 additions & 5 deletions src/Storages/MergeTree/SimpleMergeSelector.cpp
@@ -154,12 +154,21 @@ void selectWithinPartition(
     if (parts_count <= 1)
         return;

-    for (size_t begin = 0; begin < parts_count; ++begin)
+    /// If the parts in the parts vector are sorted by block number,
+    /// it may not be ideal to only select parts for merging from the first N ones.
+    /// This is because if there are more than N parts in the partition,
+    /// we will not be able to assign a merge for newly created parts.
+    /// As a result, the total number of parts within the partition could
+    /// grow uncontrollably, similar to a snowball effect.
+    /// To address this we will try to assign a merge taking into consideration
+    /// only last N parts.
+    static constexpr size_t parts_threshold = 1000;
+    size_t begin = 0;
+    if (parts_count >= parts_threshold)
+        begin = parts_count - parts_threshold;
+
+    for (; begin < parts_count; ++begin)
     {
-        /// If too many parts, select only from first, to avoid complexity.
-        if (begin > 1000)
-            break;
-
         if (!parts[begin].shall_participate_in_merges)
             continue;

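The effect of the change on which start indices the loop visits can be sketched in isolation. The following is a minimal illustration, not code from the commit: BeginRange, oldBeginRange, newBeginRange and exampleWith5000Parts are hypothetical names, and only the value parts_threshold = 1000 and the two loop shapes are taken from the diff above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type: half-open range [from, to) of `begin` values.
struct BeginRange
{
    std::size_t from;
    std::size_t to;
};

// Value taken from the commit; everything else here is illustrative.
constexpr std::size_t parts_threshold = 1000;

// Old loop: started at 0 and hit `break` once begin > 1000,
// so it only ever looked at the first parts of the partition.
constexpr BeginRange oldBeginRange(std::size_t parts_count)
{
    return {0, parts_count < 1001 ? parts_count : 1001};
}

// New loop: starts at parts_count - parts_threshold when the partition
// has at least parts_threshold parts, so it looks at the last ones.
constexpr BeginRange newBeginRange(std::size_t parts_count)
{
    return {parts_count >= parts_threshold ? parts_count - parts_threshold : 0,
            parts_count};
}

// With 5000 parts, the old loop never starts a range at a recent part,
// while the new loop starts only at the most recent 1000.
inline void exampleWith5000Parts()
{
    assert(oldBeginRange(5000).from == 0 && oldBeginRange(5000).to == 1001);
    assert(newBeginRange(5000).from == 4000 && newBeginRange(5000).to == 5000);
}
```

For partitions with fewer than 1000 parts both versions start at index 0 and visit every part, so the change only affects big partitions.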