feat(compactor): calculate pending bytes for scale compactor #6497
Signed-off-by: Little-Wallace <[email protected]>
Related to #6483?
Codecov Report
@@ Coverage Diff @@
## main #6497 +/- ##
==========================================
+ Coverage 73.35% 73.37% +0.02%
==========================================
Files 1010 1010
Lines 161906 162215 +309
==========================================
+ Hits 118763 119024 +261
- Misses 43143 43191 +48
@@ -82,6 +88,81 @@ impl DynamicLevelSelector {
            inner: LevelSelectorCore::new(config, overlap_strategy),
        }
    }

    fn calculate_l0_overlap(
Please correct me if I am wrong: this method calculates "total size of non-pending L0 files + total size of base level SSTs overlapping with the non-pending L0 SSTs". Based on this, I have the following questions:
- Shouldn't we use if !handlers[0].is_pending_compact(&table_info.id) in L102, so that we actually use non-pending L0 SSTs to calculate the overlap?
- output_files in L108 represents the pending SSTs in the base level whose output goes to the base level itself. IIUC, we should count these SSTs in next_level_size instead of ignoring them in L116. To be more precise, I think next_level_size should cover non-pending SSTs plus pending SSTs with target level == base level.
- Do we intentionally ignore L0 sub-level compaction?
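To make the reading in the comment above concrete, here is a minimal sketch of "non-pending L0 bytes plus overlapping base-level bytes". It is not the PR's code: `Sst`, `overlaps`, and `l0_overlap_bytes` are hypothetical names, key ranges are modelled as plain integers, and pending base-level SSTs (the second question) are deliberately left out.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified SST descriptor (not the real Hummock type).
#[derive(Debug, Clone)]
pub struct Sst {
    pub id: u64,
    pub left: u64,  // smallest key, inclusive
    pub right: u64, // largest key, inclusive
    pub file_size: u64,
}

/// Two SSTs overlap when their inclusive key ranges intersect.
fn overlaps(a: &Sst, b: &Sst) -> bool {
    a.left <= b.right && b.left <= a.right
}

/// Sum of the non-pending L0 file sizes plus the size of every base-level
/// SST that overlaps at least one of those non-pending L0 files.
pub fn l0_overlap_bytes(l0: &[Sst], base: &[Sst], l0_pending: &HashSet<u64>) -> u64 {
    // Keep only L0 files that no running task has picked yet.
    let idle_l0: Vec<&Sst> = l0.iter().filter(|s| !l0_pending.contains(&s.id)).collect();
    let l0_bytes: u64 = idle_l0.iter().map(|s| s.file_size).sum();
    // Each base-level SST is counted once, however many L0 files it overlaps.
    let base_bytes: u64 = base
        .iter()
        .filter(|b| idle_l0.iter().any(|s| overlaps(s, b)))
        .map(|b| b.file_size)
        .sum();
    l0_bytes + base_bytes
}
```

With this shape, marking an L0 file as pending removes both its own size and any base-level SSTs that only it overlapped, which is the behaviour the first question asks about.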
if compact_bytes + compacting_file_size + target_bytes >= select_level.total_file_size {
    break;
}
if output_files.contains(&sst.id) {
Why do we ignore pending SSTs output to select level?
fixed
    }
}
let mut next_level_size = 0;
let next_level_files =
The logic is understandable now, but the variable names make it easy to get lost while reading.
I tried renaming several of them to distinguish between L0 and the base level. How about this?
total_level_size -> l0_total_level_size
next_level_files -> pending_base_level_files
next_level_size -> base_level_size
    handlers: &[LevelHandler],
) -> u64 {
    if select_level.total_file_size <= target_bytes {
        return 0;
Why do we use zero instead of an exact value when <= target_bytes? Is it because the next compact_task won't be spawned?
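One possible reading of the early return, written out as a small sketch. This is an assumption, not something the thread confirms: it treats "pending" as the bytes a level holds above its target that no running task already covers, so a level at or below its target contributes nothing. The function name `level_pending_bytes` and its flat `u64` parameters are hypothetical simplifications of the hunk above.

```rust
/// Bytes that still need compaction in one level, under the assumption that
/// only the excess over `target_bytes` counts and that bytes already picked
/// by running tasks (`compacting_file_size`) are subtracted.
pub fn level_pending_bytes(
    total_file_size: u64,
    target_bytes: u64,
    compacting_file_size: u64,
) -> u64 {
    // At or below the target no task would be picked for this level,
    // so it reports zero pending bytes.
    if total_file_size <= target_bytes {
        return 0;
    }
    // saturating_sub guards against running tasks covering more than the excess.
    (total_file_size - target_bytes).saturating_sub(compacting_file_size)
}
```

Under this reading, zero is the exact value for a level within its target, because there is no excess to compact.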
    level_handlers: &[LevelHandler],
) -> u64 {
    let ctx = self.inner.calculate_level_base_size(levels);
    let mut pending_compaction_bytes = self.calculate_l0_overlap(
to l0_pending_compaction_bytes

use crate::hummock::compaction::level_selector::{DynamicLevelSelector, LevelSelector};
use crate::hummock::compaction::manual_compaction_picker::ManualCompactionSelector;
use crate::hummock::compaction::overlap_strategy::{OverlapStrategy, RangeOverlapStrategy};
use crate::hummock::level_handler::LevelHandler;

// we assume that every core could compact data with 50MB/s, and when there has been 32GB data
// waiting to compact, a new compactor-node with 8-core could consume this data with in 2 minutes.
const COMPACTION_BYTES_PER_CORE: u64 = 4 * 1024 * 1024 * 1024;
This assumption may not hold for all machine types. Should we make it configurable, or add a TODO in this PR?
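For reference, the numbers in the code comment do check out: 8 cores at 50 MB/s each drain 32 GB in roughly 82 seconds, and 32 GB divided by the 4 GB per-core constant gives 8 cores. The sketch below reproduces that arithmetic and shows one way the constant could become a setting. `ScaleConfig`, `cores_needed`, and `drain_seconds` are hypothetical names, and MB/GB are read as binary units (MiB/GiB) to match the constant.

```rust
pub const MIB: u64 = 1024 * 1024;
pub const GIB: u64 = 1024 * MIB;

/// Hypothetical config holding what the PR hard-codes as
/// COMPACTION_BYTES_PER_CORE.
pub struct ScaleConfig {
    pub bytes_per_core: u64,
}

/// Number of compactor cores needed for the pending bytes, rounded up.
pub fn cores_needed(pending_bytes: u64, cfg: &ScaleConfig) -> u64 {
    if cfg.bytes_per_core == 0 {
        return 0; // treat a zero setting as "scaling disabled"
    }
    (pending_bytes + cfg.bytes_per_core - 1) / cfg.bytes_per_core
}

/// Seconds needed to drain `pending_bytes` at the given per-core throughput,
/// rounded up. Returns u64::MAX when there is no throughput at all.
pub fn drain_seconds(pending_bytes: u64, cores: u64, bytes_per_sec_per_core: u64) -> u64 {
    let throughput = cores * bytes_per_sec_per_core;
    if throughput == 0 {
        return u64::MAX;
    }
    (pending_bytes + throughput - 1) / throughput
}
```

A slower machine type would only need a smaller `bytes_per_core` value rather than a code change, which is the trade-off the comment raises.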
< compactor.max_concurrent_task_number()
{
.unwrap_or(&0);
if running_task < 2 * compactor.max_concurrent_task_number() {
In this PR, the limit of running_task per compactor is max_concurrent_task_number * 2 when the Scheduler is in the Burst state, right?
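The rule being asked about can be sketched as follows. Only the `Burst` name and the `2 *` factor come from the thread; the `Normal` variant, the enum itself, and the helper names are assumptions made for illustration, and whether the PR really works this way is exactly what the question is checking.

```rust
/// Hypothetical scheduler states; only `Burst` is named in the discussion.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SchedulerState {
    Normal,
    Burst,
}

/// Per-compactor cap on running tasks: doubled while in the Burst state.
pub fn task_limit(state: SchedulerState, max_concurrent_task_number: u64) -> u64 {
    match state {
        SchedulerState::Normal => max_concurrent_task_number,
        SchedulerState::Burst => 2 * max_concurrent_task_number,
    }
}

/// A compactor can take another task while it is strictly below its cap,
/// mirroring the `running_task < 2 * max_concurrent_task_number()` check.
pub fn can_assign(running_task: u64, state: SchedulerState, max_concurrent_task_number: u64) -> bool {
    running_task < task_limit(state, max_concurrent_task_number)
}
```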
This PR has been open for 60 days with no activity. Could you please update the status? Feel free to ping a reviewer if you are waiting for review.
@Little-Wallace @Li0k Any updates? Are we still working on this?
Signed-off-by: Little-Wallace [email protected]
I hereby agree to the terms of the Singularity Data, Inc. Contributor License Agreement.
What's changed and what's your intention?
PLEASE DO NOT LEAVE THIS EMPTY !!!
Please explain IN DETAIL what the changes are in this PR and why they are needed:
Checklist
./risedev check (or alias, ./risedev c)
Documentation
If your pull request contains user-facing changes, please specify the types of the changes, and create a release note. Otherwise, please feel free to remove this section.
Types of user-facing changes
Please keep the types that apply to your changes, and remove those that do not apply.
Release note
Please create a release note for your changes. In the release note, focus on the impact on users, and mention the environment or conditions where the impact may occur.
Refer to a related PR or issue link (optional)
#6477