
meta: give the size of data which need to be compacted to make LSM tree balanced. #6477

Closed
Little-Wallace opened this issue Nov 21, 2022 · 2 comments


Little-Wallace commented Nov 21, 2022

Is your feature request related to a problem? Please describe.

Background

We want to provide an interface that tells us how many compactor nodes we need to scale out in order to speed up compaction.
To do that, it is important to know how much data needs to be compacted to make the LSM tree balanced.
However, this is difficult to know exactly: the shape of the tree changes dynamically, so we cannot compute an accurate number.

The only thing we can know is, for the current shape of the LSM tree, how much data needs to be compacted to the next level (compacting to the next level does not necessarily make the LSM tree balanced). If that size is too large, we need to scale out more compactor nodes at once. So we will provide an imprecise reference value, which will be an underestimate of the amount of resources actually required.

Design

pub trait LevelSelector: Sync + Send {
    /// Estimate, in MB, how much data must be compacted to the next level
    /// for the current shape of the LSM tree.
    fn pending_scheduler_compaction_bytes_mb(
        &self,
        levels: &Levels,
        level_handlers: &[LevelHandler],
    ) -> u64;
}

fn get_scale_compactor_count(&mut self) -> u64 {
    // `version` is the current LSM tree version, assumed available in scope.
    let pending_bytes = self
        .selector
        .pending_scheduler_compaction_bytes_mb(&version.levels, &self.level_handlers);
    // Convert pending work into a number of compactor cores.
    let scale_scores_count = pending_bytes / SINGLE_CORE_COMPACT_BYTES_PER_HOUR;
    if scale_scores_count > self.pending_schedule_compactor {
        // Only request the cores not already accounted for.
        let scale_count = scale_scores_count - self.pending_schedule_compactor;
        self.pending_schedule_compactor = scale_scores_count;
        return scale_count;
    }
    0
}
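To make the arithmetic concrete, here is a hypothetical, self-contained simplification of the heuristic above (the constant value and function names are assumptions for illustration, not the actual implementation): the additional compactor count is the pending work divided by per-core throughput, minus what has already been scheduled.

```rust
// Assumed throughput: MB one core can compact per hour (illustrative value only).
const SINGLE_CORE_COMPACT_BYTES_PER_HOUR: u64 = 1024;

/// Hypothetical helper: how many additional compactor cores to request,
/// given pending compaction work (MB) and the count already scheduled.
fn scale_compactor_count(pending_bytes_mb: u64, already_scheduled: u64) -> u64 {
    let needed = pending_bytes_mb / SINGLE_CORE_COMPACT_BYTES_PER_HOUR;
    // Never go negative: if enough cores are scheduled, request none.
    needed.saturating_sub(already_scheduled)
}

fn main() {
    // 4096 MB pending with 1 core already scheduled -> request 3 more cores.
    println!("{}", scale_compactor_count(4096, 1));
}
```

As the issue notes, this is a reference value only: compacting to the next level once does not guarantee the tree ends up balanced, so the result is a lower bound on the resources required.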

Describe alternatives you've considered

No response

Additional context

No response

@github-actions
This issue has been open for 60 days with no activity. Could you please update the status? Feel free to continue discussion or close as not planned.


hzxa21 commented Mar 22, 2023

related: #6986

@fuyufjh fuyufjh closed this as completed Mar 22, 2023