feat(source): experimental support for split reduction. #9714

Merged
merged 10 commits into from
May 22, 2023

shanicky
Contributor

@shanicky shanicky commented May 9, 2023

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

This PR experimentally supports split reduction.

When split scale-in is enabled for a source, the source manager treats a split reduction observed during its tick as a change instead of ignoring it. It removes the split from the assignment in place and pushes the updated assignment downstream. On receiving the update, the source executor treats the reduction as a state change and deletes the removed split's state during the commit phase.
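The meta-side behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code in source_manager.rs: the function name `diff_splits`, the `enable_scale_in` parameter, the `ActorId`/`SplitId` aliases, and the "fewest splits first" placement rule are all assumptions made for the example.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical aliases for illustration only.
type ActorId = u32;
type SplitId = String;

/// Compares the current assignment with the splits discovered in this tick.
/// Returns `Some(new_assignment)` if something must be pushed downstream,
/// or `None` if the tick observed no relevant change.
fn diff_splits(
    current: &BTreeMap<ActorId, Vec<SplitId>>,
    discovered: &BTreeSet<SplitId>,
    enable_scale_in: bool, // hypothetical switch for the experimental path
) -> Option<BTreeMap<ActorId, Vec<SplitId>>> {
    let assigned: BTreeSet<&SplitId> = current.values().flatten().collect();

    // Newly discovered splits are always a change.
    let added: Vec<&SplitId> = discovered
        .iter()
        .filter(|s| !assigned.contains(s))
        .collect();

    // Vanished splits only count as a change when scale-in is enabled;
    // otherwise they are ignored.
    let has_dropped =
        enable_scale_in && assigned.iter().any(|s| !discovered.contains(*s));

    if added.is_empty() && !has_dropped {
        return None;
    }

    let mut next = current.clone();
    if enable_scale_in {
        // In-place removal: each actor simply loses the vanished splits.
        for splits in next.values_mut() {
            splits.retain(|s| discovered.contains(s));
        }
    }
    // Assumed placement rule: hand each new split to the least-loaded actor.
    for split in added {
        if let Some(splits) = next.values_mut().min_by_key(|v| v.len()) {
            splits.push(split.clone());
        }
    }
    Some(next)
}
```

With the switch off, a tick in which a split has disappeared yields `None`, so nothing is pushed downstream; with it on, the same tick yields a new assignment without the vanished split.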

Please note that the changes in this PR will need complex testing. cc @tabVersion

generated by gpt

Experimental Support for Split Reduction

This Pull Request introduces several key changes across the scale.rs, source_manager.rs, source_executor.rs, and state_table_handler.rs files.

Main Changes

  1. Refactoring of the reallocate_splits function and its usage.
  2. Addition of split removal handling and reallocation.
  3. Improvement of source state persistence and trim capability.
  4. Fix for the drain_filter method for cache.
  5. Improved split migration handling and memory efficiency.
  6. Asynchronous delete & trim_state, log trimming.
  7. Refactoring of SourceExecutor's split tracking and migration check.
  8. Modification of SourceExecutor match expression to add a new Mutation::Source…
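The executor-side items in this list (split tracking, state persistence, and trimming) might look roughly like the sketch below. It is an assumption-laden illustration rather than the code in source_executor.rs or state_table_handler.rs: `SourceExecutorSketch`, `SplitStateStore`, `apply_assignment`, and `commit` are hypothetical names, offsets are plain `u64` values, and new splits are assumed to start at offset 0.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical alias for illustration only.
type SplitId = String;

/// Stand-in for the persisted split state (hypothetical).
#[derive(Default)]
struct SplitStateStore {
    committed: HashMap<SplitId, u64>,
}

#[derive(Default)]
struct SourceExecutorSketch {
    /// Latest in-memory offset of every split this executor currently owns.
    offsets: HashMap<SplitId, u64>,
    /// Splits dropped since the last commit; their state is trimmed on commit.
    pending_removal: HashSet<SplitId>,
    store: SplitStateStore,
}

impl SourceExecutorSketch {
    /// Applies a new assignment: dropped splits are remembered for trimming,
    /// newly assigned splits start being tracked.
    fn apply_assignment(&mut self, target: &[SplitId]) {
        let target: HashSet<&SplitId> = target.iter().collect();
        let dropped: Vec<SplitId> = self
            .offsets
            .keys()
            .filter(|s| !target.contains(s))
            .cloned()
            .collect();
        for split in dropped {
            self.offsets.remove(&split);
            self.pending_removal.insert(split);
        }
        for split in target {
            // Assumption: a split without known state starts at offset 0.
            self.offsets.entry(split.clone()).or_insert(0);
        }
    }

    /// Commit phase: persist the live splits, then delete removed ones.
    fn commit(&mut self) {
        for (split, offset) in &self.offsets {
            self.store.committed.insert(split.clone(), *offset);
        }
        for split in self.pending_removal.drain() {
            self.store.committed.remove(&split);
        }
    }
}
```

In this sketch the removed split's state stays in the store until the next `commit`, which mirrors the description's point that deletion happens in the commit phase rather than when the assignment arrives.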

File Changes

  • scale.rs: 12 changes (10 additions, 2 deletions).
  • source_manager.rs: 47 changes (24 additions, 23 deletions).

Checklist For Contributors

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
    - [ ] I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • I have demonstrated that backward compatibility is not broken by breaking changes and created issues to track deprecated features to be removed in the future. (Please refer to the issue)
  • All checks passed in ./risedev check (or alias, ./risedev c)

Checklist For Reviewers

  • I have requested macro/micro-benchmarks as this PR can affect performance substantially, and the results are shown.

Documentation

  • My PR DOES NOT contain user-facing changes.
Click here for Documentation

Types of user-facing changes

Please keep the types that apply to your changes, and remove the others.

  • Installation and deployment
  • Connector (sources & sinks)
  • SQL commands, functions, and operators
  • RisingWave cluster configuration changes
  • Other (please specify in the release note below)

Release note

@shanicky shanicky requested a review from tabVersion May 9, 2023 16:06
@shanicky shanicky marked this pull request as draft May 9, 2023 16:37
@shanicky shanicky force-pushed the peng/split-remove branch from 4e74e3f to 0662133 Compare May 9, 2023 16:53
@shanicky shanicky marked this pull request as ready for review May 9, 2023 16:54
@codecov

codecov bot commented May 9, 2023

Codecov Report

Merging #9714 (655aa36) into main (709bed0) will decrease coverage by 0.01%.
The diff coverage is 69.23%.

@@            Coverage Diff             @@
##             main    #9714      +/-   ##
==========================================
- Coverage   71.05%   71.04%   -0.01%     
==========================================
  Files        1249     1249              
  Lines      208521   208594      +73     
==========================================
+ Hits       148167   148201      +34     
- Misses      60354    60393      +39     
Flag Coverage Δ
rust 71.04% <69.23%> (-0.01%) ⬇️

Impacted Files Coverage Δ
src/meta/src/stream/scale.rs 12.46% <0.00%> (-0.10%) ⬇️
src/meta/src/stream/source_manager.rs 46.41% <0.00%> (+0.13%) ⬆️
src/stream/src/executor/source/source_executor.rs 84.66% <88.23%> (+0.35%) ⬆️
.../stream/src/executor/source/state_table_handler.rs 69.26% <100.00%> (+3.74%) ⬆️

... and 8 files with indirect coverage changes

@shanicky shanicky force-pushed the peng/split-remove branch 4 times, most recently from 5e83d1d to eefd899 Compare May 16, 2023 06:38
@shanicky shanicky force-pushed the peng/split-remove branch from 95cb906 to 388a7fb Compare May 16, 2023 09:12
Contributor

@tabVersion tabVersion left a comment

generally LGTM, please make sure a switch can turn off the experimental pr

@tabVersion
Contributor

@BugenZhao PTAL

@tabVersion tabVersion requested a review from BugenZhao May 18, 2023 10:37
@shanicky shanicky added this pull request to the merge queue May 22, 2023
@lmatz lmatz mentioned this pull request May 22, 2023
Merged via the queue into main with commit 05d7419 May 22, 2023
@shanicky shanicky deleted the peng/split-remove branch May 22, 2023 05:28