Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use ideal state as source of truth for segment existence #9735

Merged
merged 1 commit into from
Nov 5, 2022

Conversation

Jackie-Jiang
Copy link
Contributor

When deleting a segment, we delete the segment from ideal state synchronously, then use a separate thread to delete the segment from property store and deep store asynchronously. This could cause inconsistency between ideal state and segments in the property store, and we should always use ideal state as the source of truth for the segments because that is what Helix is following. The segment ZK metadata from the property store can be used to track the orphan segments that are not cleaned up property.

@codecov-commenter
Copy link

codecov-commenter commented Nov 5, 2022

Codecov Report

Merging #9735 (67fb19e) into master (9898b91) will decrease coverage by 1.61%.
The diff coverage is 51.72%.

@@             Coverage Diff              @@
##             master    #9735      +/-   ##
============================================
- Coverage     70.04%   68.43%   -1.62%     
- Complexity     4907     4972      +65     
============================================
  Files          1951     1951              
  Lines        104512   104515       +3     
  Branches      15824    15827       +3     
============================================
- Hits          73210    71525    -1685     
- Misses        26175    27892    +1717     
+ Partials       5127     5098      -29     
Flag Coverage Δ
integration1 25.27% <43.10%> (-0.02%) ⬇️
integration2 ?
unittests1 67.55% <ø> (-0.04%) ⬇️
unittests2 15.67% <36.20%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...ntroller/helix/core/PinotHelixResourceManager.java 69.61% <50.00%> (-1.74%) ⬇️
...ler/api/resources/PinotSegmentRestletResource.java 21.14% <100.00%> (-18.40%) ⬇️
...pinot/core/data/manager/realtime/TimerService.java 0.00% <0.00%> (-100.00%) ⬇️
...t/core/plan/StreamingInstanceResponsePlanNode.java 0.00% <0.00%> (-100.00%) ⬇️
...ore/operator/streaming/StreamingResponseUtils.java 0.00% <0.00%> (-100.00%) ⬇️
...server/starter/helix/SegmentReloadStatusValue.java 0.00% <0.00%> (-100.00%) ⬇️
...ager/realtime/PeerSchemeSplitSegmentCommitter.java 0.00% <0.00%> (-100.00%) ⬇️
...urces/ServerReloadControllerJobStatusResponse.java 0.00% <0.00%> (-100.00%) ⬇️
...he/pinot/common/utils/grpc/GrpcRequestBuilder.java 0.00% <0.00%> (-90.91%) ⬇️
...ator/streaming/StreamingSelectionOnlyOperator.java 0.00% <0.00%> (-90.00%) ⬇️
... and 145 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Copy link
Contributor

@walterddr walterddr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm one question

@@ -730,22 +730,29 @@ public String getActualTableName(String tableName) {
*/

/**
* Returns the segments for the given table.
* Returns the segments for the given table from the ideal state.
*
* @param tableNameWithType Table name with type suffix
* @param shouldExcludeReplacedSegments whether to return the list of segments that doesn't contain replaced segments.
* @return List of segment names
*/
public List<String> getSegmentsFor(String tableNameWithType, boolean shouldExcludeReplacedSegments) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getSegmentsFor is originally also used in SegmentRelocator and RetentionManager are those 2 usage both IS as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Checked all the usage of them and moved the ones to clean up the orphan segments (e.g. when deleting the table or all segments) to use getSegmentsFromPropertyStore()

@Jackie-Jiang Jackie-Jiang merged commit d530695 into apache:master Nov 5, 2022
@Jackie-Jiang Jackie-Jiang deleted the fix_get_segments branch November 5, 2022 04:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants