Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use oldest offset on newly detected partitions #7756

Merged
merged 2 commits into from
Nov 12, 2021

Conversation

mcvsubbu
Copy link
Contributor

Description

Fixed the issue #7741 where Pinot was losing data when it detected new stream partitions
(depending on table configuration).

Manual testing by running LLCRealtimeClusterIntegrationTest via debugger

  • Change the table config to start from largest offset
  • Force the test to detect only one partition, notice that roughly
    half the rows are ingested, and only partiton 0 shows up in idealstate
  • Run the RealtimeSegmentValidationManager job via swagger to force detection
    of new partition.
  • Confirmed that all rows are now present.

Upgrade Notes

Does this PR prevent a zero down-time upgrade? (Assume upgrade order: Controller, Broker, Server, Minion)

  • Yes (Please label as backward-incompat, and complete the section below on Release Notes)

Does this PR fix a zero-downtime upgrade introduced earlier?

  • Yes (Please label this as backward-incompat, and complete the section below on Release Notes)

Does this PR otherwise need attention when creating release notes? Things to consider:

  • New configuration options
  • Deprecation of configurations
  • Signature changes to public methods/interfaces
  • New plugins added or old plugins removed
  • Yes (Please label this PR as release-notes and complete the section on Release Notes)

Release Notes

Documentation

Fixed the issue where Pinot was losing data when it detected new stream partitions
(depending on table configuration).

Manual testing by running LLCRealtimeClusterIntegrationTest via debugger
- Change the table config to start from largest offset
- Force the test to detect only one partition, notice that roughly
  half the rows are ingested, and only partiton 0 shows up in idealstate
- Run the RealtimeSegmentValidationManager job via swagger to force detection
  of new partition.
- Confirmed that all rows are now present.

Issue apache#7741
@codecov-commenter
Copy link

codecov-commenter commented Nov 12, 2021

Codecov Report

Merging #7756 (513c0f6) into master (928d4ef) will decrease coverage by 0.07%.
The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #7756      +/-   ##
============================================
- Coverage     71.63%   71.55%   -0.08%     
  Complexity     4064     4064              
============================================
  Files          1577     1577              
  Lines         80542    80589      +47     
  Branches      11965    11974       +9     
============================================
- Hits          57694    57665      -29     
- Misses        18971    19049      +78     
+ Partials       3877     3875       -2     
Flag Coverage Δ
integration1 29.25% <60.00%> (-0.29%) ⬇️
integration2 27.84% <60.00%> (-0.04%) ⬇️
unittests1 68.56% <ø> (-0.06%) ⬇️
unittests2 14.57% <100.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
.../core/realtime/PinotLLCRealtimeSegmentManager.java 79.04% <100.00%> (+0.45%) ⬆️
...data/manager/realtime/SegmentCommitterFactory.java 64.70% <0.00%> (-29.42%) ⬇️
.../common/request/context/predicate/EqPredicate.java 66.66% <0.00%> (-13.34%) ⬇️
...pinot/common/utils/fetcher/HttpSegmentFetcher.java 61.53% <0.00%> (-10.26%) ⬇️
...er/api/resources/LLCSegmentCompletionHandlers.java 53.46% <0.00%> (-8.92%) ⬇️
...altime/ServerSegmentCompletionProtocolHandler.java 49.52% <0.00%> (-8.58%) ⬇️
...e/data/manager/realtime/SplitSegmentCommitter.java 53.84% <0.00%> (-7.70%) ⬇️
...nction/DistinctCountBitmapAggregationFunction.java 54.40% <0.00%> (-6.74%) ⬇️
...e/pinot/common/utils/FileUploadDownloadClient.java 63.56% <0.00%> (-3.53%) ⬇️
...manager/realtime/HLRealtimeSegmentDataManager.java 80.64% <0.00%> (-2.31%) ⬇️
... and 29 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 928d4ef...513c0f6. Read the comment docs.

String startOffset = partitionGroupMetadata.getStartOffset().toString();
StreamPartitionMsgOffset startOffset;
if (isLiveTable) {
startOffset = getPartitionGroupSmallestOffset(streamConfig, partitionGroupId);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is actually overriding the offset criteria and call getNewPartitionGroupMetadataList() to get all PartitionGroupMetadata but only return the offset for one partition. Handling it here is not efficient, and also not avoiding the problem of overriding the stream config. I slightly prefer the fix in #7743

Copy link
Contributor Author

@mcvsubbu mcvsubbu Nov 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not in the performance path, and I am not worried about that at all. Also, the override is very localized, so it is evident as you read the method that it is being overridden (by constructing a new object appropriately named)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. Let's go with this approach and probably restructure the code in the future.

@mcvsubbu mcvsubbu merged commit 561b8a3 into apache:master Nov 12, 2021
@mcvsubbu mcvsubbu deleted the issue-7741 branch November 12, 2021 23:04
kriti-sc pushed a commit to kriti-sc/incubator-pinot that referenced this pull request Dec 12, 2021
* Use oldest offset on newly detected partitions

Fixed the issue where Pinot was losing data when it detected new stream partitions
(depending on table configuration).

Manual testing by running LLCRealtimeClusterIntegrationTest via debugger
- Change the table config to start from largest offset
- Force the test to detect only one partition, notice that roughly
  half the rows are ingested, and only partiton 0 shows up in idealstate
- Run the RealtimeSegmentValidationManager job via swagger to force detection
  of new partition.
- Confirmed that all rows are now present.

Issue apache#7741

* Fixed linter error
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants