[improve][broker] Make CompactedTopicImpl.findStartPointLoop work more efficiently #17976

Merged
5 commits merged into apache:master on Oct 13, 2022

Conversation

poorbarcode
Contributor

Motivation & Modifications

CompactedTopicImpl.findStartPointLoop uses a binary search, but a flaw in the implementation causes it to execute several more loops than necessary. For example, given an array like [1,2,3...100], searching for the value 2 makes CompactedTopicImpl.findStartPointLoop execute the following loops:

start: 0, mid: 50, end: 100
start: 0, mid: 25, end: 50
start: 0, mid: 12, end: 25
start: 0, mid: 6, end: 12
start: 0, mid: 3, end: 6
start: 0, mid: 1, end: 3
start: 1, mid: 1, end: 1    bingo!

We can optimize the loop like this:

start: 0, mid: 50, end: 100
start: 1, mid: 25, end: 50    bingo!
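
The difference between the two traces can be reproduced with a simplified, synchronous model of the search. This is only a sketch under stated assumptions, not the actual Pulsar implementation: it assumes each loop probes `start`, `mid`, and `end`, and that the optimization amounts to advancing `start` by one once the value at `start` has been ruled out, as the second trace suggests. The class name `FindStartPointSketch`, the `advanceStart` flag, and the `valueAt` lookup are hypothetical names introduced for illustration.

```java
import java.util.function.LongUnaryOperator;

/** Simplified, synchronous model of the three-probe search; not the actual Pulsar code. */
final class FindStartPointSketch {
    static final long NOT_FOUND = -1L;

    /** Iterations used by the most recent call to findStartPoint. */
    static int iterations;

    /**
     * Returns the smallest index in [start, end] whose value is >= target,
     * or NOT_FOUND if every value in the range is smaller than target.
     * Values must be non-decreasing. advanceStart toggles the assumed optimization.
     */
    static long findStartPoint(long target, long start, long end,
                               LongUnaryOperator valueAt, boolean advanceStart) {
        iterations = 0;
        while (start <= end) {
            iterations++;
            long mid = start + (end - start) / 2;
            if (target <= valueAt.applyAsLong(start)) {
                return start;
            } else if (target <= valueAt.applyAsLong(mid)) {
                // start was just ruled out, so the optimized variant skips it.
                if (advanceStart) {
                    start++;
                }
                end = mid;
            } else if (target <= valueAt.applyAsLong(end)) {
                start = mid + 1;
            } else {
                return NOT_FOUND;
            }
        }
        return NOT_FOUND;
    }

    public static void main(String[] args) {
        // Index 0 holds 1 and index 100 holds 101, mirroring the index range in the traces.
        LongUnaryOperator valueAt = index -> index + 1;
        long slow = findStartPoint(2, 0, 100, valueAt, false);
        int slowIterations = iterations;
        long fast = findStartPoint(2, 0, 100, valueAt, true);
        System.out.println("without start + 1: index " + slow + ", iterations " + slowIterations);
        System.out.println("with start + 1: index " + fast + ", iterations " + iterations);
    }
}
```

In this model both variants return the same index; only the number of loops differs, because the unoptimized variant keeps probing a `start` position it has already excluded.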

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository:

Contributor

@codelipenghui codelipenghui left a comment

Great catch!

The fix looks good to me. How can the test cover the new improvement?


// In a sparse ledger, multiple entries share the same data. This is used to construct a complex environment
// to verify that the smallest position with the correct data is found.
private static final TreeMap<Long,Long> ORIGIN_SPARSE_LEDGER = new TreeMap<>();
Contributor

Suggested change
private static final TreeMap<Long,Long> ORIGIN_SPARSE_LEDGER = new TreeMap<>();
private static final TreeMap<Long, Long> ORIGIN_SPARSE_LEDGER = new TreeMap<>();

Contributor Author

Already fixed

ORIGIN_SPARSE_LEDGER.put(10L,1010L);
ORIGIN_SPARSE_LEDGER.put(20L,1020L);
ORIGIN_SPARSE_LEDGER.put(50L,1050L);
ORIGIN_SPARSE_LEDGER.put(Long.MAX_VALUE,Long.MAX_VALUE);
Contributor

Suggested change
ORIGIN_SPARSE_LEDGER.put(Long.MAX_VALUE,Long.MAX_VALUE);
ORIGIN_SPARSE_LEDGER.put(Long.MAX_VALUE, Long.MAX_VALUE);

Contributor Author

Already fixed

private static CacheLoader<Long, MessageIdData> mockCacheLoader(long start, long end, final long targetMessageId,
AtomicLong bingoMarker){
// Mock ledger.
final TreeMap<Long,Long> sparseLedger = new TreeMap<>();
Contributor

Suggested change
final TreeMap<Long,Long> sparseLedger = new TreeMap<>();
final TreeMap<Long, Long> sparseLedger = new TreeMap<>();

Contributor Author

Already fixed

sparseLedger.putAll(ORIGIN_SPARSE_LEDGER.subMap(start, end + 1));
sparseLedger.put(Long.MAX_VALUE,Long.MAX_VALUE);

Function<Long,Long> findMessageIdFunc = entryId -> sparseLedger.ceilingEntry(entryId).getValue();
Contributor

Suggested change
Function<Long,Long> findMessageIdFunc = entryId -> sparseLedger.ceilingEntry(entryId).getValue();
Function<Long, Long> findMessageIdFunc = entryId -> sparseLedger.ceilingEntry(entryId).getValue();

Contributor Author

Already fixed
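
For readers following the test snippets in this thread, the sparse-ledger lookup can be tried in isolation. The sketch below reuses the entries and the `ceilingEntry` lookup shown in the snippets; the class name `SparseLedgerSketch` and the helper methods are hypothetical, and reading the `Long.MAX_VALUE` entry as a sentinel is an assumption rather than something stated in the PR.

```java
import java.util.TreeMap;
import java.util.function.Function;

/** Standalone illustration of the sparse-ledger lookup; names here are hypothetical. */
final class SparseLedgerSketch {
    /** Builds the mapping shown in the test snippets: only a few entry IDs are populated. */
    static TreeMap<Long, Long> sparseLedger() {
        TreeMap<Long, Long> ledger = new TreeMap<>();
        ledger.put(10L, 1010L);
        ledger.put(20L, 1020L);
        ledger.put(50L, 1050L);
        // Assumed to act as a sentinel: with this key present, ceilingEntry never returns null.
        ledger.put(Long.MAX_VALUE, Long.MAX_VALUE);
        return ledger;
    }

    /** Resolves an entry ID to the value of the nearest populated entry at or after it. */
    static Function<Long, Long> lookup(TreeMap<Long, Long> ledger) {
        return entryId -> ledger.ceilingEntry(entryId).getValue();
    }

    public static void main(String[] args) {
        Function<Long, Long> find = lookup(sparseLedger());
        for (long entryId : new long[] {1L, 10L, 11L, 20L, 21L, 50L, 51L}) {
            System.out.println(entryId + " -> " + find.apply(entryId));
        }
    }
}
```

Because every entry ID from 11 through 20 resolves to the same value, a search over this ledger sees runs of identical data, which is what lets the test check that the smallest matching position is returned.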

@codelipenghui codelipenghui added this to the 2.12.0 milestone Oct 10, 2022
@codelipenghui codelipenghui added the type/enhancement and area/broker labels Oct 10, 2022
@poorbarcode poorbarcode force-pushed the improve/CompactedTopic branch from 88939f8 to f49dc91 Compare October 10, 2022 06:19
@poorbarcode
Contributor Author

Hi @codelipenghui

How can the test cover the new improvement?

I have already added the test "testRecursionNumberOfFindStartPointLoop" to cover this case.

* Why should we check the recursion number of "findStartPointLoop", see: #17976
*/
@Test
public void testRecursionNumberOfFindStartPointLoop() throws Exception {
Contributor

Seems that this function won't throw any Exceptions.

Contributor Author

Already removed. Thanks

Contributor

@labuladong labuladong left a comment

Looks great to me.

@poorbarcode
Contributor Author

/pulsarbot rerun-failure-checks

Member

@coderzc coderzc left a comment

LGTM

@codelipenghui codelipenghui changed the title [improve][broker]Make CompactedTopicImpl.findStartPointLoop work more efficiently [improve][broker] Make CompactedTopicImpl.findStartPointLoop work more efficiently Oct 12, 2022
@codelipenghui codelipenghui merged commit 7628fad into apache:master Oct 13, 2022
@poorbarcode poorbarcode deleted the improve/CompactedTopic branch November 4, 2022 07:15
Labels
area/broker, doc-not-needed, type/enhancement
5 participants