Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix] [txn] [PIP-160] Avoid timing task burdens the ledger thread and leads to an avalanche #16679

Merged
merged 8 commits into from
Jul 20, 2022

Conversation

poorbarcode
Copy link
Contributor

@poorbarcode poorbarcode commented Jul 19, 2022

Master Issue: #15370

Motivation

see #15370

Modifications

Managed Ledger I/O thread gets busier and busier.

In origin implementation, TxnLogBufferedWriter has two thread pools:

  • ExecutorService singleThreadExecutorForWrite is A single-threaded actuator, used to perform Managed Ledger I/O operations. Includes the following tasks:
    • internalAsyncAddData Each execution of the asyncAddData adds a task.
    • doFlush The execution of trigFlush sometimes add a task, and there is also a scheduled task that adds tasks.
  • ScheduledExecutorService scheduledExecutorService is used to periodically add doFlush tasks to the singleThreadExecutorForWrite, whether the singleThreadExecutorForWrite is idle or not.

If singleThreadExecutorForWrite is busy for a period of time, scheduledExecutorService will keep adding doFlush tasks during that period. Then singleThreadExecutorForWrite will be busy with the newly added doFlush tasks and allocate fewer resources to internalAsyncAddData while scheduledExecutorService is adding doFlush tasks, which will cause singleThreadExecutorForWrite to accumulate more and more tasks, even if that the doFlush task appended by scheduledExecutorService possible only triggers the check and does nothing else. The net result is that singleThreadExecutorForWrite gets busier and busier.

Imprecise timing task

If new data is added immediately after a scheduled task is executed, the data cannot be flushed in the next scheduled task. Aka. we set the max delay time to 1 min, the scheduled tasks and new data tasks are executed in the following sequence:

1. scheduled task running at 12:00:00.000
2. scheduled task running at 12:00:01.000
3. add data at 12:00:01.005
4. scheduled task running at 12:00:02.000

In the step-4, the flush task will not flush the data to the bookie, because the result of expression System.currentTimeMillis() - firstAsyncAddArgs.addedTime >= batchedWriteMaxDelayInMillis is false, this data will actually flushed at next scheduled task 12:00:03.000

Changes:

Managed Ledger I/O thread gets busier and busier.

Since all C tasks are executed in the same thread, we can change that "after a scheduled task is executed, then add the next one". This reduces the density of doFlush trig by timer task execution, thereby somewhat losing timing accuracy (the original implementation was not completely accurate either).

This change can only avoid the task accumulation in TxnLogBufferedWriter caused by too many scheduled tasks, but cannot improve the write performance of Managed Ledger.

Imprecise timing task

Flush triggered by the scheduled task no longer determines whether the time of the first node reaches the condition. To avoid imprecise scheduled task execution time, the maximum delay time is still checked and flushed in the Flush task triggered by the asyncAddData.

Documentation

  • doc-required

  • doc-not-needed

  • doc

  • doc-complete

@github-actions github-actions bot added the doc-required Your PR changes impact docs and you will update later. label Jul 19, 2022
Comment on lines 161 to 162
scheduledFuture = scheduledExecutorService.schedule(() -> trigFlush(false, true),
batchedWriteMaxDelayInMillis, TimeUnit.MICROSECONDS);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
scheduledFuture = scheduledExecutorService.schedule(() -> trigFlush(false, true),
batchedWriteMaxDelayInMillis, TimeUnit.MICROSECONDS);
scheduledFuture = scheduledExecutorService.schedule(() -> trigFlush(false, true),
batchedWriteMaxDelayInMillis, TimeUnit.MILLISECONDS);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already fixed.

}
}).when(managedLedger).asyncAddEntry(Mockito.any(ByteBuf.class), Mockito.any(), Mockito.any());
// Start tests.
TxnLogBufferedWriter txnLogBufferedWriter = new TxnLogBufferedWriter<>(managedLedger, orderedExecutor,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Close the write after test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already fixed.

@codelipenghui codelipenghui added this to the 2.11.0 milestone Jul 19, 2022
@codelipenghui codelipenghui added area/transaction type/bug The PR fixed a bug or issue reported a bug labels Jul 19, 2022
@asafm
Copy link
Contributor

asafm commented Jul 19, 2022

I can't understand this PR.
You have an executor service adding a task to the queue every interval.
You said if the tasks are executed longer that interval, the queue starts building up until it explodes?

I'm not sure I understand the nature of the context in this PR - would love a good brief on that - but could it be that executeWithFixedDelay will solve it, since it never starts the next task until the previous one is finished?

@poorbarcode
Copy link
Contributor Author

poorbarcode commented Jul 19, 2022

Hi @asafm

You said if the tasks are executed longer that interval, the queue starts building up until it explodes?

Yes, I added more descriptions in Motivation.

I'm not sure I understand the nature of the context in this PR - would love a good brief on that - but could it be that executeWithFixedDelay will solve it, since it never starts the next task until the previous one is finished?

This change can only avoid the task accumulation in TxnLogBufferedWriter caused by too many scheduled tasks but cannot improve the write performance of Managed Ledger.

I added more descriptions in Motivation.

Thanks.

Copy link
Contributor

@codelipenghui codelipenghui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check the failed test

org.apache.pulsar.transaction.coordinator.impl.TxnLogBufferedWriterTest ► testFlushThresholds

Failed test found in:
TEST-org.apache.pulsar.transaction.coordinator.impl.TxnLogBufferedWriterTest.xml
Error:
java.lang.AssertionError: expected [100] but found [101]

Copy link
Contributor

@codelipenghui codelipenghui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After checking more details, we missed a part. If we flushed a batch, the next schedule should start by the last flush time + batch delay.

@poorbarcode
Copy link
Contributor Author

After checking more details, we missed a part. If we flushed a batch, the next schedule should start by the last flush time + batch delay.

After flush a batch, the task container is empty, so there is no need to check whether the first task needs to be executed earlier than batchedWriteMaxDelayInMillis

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@poorbarcode
Copy link
Contributor Author

Please check the failed test
org.apache.pulsar.transaction.coordinator.impl.TxnLogBufferedWriterTest ► testFlushThresholds

The flaky Test has been fixed. but the scheduled task timing is not as precise as I remember, I will look at it later

@poorbarcode
Copy link
Contributor Author

/pulsarbot run-failure-checks

@poorbarcode
Copy link
Contributor Author

/pulsarbot run-failure-checks

@congbobo184 congbobo184 added doc-not-needed Your PR changes do not impact docs and removed doc-required Your PR changes impact docs and you will update later. labels Jul 20, 2022
@congbobo184 congbobo184 merged commit 962e445 into apache:master Jul 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/transaction doc-not-needed Your PR changes do not impact docs type/bug The PR fixed a bug or issue reported a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants