[broker] Cursor status has always been SwitchingLedger and pendingMarkDeleteOps has accumulated tens of thousands of requests #16859
Could you please provide the version you are using? If possible, upload the dump file (redacting anything that may leak company data). Could you also show the property …
@poorbarcode My version is 2.9.2. The dump file is very large, and it is difficult to upload it from the intranet for security reasons. The values of the ManagedCursorMXBean MBean are as follows:
[screenshot of ManagedCursorMXBean values]
I now suspect that something is wrong with the …
Thanks, then it is not …
@poorbarcode
Then the disk space increases, and some partitions cannot be reclaimed.
Could you show more details from the error log?
@poorbarcode I found thousands of tasks in the "BookKeeperClientWorker-OrderedExecutor-59-%d" thread group, and I don't know what caused it. SafeRunnable should catch the exception, but I don't know why this happens.
@hangc0276 Is it normal to have so many tasks in the queue?
@poorbarcode …
@poorbarcode The stacks of the three threads in this thread group are as follows:
[thread stack screenshot]
It seems that one topic is busy while the others are not, but this does not seem to be related to the …
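For context on why one thread can back up while the rest of the pool stays idle: the thread name above comes from BookKeeper's OrderedExecutor, which hashes each ordering key (typically a ledger or topic) to a single thread, so one stuck task backlogs everything behind the same key. Here is a minimal sketch assuming the `org.apache.bookkeeper.common.util.OrderedExecutor` API; the keys and counts are illustrative, not from this issue:

```java
import static org.apache.bookkeeper.common.util.SafeRunnable.safeRun;

import java.util.concurrent.CountDownLatch;
import org.apache.bookkeeper.common.util.OrderedExecutor;

public class OrderedBacklogSketch {
    public static void main(String[] args) throws Exception {
        OrderedExecutor executor = OrderedExecutor.newBuilder()
                .name("BookKeeperClientWorker")
                .numThreads(4)
                .build();

        CountDownLatch blocked = new CountDownLatch(1);

        // A task for one ledger that never completes (e.g. a write stuck
        // on an unresponsive bookie). It occupies the single thread that
        // this ordering key hashes to.
        executor.executeOrdered("ledger-A", safeRun(() -> {
            try {
                blocked.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }));

        // Everything else submitted with the same key piles up behind it,
        // mirroring the thousands of queued tasks seen on one thread.
        for (int i = 0; i < 10_000; i++) {
            executor.executeOrdered("ledger-A", safeRun(() -> { }));
        }

        // A different key hashes to another thread and still runs promptly.
        executor.executeOrdered("ledger-B", safeRun(() ->
                System.out.println("ledger-B ran on " + Thread.currentThread().getName())));

        Thread.sleep(1000);
        blocked.countDown();
        executor.shutdown();
    }
}
```

This would explain the observation that one OrderedExecutor thread is busy with a huge queue while the other threads in the same pool are idle.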
@poorbarcode I suspect that when the cursor is in SwitchingLedger, some tasks are blocked in the queue, which makes it impossible for them to complete.
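For background on why mark-deletes pile up in this state: the managed cursor parks mark-delete requests while it is switching its metadata ledger and only drains them once the switch completes. The following is a simplified, hypothetical model of that pattern, not the actual ManagedCursorImpl code; all names besides pendingMarkDeleteOps and SwitchingLedger are illustrative:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified sketch of a cursor that parks mark-delete requests while
// its metadata ledger is being switched. If the switch callback never
// fires, the pending queue grows without bound -- the symptom in this
// issue.
class CursorSketch {
    enum State { Open, SwitchingLedger }

    private State state = State.Open;
    private final Deque<Long> pendingMarkDeleteOps = new ArrayDeque<>();

    synchronized void asyncMarkDelete(long position) {
        if (state == State.SwitchingLedger) {
            // Parked until the ledger switch completes.
            pendingMarkDeleteOps.add(position);
            return;
        }
        persistMarkDelete(position);
    }

    synchronized void startLedgerSwitch() {
        state = State.SwitchingLedger;
        // ... asynchronously create a new metadata ledger, then call
        // ledgerSwitchCompleted(). If that callback is lost (e.g. stuck
        // behind a backlogged executor thread), the state never leaves
        // SwitchingLedger.
    }

    synchronized void ledgerSwitchCompleted() {
        state = State.Open;
        // Drain everything that accumulated during the switch.
        while (!pendingMarkDeleteOps.isEmpty()) {
            persistMarkDelete(pendingMarkDeleteOps.poll());
        }
    }

    private void persistMarkDelete(long position) {
        // Write the mark-delete position to the cursor ledger.
    }
}
```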
Could you provide the BK configuration?
E.g. with Ensemble size = 3 and Write quorum size = 3, when any bookie server goes down, the first write request will time out and the other requests will back up.
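To make that scenario concrete, here is a hedged sketch using the BookKeeper client API; the ZooKeeper address, password, and quorum values are placeholders chosen to match the example above, not values from this issue:

```java
import org.apache.bookkeeper.client.BookKeeper;
import org.apache.bookkeeper.client.BookKeeper.DigestType;
import org.apache.bookkeeper.client.LedgerHandle;

public class QuorumExample {
    public static void main(String[] args) throws Exception {
        // Placeholder ZooKeeper connection string.
        BookKeeper bk = new BookKeeper("zk1:2181");

        // Ensemble = 3, write quorum = 3, ack quorum = 3: every entry is
        // written to all three bookies and must be acknowledged by all
        // three, so a single unresponsive bookie stalls the first write
        // until it times out, while later writes queue up behind it.
        LedgerHandle lh = bk.createLedger(
                3 /* ensemble */, 3 /* write quorum */, 3 /* ack quorum */,
                DigestType.CRC32, "password".getBytes());

        lh.addEntry("hello".getBytes());

        lh.close();
        bk.close();
    }
}
```

Lowering the ack quorum below the write quorum would let writes complete with one bookie down, at the cost of weaker durability on acknowledgment.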
If a bookie is not working, shouldn't all thread pools be backlogged, not just that one thread pool?
The issue had no activity for 30 days, mark with Stale label. |
@poorbarcode There is a new development on this question, but I'm not sure if it's the same problem: …
The issue had no activity for 30 days, mark with Stale label. |
This issue might be fixed by #17971 |
Describe the bug
The pendingMarkDeleteOps queue has accumulated tens of thousands of requests, the cursor status has always been SwitchingLedger, and retention cannot be executed, so the disk space cannot be reclaimed.

Screenshots

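For anyone checking whether they are affected: the cursor state is exposed in the topic's internal stats (`pulsar-admin topics stats-internal <topic>` prints a `state` field per cursor). Below is a minimal sketch using the Java admin client; the service URL and topic name are placeholders:

```java
import org.apache.pulsar.client.admin.PulsarAdmin;
import org.apache.pulsar.common.policies.data.PersistentTopicInternalStats;

public class CheckCursorState {
    public static void main(String[] args) throws Exception {
        // Placeholder admin service URL.
        PulsarAdmin admin = PulsarAdmin.builder()
                .serviceHttpUrl("http://localhost:8080")
                .build();

        // Placeholder topic name.
        PersistentTopicInternalStats stats =
                admin.topics().getInternalStats("persistent://tenant/ns/topic");

        // A healthy cursor normally reports "Open"; a cursor stuck in
        // "SwitchingLedger" matches the symptom reported in this issue.
        stats.cursors.forEach((name, cursor) ->
                System.out.println(name + " -> state=" + cursor.state
                        + ", markDeletePosition=" + cursor.markDeletePosition));

        admin.close();
    }
}
```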