privilege, domain: reduce the memory jitter of privilege reload activity for 2M users #59487
/approve
for domain part
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: CbcWestwolf, lance6716, lcwangchao.
What problem does this PR solve?
Issue Number: close #59403, ref #55563
Problem Summary:
I created 2M users and made a fraction of them (for example, 10% or 50%) active, i.e. loaded in memory.
Even after the workload is gone, the tidb-server memory usage still jitters periodically.
For example, this one: [screenshot: tidb-server memory usage]
What changed and how does it work?
There are several changes.
`loadAll()` is used when the active user count is > 1024, and that is the direct root cause of the jitter. That's because `loadSomeUsers()` does not support too many filter conditions. The SQL "select * from user where user = 'a' or user = 'b' or user = 'c' or ..." works poorly when there are too many OR conditions. This is a known issue, #43885: the code handles the conditions recursively, which causes a stack overflow.
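To illustrate why a long OR chain is a problem for recursive code, here is a minimal sketch. It assumes the chain is parsed as a left-deep binary tree, which is the usual shape for a left-associative operator; the `Expr` type and both functions are hypothetical and are not TiDB's AST or planner code. The point is only that the recursion depth grows linearly with the number of conditions.

```go
package main

import "fmt"

// Expr is a hypothetical, minimal expression node (not TiDB's AST type).
type Expr struct {
	Op          string // "or" for inner nodes, "eq" for leaves
	Left, Right *Expr
	User        string // only set on "eq" leaves
}

// BuildOrChain builds user = u1 OR user = u2 OR ... as a left-deep tree.
func BuildOrChain(users []string) *Expr {
	if len(users) == 0 {
		return nil
	}
	root := &Expr{Op: "eq", User: users[0]}
	for _, u := range users[1:] {
		root = &Expr{Op: "or", Left: root, Right: &Expr{Op: "eq", User: u}}
	}
	return root
}

// Depth walks the tree recursively, using one stack frame per nesting level.
func Depth(e *Expr) int {
	if e == nil {
		return 0
	}
	l, r := Depth(e.Left), Depth(e.Right)
	if l > r {
		return l + 1
	}
	return r + 1
}

func main() {
	for _, n := range []int{4, 1024, 100000} {
		users := make([]string, n)
		for i := range users {
			users[i] = fmt.Sprintf("u%d", i)
		}
		fmt.Printf("%d conditions -> recursion depth %d\n", n, Depth(BuildOrChain(users)))
	}
}
```

Any recursive pass over such a tree needs as many nested calls as there are conditions, so a filter with 2M user names is not practical in this form.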
So the first change is to enhance `loadSomeUsers()` to support an unlimited user count. It works like this:
The `SQLExecutor.ExecuteInternal()` streaming API is used to replace the `RestrictedSQLExec.ExecRestrictedSQL()` API. The problem with `ExecRestrictedSQL` is that its design does not fit here: its `drainRecordSet` returns a `[]chunk.Row` as the result, and here that can be a huge array of 2M rows. What we need is a streaming API that applies the filter condition while reading, rather than taking the whole data set and filtering it afterwards.
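The difference between the two styles can be sketched roughly as follows. This is not the TiDB API: `Row`, `RowSource`, `DrainThenFilter` and `StreamFilter` are hypothetical names made up for the illustration, and the source generates rows on the fly instead of reading them from a table. Each function also reports how many rows it had to keep in an intermediate buffer at once.

```go
package main

import "fmt"

// Row is a hypothetical stand-in for one row of the user table.
type Row struct{ User string }

// RowSource is a hypothetical streaming result set that yields rows in batches.
type RowSource struct {
	total, pos, batchSize int
}

// Next refills buf with the next batch; an empty result means end of data.
func (s *RowSource) Next(buf []Row) []Row {
	buf = buf[:0]
	for len(buf) < s.batchSize && s.pos < s.total {
		buf = append(buf, Row{User: fmt.Sprintf("u%d", s.pos)})
		s.pos++
	}
	return buf
}

// DrainThenFilter materializes every row first and filters afterwards.
func DrainThenFilter(src *RowSource, active map[string]struct{}) (matched []Row, buffered int) {
	var all []Row
	buf := make([]Row, 0, src.batchSize)
	for buf = src.Next(buf); len(buf) > 0; buf = src.Next(buf) {
		all = append(all, buf...)
	}
	for _, r := range all {
		if _, ok := active[r.User]; ok {
			matched = append(matched, r)
		}
	}
	return matched, len(all)
}

// StreamFilter filters each batch as it arrives and reuses one batch buffer.
func StreamFilter(src *RowSource, active map[string]struct{}) (matched []Row, buffered int) {
	buf := make([]Row, 0, src.batchSize)
	for buf = src.Next(buf); len(buf) > 0; buf = src.Next(buf) {
		if len(buf) > buffered {
			buffered = len(buf)
		}
		for _, r := range buf {
			if _, ok := active[r.User]; ok {
				matched = append(matched, r)
			}
		}
	}
	return matched, buffered
}

func main() {
	active := map[string]struct{}{"u1": {}, "u5000": {}}
	m1, b1 := DrainThenFilter(&RowSource{total: 10000, batchSize: 1024}, active)
	m2, b2 := StreamFilter(&RowSource{total: 10000, batchSize: 1024}, active)
	fmt.Println("drain: ", len(m1), "matched,", b1, "rows buffered")
	fmt.Println("stream:", len(m2), "matched,", b2, "rows buffered")
}
```

Both variants find the same users, but the streaming one never holds more than one batch plus the matches, which is why it suits a table with 2M rows where only some users are needed.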
I suspect there is a leak like #59403: the decode function may be using a shallow copy that still references the chunk data, so the whole chunk cannot be freed.
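The general Go mechanism behind this kind of suspicion can be shown with a small sketch. It is not the actual decode function from this PR; `Chunk`, `DecodeShallow`, `DecodeDeep` and `SharesMemory` are hypothetical names. A sub-slice shares the backing array of the buffer it was cut from, so keeping a few bytes alive keeps the whole buffer reachable, while a copy does not.

```go
package main

import "fmt"

// Chunk is a hypothetical stand-in for a block of raw row data.
type Chunk struct{ data []byte }

// DecodeShallow returns a sub-slice that shares the chunk's backing array.
func DecodeShallow(c *Chunk, off, n int) []byte {
	return c.data[off : off+n]
}

// DecodeDeep copies the bytes, so the result does not reference the chunk.
func DecodeDeep(c *Chunk, off, n int) []byte {
	return append([]byte(nil), c.data[off:off+n]...)
}

// SharesMemory reports whether v starts at the same address as c.data[off].
func SharesMemory(c *Chunk, off int, v []byte) bool {
	return len(v) > 0 && &v[0] == &c.data[off]
}

func main() {
	c := &Chunk{data: make([]byte, 1<<20)} // 1 MiB chunk
	copy(c.data, "root")

	shallow := DecodeShallow(c, 0, 4)
	deep := DecodeDeep(c, 0, 4)

	// The shallow value still points into the 1 MiB array; the deep one does not.
	fmt.Println("shallow shares chunk memory:", SharesMemory(c, 0, shallow), "cap:", cap(shallow))
	fmt.Println("deep shares chunk memory:   ", SharesMemory(c, 0, deep))
}
```

If decoded values are cached in this shallow form, the garbage collector cannot free the chunk as long as any one of them is alive.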
Check List
Tests
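The memory usage now: [screenshot: tidb-server memory usage after the change]
The privilege reload runs every 10 minutes, and the peak memory usage is much lower than before: [screenshot: tidb-server memory usage across reload cycles]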
Side effects
Documentation
Release note