Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: add global system variable tmp_table_size #24827

Merged
merged 37 commits into from
Jun 9, 2021

Conversation

tiancaiamao
Copy link
Contributor

@tiancaiamao tiancaiamao commented May 21, 2021

What problem does this PR solve?

Problem Summary:

Use system variable tmp_table_size is used to control the temporary table size threshold.

What is changed and how it works?

What's Changed:

  • add global + session system variable tmp_table_size
  • check the temporary table do not exceed this threshold

How it Works:

We can't use the txn.Size() directly, because the data may not located in the temporary table.
So I use the txn size before & after the AddRecord() operation to calculate the temporary table size.

Check List

Tests

  • Unit test

Need to Update document too

Release note

  • No release note

@tiancaiamao tiancaiamao requested a review from djshow832 May 21, 2021 08:52
@ti-chi-bot ti-chi-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 21, 2021
@tiancaiamao tiancaiamao mentioned this pull request May 21, 2021
89 tasks
@tiancaiamao
Copy link
Contributor Author

/rebuild

@github-actions github-actions bot added the sig/sql-infra SIG: SQL Infra label May 21, 2021
@@ -1501,6 +1501,12 @@ var defaultSysVars = []*SysVar{
{Scope: ScopeGlobal, Name: TiDBGCLifetime, Value: "10m0s", Type: TypeDuration, MinValue: int64(time.Minute * 10), MaxValue: math.MaxInt64},
{Scope: ScopeGlobal, Name: TiDBGCConcurrency, Value: "-1", Type: TypeInt, MinValue: 1, MaxValue: 128, AllowAutoValue: true},
{Scope: ScopeGlobal, Name: TiDBGCScanLockMode, Value: "PHYSICAL", Type: TypeEnum, PossibleValues: []string{"PHYSICAL", "LEGACY"}},

// See https://dev.mysql.com/doc/refman/8.0/en/server-system-variables.html#sysvar_temptable_max_ram
{Scope: ScopeGlobal, Name: TempTableMaxRAM, Value: strconv.Itoa(DefTempTableMaxRAM), Type: TypeInt, MinValue: 2097152, MaxValue: math.MaxInt64, SetGlobal: func(s *SessionVars, val string) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For MySQL, the max value is MaxUint64.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's okay to not follow strictly, but there is an issue in that the internal value is a signed int32, but the sysvar allows up to signed int64. The two types should align.

Signed int32 only allows up to 2GB temporary tables, which is "okay", but there might be some cases where it needs to be larger.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is still set to have a maximum value of math.MaxInt64, but the internal value is an int32.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to be conservative.
We can update the value to be larger anytime as long as we'd like to, but if we set it's initial value huge and turn down the value in a future release, there will be a compatibility issue.
And now I take @morgo 's advise and set it to math.MaxInt32

@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 24, 2021
@djshow832
Copy link
Contributor

Please resolve conflicts, rest LGTM. @tiancaiamao

@ti-chi-bot ti-chi-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 25, 2021
@djshow832
Copy link
Contributor

/lgtm

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label May 25, 2021
table/tables/tables.go Outdated Show resolved Hide resolved
table/tables/tables.go Show resolved Hide resolved
@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 31, 2021
@tisonkun
Copy link
Contributor

tisonkun commented Jun 1, 2021

Please refer the pull request to its issue #24169 and it would be better if we create a dedicated sub-issue for this pr.

@djshow832
Copy link
Contributor

@tiancaiamao Please create a sub-issue and resolve conflicts, and then ask @morgo and @wjhuang2016 to review it again.

@ti-chi-bot ti-chi-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 2, 2021
@tiancaiamao tiancaiamao changed the title *: add global system variable temptable_max_ram *: add global system variable tidb_temptable_max_ram Jun 3, 2021
@tiancaiamao
Copy link
Contributor Author

Since in MySQL8 temptable_max_ram is deprecated, we introduce another TiDB specified variable tidb_temptable_max_ram

PTAL @djshow832 @wjhuang2016 @morgo

Copy link
Member

@wjhuang2016 wjhuang2016 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

restLGTM

table/tables/tables.go Outdated Show resolved Hide resolved
@morgo
Copy link
Contributor

morgo commented Jun 3, 2021

@tiancaiamao As discussed. There are two settings which apply in MySQL 5.7 here:

They are used together, so the smaller limit is what applies to in-memory temporary tables. These limits are enforced per-temporary table, which is the same semantic as what is used in this PR. The buffer for temptable_max_ram in MySQL 8.0 is a global buffer pool shared by all connections (as an administrator I prefer this behavior, as it's easier to plan provisioning of space for temporary queries). I think it's best to avoid this naming convention to prevent confusion.

Where tmp_table_size and max_heap_table_size differ from this implementation is that TiDB currently has no spill to disk mechanism. I actually think that is okay. If you create a temporary table in MySQL and you explicitly say the engine=MEMORY, it will also not spill to disk when the threshold is reached. You will instead get a table-is-full error.

The bigger difference is that tmp_table_size and max_heap_table_size apply to both implicit and explicit temporary tables, whereas in TiDB this setting will only apply to explicit temporary tables. There are existing mechanisms for implicity temporary tables which remain unchanged.

@tiancaiamao
Copy link
Contributor Author

@tiancaiamao As discussed. There are two settings which apply in MySQL 5.7 here:

* [tmp_table_size](https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_tmp_table_size)

* [max_heap_table_size](https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_max_heap_table_size)

They are used together, so the smaller limit is what applies to in-memory temporary tables. These limits are enforced per-temporary table, which is the same semantic as what is used in this PR. The buffer for temptable_max_ram in MySQL 8.0 is a global buffer pool shared by all connections (as an administrator I prefer this behavior, as it's easier to plan provisioning of space for temporary queries). I think it's best to avoid this naming convention to prevent confusion.

Where tmp_table_size and max_heap_table_size differ from this implementation is that TiDB currently has no spill to disk mechanism. I actually think that is okay. If you create a temporary table in MySQL and you explicitly say the engine=MEMORY, it will also not spill to disk when the threshold is reached. You will instead get a table-is-full error.

The bigger difference is that tmp_table_size and max_heap_table_size apply to both implicit and explicit temporary tables, whereas in TiDB this setting will only apply to explicit temporary tables. There are existing mechanisms for implicity temporary tables which remain unchanged.

So complicated ...

max_heap_table_size is a limitation for user-created MEMORY table, so it works on create table ... () engine = memory?
but it also works on internal in-memory tables, that is to say -- temporary table?

tmp_table_size works on the temporary tables, but its semantic is the spill to disk threshold...

@tiancaiamao
Copy link
Contributor Author

I'll change to use the variable to tmp_table_size.
If we implement the spill to disk feature for the temporary table in development of stage 3,
we can modify it to be totally compatible with MySQL.

For now, just raise an error when data size excess tmp_table_size.
And the scope of tmp_table_size is global + session, just as MySQL.

@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 3, 2021
@bb7133
Copy link
Member

bb7133 commented Jun 8, 2021

/merge cancel

@ti-chi-bot ti-chi-bot removed the status/can-merge Indicates a PR has been approved by a committer. label Jun 8, 2021
Copy link
Member

@wjhuang2016 wjhuang2016 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix the CI

@tiancaiamao tiancaiamao requested a review from a team as a code owner June 9, 2021 09:09
@tiancaiamao tiancaiamao requested review from XuHuaiyu and removed request for a team June 9, 2021 09:09
@@ -896,9 +896,9 @@ func (s *testSuite5) TestValidateSetVar(c *C) {
result = tk.MustQuery("select @@tmp_table_size;")
result.Check(testkit.Rows("167772161"))

tk.MustExec("set @@tmp_table_size=18446744073709551615")
tk.MustExec("set @@tmp_table_size=9223372036854775807")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is because MySQL set the value to Uint64, while in TiDB it's set to Int64

@tiancaiamao
Copy link
Contributor Author

Please fix the CI

Done, please take a look @wjhuang2016 @djshow832

@wjhuang2016
Copy link
Member

/merge

@ti-chi-bot
Copy link
Member

This pull request has been accepted and is ready to merge.

Commit hash: a6c14f7

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Jun 9, 2021
@ti-chi-bot ti-chi-bot removed the status/can-merge Indicates a PR has been approved by a committer. label Jun 9, 2021
@github-actions github-actions bot added the sig/execution SIG execution label Jun 9, 2021
@tiancaiamao
Copy link
Contributor Author

/merge

@ti-chi-bot
Copy link
Member

This pull request has been accepted and is ready to merge.

Commit hash: 754ded5

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Jun 9, 2021
@ti-chi-bot
Copy link
Member

@tiancaiamao: Your PR was out of date, I have automatically updated it for you.

At the same time I will also trigger all tests for you:

/run-all-tests

If the CI test fails, you just re-trigger the test that failed and the bot will merge the PR for you after the CI passes.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot ti-chi-bot merged commit cbb6f4e into pingcap:master Jun 9, 2021
@tiancaiamao tiancaiamao deleted the temptable-max-ram branch June 18, 2021 14:04
tiancaiamao added a commit to tiancaiamao/tidb that referenced this pull request Jun 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/execution SIG execution sig/sql-infra SIG: SQL Infra size/L Denotes a PR that changes 100-499 lines, ignoring generated files. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants