Reduce the absolute latency of TP statements #18006

Open
2 tasks
scsldb opened this issue Jun 15, 2020 · 1 comment
Labels

  • feature/accepted: This feature request is accepted by product managers
  • priority/P0: The issue has P0 priority.
  • type/feature-request: Categorizes issue or PR as related to a new feature.

scsldb commented Jun 15, 2020

Feature Request

Description

In some mission-critical scenarios, a single transaction contains dozens of SQL statements, and there is a hard requirement on the maximum latency of such transactions, for example, that each must finish in under 100 ms.

So we should reduce the absolute latency of these CRUD operations.

Category

Performance

Value

Provide high-quality service

Task list

  • P0: Support async commit #18220
    TiDB uses two-phase commit (2PC) to implement distributed transactions. The first phase is prewrite and the second is commit. Currently, a write statement (INSERT/UPDATE/DELETE) must wait for both phases to finish before it returns to the client. With async commit, the write statement returns to the client as soon as the prewrite phase finishes, and additional mechanisms guarantee the atomicity of the distributed transaction. From the client's perspective, the absolute latency of write statements is reduced by nearly half.

  • P0: Support Clustered Index #4841
    Currently, TiDB organizes a table's row data by handle id. For example, consider a table with the following schema:

CREATE TABLE t (
  uid varchar(64),
  s int(10),
  data varchar(255),
  primary key (uid, s)
);

Each row is stored as two key-value pairs:

  • handle id -> (uid, s, data)
  • (uid, s) -> handle id

The handle id is a 64-bit integer generated by TiDB. This data layout has some disadvantages:

  • Every row requires writing two key-value pairs.
  • Reading a row by primary key triggers two read operations: one to fetch the handle id from the primary key, and another to fetch the row data from the handle id. This increases the latency of SELECT statements.

With a clustered index, row data is organized by primary key. In the example above, each row is stored as a single key-value pair, which avoids both disadvantages:

  • (uid, s) -> (data)
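
The difference between the two layouts can be sketched with a toy model. This is only an illustration under stated assumptions, not TiDB's storage code: `CountingKV`, `HandleIdTable`, and `ClusteredTable` are hypothetical names, the store is a plain in-memory dict, and the key prefixes ("row", "pk") are invented for readability. The sketch counts how many key-value writes and reads one INSERT and one primary-key SELECT need in each layout.

```python
class CountingKV:
    """Toy in-memory key-value store that counts reads and writes (hypothetical)."""

    def __init__(self):
        self.data = {}
        self.reads = 0
        self.writes = 0

    def put(self, key, value):
        self.writes += 1
        self.data[key] = value

    def get(self, key):
        self.reads += 1
        return self.data[key]


class HandleIdTable:
    """Layout described first: rows keyed by a generated handle id."""

    def __init__(self, kv):
        self.kv = kv
        self.next_handle = 1  # stand-in for the id generated by the database

    def insert(self, uid, s, data):
        handle = self.next_handle
        self.next_handle += 1
        self.kv.put(("row", handle), (uid, s, data))  # handle id -> (uid, s, data)
        self.kv.put(("pk", uid, s), handle)           # (uid, s) -> handle id

    def select_by_pk(self, uid, s):
        handle = self.kv.get(("pk", uid, s))          # read 1: look up the handle id
        return self.kv.get(("row", handle))[2]        # read 2: fetch the row data


class ClusteredTable:
    """Clustered layout: rows keyed directly by the primary key."""

    def __init__(self, kv):
        self.kv = kv

    def insert(self, uid, s, data):
        self.kv.put(("row", uid, s), data)            # (uid, s) -> (data)

    def select_by_pk(self, uid, s):
        return self.kv.get(("row", uid, s))           # single read


def compare():
    """Return (writes, reads) for one insert plus one primary-key lookup."""
    results = {}
    for name, table_cls in (("handle_id", HandleIdTable), ("clustered", ClusteredTable)):
        kv = CountingKV()
        table = table_cls(kv)
        table.insert("u1", 7, "hello")
        assert table.select_by_pk("u1", 7) == "hello"
        results[name] = (kv.writes, kv.reads)
    return results


if __name__ == "__main__":
    print(compare())  # {'handle_id': (2, 2), 'clustered': (1, 1)}
```

In this model the handle-id layout needs two writes and two reads where the clustered layout needs one of each, which is the saving the task describes.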
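
For the async commit task in the list above, the "nearly half" estimate can be checked with simple arithmetic. The following is a rough latency model, not a description of how TiDB implements async commit: the function names and the millisecond figures are hypothetical, and it assumes that the prewrite and commit phases take roughly the same time and that nothing else contributes to the client-visible latency.

```python
def sync_commit_latency(prewrite_ms, commit_ms):
    """Client-visible latency when the statement waits for both 2PC phases."""
    return prewrite_ms + commit_ms


def async_commit_latency(prewrite_ms, commit_ms):
    """Client-visible latency when the statement returns after prewrite.

    The commit phase still runs, but in the background, so it no longer
    adds to the latency the client observes.
    """
    return prewrite_ms


def latency_saving(prewrite_ms, commit_ms):
    """Fraction of client-visible latency removed by returning after prewrite."""
    before = sync_commit_latency(prewrite_ms, commit_ms)
    after = async_commit_latency(prewrite_ms, commit_ms)
    return (before - after) / before


if __name__ == "__main__":
    # Hypothetical figures: both phases take 5 ms.
    print(sync_commit_latency(5, 5))   # 10
    print(async_commit_latency(5, 5))  # 5
    print(latency_saving(5, 5))        # 0.5
```

If the two phases cost about the same, the saving is one half; in practice other fixed costs would make it somewhat smaller, which matches the "nearly" in the description.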

Workload estimation

500

Time

GanttStart: 2020-06-15
GanttDue: 2020-08-30
GanttProgress: 95%

scsldb added the type/feature-request and priority/P0 labels Jun 15, 2020
scsldb added this to the v5.0-alpha milestone Jun 15, 2020
@zhangjinpeng87
Contributor

/assign

scsldb modified the milestones: v5.0.0-alpha, v5.0.0-beta.1 Jul 15, 2020
scsldb added the feature/accepted label Jul 16, 2020
jebter modified the milestones: v5.0.0-alpha, v5.0.0-rc Jan 7, 2021
jebter modified the milestones: v5.0.0-rc, v5.0.0 Jan 18, 2021