sql: foreign key batch checking per statement #26786

emsal0 · 2018-06-18T14:29:42Z

Currently, foreign key checks are run for every row being inserted. We could optimize this by running the checks at the end of each statement, batching together all of the checks that correspond to a single statement.

@BramGruneir @knz @jordanlewis

"if you insert 10 rows in a single statement, 8 of which have the same value for a column that has a foreign key relationship to another table, you shouldn’t have to check that relationship 8 times"

- Write benchmarks that measure the performance of foreign key checks for multi-row statements and for other statements that involve foreign key checks.
- The code change itself: change how some functions in fk.go perform the foreign key checks, and possibly how those functions are called.
- Confirm that the code change actually improves performance on the benchmarks.
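
To make the quoted example concrete, here is a minimal Go sketch of collapsing duplicate checks within one statement. It is only an illustration under stated assumptions: `fkKey`, `statementFKChecks`, and the `exists` callback are hypothetical names, not the actual functions in fk.go, and the lookup is a stand-in for a real read of the referenced table.

```go
package main

import "fmt"

// fkKey identifies one distinct lookup: the constraint being checked and the
// referenced value that must exist. Hypothetical type, for illustration only.
type fkKey struct {
	constraint string
	value      string
}

// statementFKChecks accumulates the distinct checks needed by one statement.
type statementFKChecks struct {
	seen  map[fkKey]struct{}
	order []fkKey // distinct keys in first-seen order
}

func newStatementFKChecks() *statementFKChecks {
	return &statementFKChecks{seen: make(map[fkKey]struct{})}
}

// add records a check for one row; repeated (constraint, value) pairs are ignored.
func (c *statementFKChecks) add(constraint, value string) {
	k := fkKey{constraint: constraint, value: value}
	if _, ok := c.seen[k]; ok {
		return
	}
	c.seen[k] = struct{}{}
	c.order = append(c.order, k)
}

// run performs each distinct check once and fails on the first missing value.
// exists is a placeholder for the real lookup against the referenced table.
func (c *statementFKChecks) run(exists func(fkKey) bool) error {
	for _, k := range c.order {
		if !exists(k) {
			return fmt.Errorf("foreign key violation: %s references missing value %q", k.constraint, k.value)
		}
	}
	return nil
}

func main() {
	// Ten inserted rows, eight of which share the same referenced value.
	rows := []string{"c1", "c1", "c1", "c1", "c2", "c1", "c1", "c3", "c1", "c1"}
	referenced := map[string]bool{"c1": true, "c2": true, "c3": true}

	checks := newStatementFKChecks()
	for _, v := range rows {
		checks.add("fk_customer", v)
	}

	lookups := 0
	err := checks.run(func(k fkKey) bool {
		lookups++
		return referenced[k.value]
	})
	fmt.Println("rows:", len(rows), "lookups:", lookups, "err:", err)
}
```

With these inputs the ten rows need three lookups instead of ten. Note that the set of distinct keys here still grows with the number of distinct values in the statement.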


jordanlewis · 2018-06-18T16:44:45Z

See #15157 if you haven't already, which has some prior discussion about this issue as well as some potential other strategies for dealing with it.

BramGruneir · 2018-06-18T17:07:16Z

One thing to note: when batching per statement, make sure it remains possible to move to batching per transaction, or even to revert to per-row checks. That way, deferrable constraints can be implemented afterwards.

tbg · 2018-06-18T18:58:10Z

Oh, I missed this issue when creating #26795. Feel free to incorporate what you like into this one and close it.

knz · 2018-06-18T19:46:46Z

Please be mindful in these discussions that there may be too much work to defer: the accumulated work may not fit in RAM. That's why we have maximum batch sizes in many places. As a matter of methodology, any proposal should model memory usage as a function of the number of rows processed. There should be an upper bound that is independent of the number of rows.

(There is an exception which we may want to support later, but **not in the general case and certainly not to enable optimizations**: the case where the client *mandates* deferred key checks in SQL. For those cases we must do clever buffering of the work, using a mix of RAM and disk storage so that RAM usage is properly bounded at all times. But beware that supporting *mandatory* deferred checks is a fundamentally different project, so don't let this avenue of thought distract you or send you off on a tangent.)

knz · 2018-06-18T19:53:32Z

Bram, my previous response was directed to you. I think:

1) it's not a good idea to explore inter-statement optimizations at this time (it's more complex, and we must ensure the next step is tractable);
2) it's semantically incorrect to mix deferred FK checks mandated by the SQL client with optimization-driven batching of FK checks per statement, which requires (in the SQL language) non-deferred semantics.

The proof of why this is true is left as an exercise to the reader, but I can provide some guidance upon request.

BramGruneir · 2018-06-18T20:40:08Z

@knz, I agree.

And of course we need to be cognizant of the missing functionality. Regardless, we need to ensure that we never run out of memory.

BramGruneir · 2018-09-11T18:33:43Z

The work on this did not make it into 2.1, but should be done in the 2.2 time frame.

nvanbenschoten · 2019-03-29T05:07:17Z

I just spent some time looking at traces from `new_order` transactions in TPC-C. On a cluster with a low amount of load, the transaction takes around `31ms`. The traces revealed that, on average, about `9ms` of the transaction is spent performing redundant foreign key lookups that could be eliminated if we collapsed foreign key checks across rows in the same statement. Put another way, 9 of the 23 kv batches issued by the transaction were superfluous and could be completely avoided by addressing this issue. That's an estimated 27% savings on the most common txn in TPC-C.

knz · 2019-03-29T07:08:36Z

Please send this to Jordan and Andy to further motivate prioritization in 19.2.

knz · 2019-04-01T16:20:48Z

cc @awoods187, please refer to Nathan's comment above.

awoods187 · 2019-04-01T16:48:43Z

It's on the roadmap

knz · 2019-04-01T17:16:18Z

@awoods187 I was thinking more about copy-pasting his explanation into Airtable, given that the initial motivation was one of correctness and use cases, and here we have a new argument that's performance-oriented.

jordanlewis · 2019-07-31T03:17:37Z

This will get done as part of the optimizer's work on foreign key planning, so I'm moving it to the planning project for tracking.

RaduBerinde · 2019-12-21T19:55:39Z

Opt-driven FK checks are now enabled on master.

emsal0 self-assigned this Jun 18, 2018

knz added C-performance Perf of queries or internals. Solution not expected to change functional behavior. A-sql-mutations Mutation statements: UPDATE/INSERT/UPSERT/DELETE. labels Jul 9, 2018

knz added this to the 2.1 milestone Jul 23, 2018

BramGruneir assigned BramGruneir and unassigned emsal0 Sep 11, 2018

BramGruneir modified the milestones: 2.1, 2.2 Sep 11, 2018

petermattis removed this from the 2.2 milestone Oct 5, 2018

nvanbenschoten mentioned this issue Mar 19, 2019

roachtest: tpcc/nodes=3/w=max failed with foreign key violation #35812

Closed

jordanlewis added the A-sql-fks label Apr 24, 2019

jordanlewis assigned RaduBerinde and unassigned BramGruneir Jul 31, 2019

RaduBerinde closed this as completed Dec 21, 2019

jseldess mentioned this issue Jan 5, 2020

Improve FK Performance and Compatibility cockroachdb/docs#5967

Closed
