From dd3d9eeda78425342d4481fd711ab36a7543ac72 Mon Sep 17 00:00:00 2001
From: Rebecca Taft
Date: Thu, 26 Oct 2017 15:53:35 -0400
Subject: [PATCH] rfc: SELECT FOR UPDATE

RFC regarding support of SELECT ... FOR UPDATE SQL syntax. See Issue #6583.
---
 docs/RFCS/20171024_select_for_update.md | 286 ++++++++++++++++++++++++
 1 file changed, 286 insertions(+)
 create mode 100644 docs/RFCS/20171024_select_for_update.md

diff --git a/docs/RFCS/20171024_select_for_update.md b/docs/RFCS/20171024_select_for_update.md
new file mode 100644
index 000000000000..5ef8c2031e93
--- /dev/null
+++ b/docs/RFCS/20171024_select_for_update.md
@@ -0,0 +1,286 @@
+- Feature Name: Support `SELECT FOR UPDATE`
+- Status: draft
+- Start Date: 2017-10-17
+- Authors: Rebecca Taft
+- RFC PR: (PR # after acceptance of initial draft)
+- Cockroach Issue: [#6583](https://github.com/cockroachdb/cockroach/issues/6583)
+
+# Summary
+
+Support the `SELECT ... FOR UPDATE` SQL syntax, which locks rows returned by the `SELECT`
+statement. This pessimistic locking feature prevents concurrent transactions from updating
+any of the locked rows until the locking transaction commits or aborts. Several potential customers
+have asked for this feature, and it would also get us closer to feature parity with Postgres.
+The easiest way to implement this in CockroachDB would be by setting row-level "dummy" intents,
+but another option would be to set span-level locks.
+
+# Motivation
+
+As described in [this issue](https://github.com/cockroachdb/cockroach/issues/6583),
+`SELECT ... FOR UPDATE` is not standard SQL, but many databases now support it, including
+Postgres. Several third-party products such as the [Quartz Scheduler](http://www.quartz-scheduler.org),
+[OpenJPA](http://openjpa.apache.org) and [Liquibase](http://www.liquibase.org)
+also rely on this feature, preventing some potential customers from switching to CockroachDB.
+
+In some cases, `SELECT ... FOR UPDATE` is required to maintain correctness when running CockroachDB
+in `SNAPSHOT` mode. In particular, `SELECT ... FOR UPDATE` can be used to prevent write skew anomalies.
+Write skew anomalies occur when two concurrent transactions read an overlapping set of
+rows but update disjoint sets of rows. Since the transactions each operate on private snapshots of
+the database, neither one will see the updates from the other.
+
+The [Wikipedia entry on Snapshot Isolation](https://en.wikipedia.org/wiki/Snapshot_isolation) has a
+useful concrete example:
+
+> ... imagine V1 and V2 are two balances held by a single person, Phil. The bank will allow either
+V1 or V2 to run a deficit, provided the total held in both is never negative (i.e. V1 + V2 ≥ 0).
+Both balances are currently $100. Phil initiates two transactions concurrently, T1 withdrawing $200
+from V1, and T2 withdrawing $200 from V2. .... T1 and T2 operate on private snapshots of the database:
+each deducts $200 from an account, and then verifies that the new total is zero, using the other account
+value that held when the snapshot was taken. Since neither update conflicts, both commit successfully,
+leaving V1 = V2 = -$100, and V1 + V2 = -$200.
+
+`SELECT ... FOR UPDATE` is not needed for correctness when running in `SERIALIZABLE` mode,
+but it may still be useful for controlling lock ordering and avoiding deadlocks. For example,
+consider the following schedule:
+
+```
+T1: Starts transaction
+T2: Starts transaction
+T1: Updates row A
+T2: Updates row B
+T1: Wants to update row B (blocks)
+T2: Wants to update row A (deadlock)
+```
+
+This sort of scenario can happen in any database that tries to maintain some level of correctness.
+It is especially common in databases that use pessimistic two-phase locking (2PL) since transactions
+must acquire shared locks for reads in addition to exclusive locks for writes. But deadlocks like the
+one shown above also happen in databases that use MVCC like PostgreSQL and CockroachDB, since writes must
+acquire locks on all rows that will be updated. Postgres and many other systems detect deadlocks by
+identifying cycles in a "waits-for" graph, where nodes represent transactions, and directed edges represent
+transactions waiting on each other to release locks. If a cycle (deadlock) is detected, transactions will be
+selectively aborted until the cycle(s) are removed. Some other systems use a timeout mechanism, where
+transactions will abort after waiting a certain amount of time to acquire a lock. In either case, the
+deadlock causes delays and aborted transactions.
+
+`SELECT ... FOR UPDATE` will help avoid deadlocks by allowing transactions to acquire all of their locks
+up front. For example, the above schedule would change to the following:
+
+```
+T1: Starts transaction
+T2: Starts transaction
+T1: Locks rows A and B
+T1: Updates row A
+T2: Wants to update row B (blocks)
+T1: Updates row B
+T1: Commits
+T2: Updates row B
+T2: Updates row A
+T2: Commits
+```
+
+Since `T1` locked rows A and B at the start of the transaction, the deadlock was prevented.
+`SELECT ... FOR UPDATE` won't eliminate deadlocks, but it will make them less likely.
+
+Many implementations of this feature also include options to control whether or not to wait on locks.
+`SELECT ... FOR UPDATE NOWAIT` is one option, which causes the query to return an error if it is
+unable to immediately lock all target rows. This is useful for latency-critical situations,
+and could also be useful for auto-retrying transactions in CockroachDB. `SELECT ... FOR UPDATE SKIP LOCKED`
+is another option, which returns only the rows that could be locked immediately, and skips over the others.
+This option returns an inconsistent view of the data, but may be useful for cases when multiple
+workers are trying to process data in the same table as if it were a queue of tasks.
+The default behavior of `SELECT ... FOR UPDATE` is for the transaction to block if some of the
+target rows are already locked by another transaction. Note that it is not possible to use the
+`NOWAIT` and `SKIP LOCKED` modifiers without `FOR { UPDATE | SHARE | ... }`.
+
+The first implementation of `FOR UPDATE` in CockroachDB will not include `NOWAIT` or `SKIP LOCKED` options.
+It seems that some users want these features, but many would be satisfied with `FOR UPDATE` alone.
+This proposal will implement the semantics that users expect for `FOR UPDATE`: `SELECT ... FOR UPDATE` will lock
+all of the rows specified by the `SELECT` with exclusive locks until the transaction is committed or aborted.
+There will be cases when additional rows are locked as well due to the mechanism for setting row-level
+intents in CockroachDB, but that will not change the semantics. It could affect performance, though, if
+other transactions are blocked due to the additional locks.
+
+# Guide-level explanation
+
+The [Postgres Documentation](https://www.postgresql.org/docs/current/static/sql-select.html#sql-for-update-share)
+describes this feature as it is supported by Postgres. As shown, the syntax of the locking clause has the form
+
+```
+FOR lock_strength [ OF table_name [, ...] ] [ NOWAIT | SKIP LOCKED ]
+```
+
+where `lock_strength` can be one of
+
+```
+UPDATE
+NO KEY UPDATE
+SHARE
+KEY SHARE
+```
+
+For our initial implementation in CockroachDB, we will likely simplify this syntax to
+
+```
+FOR UPDATE
+```
+
+i.e., no variation in locking strength, no specified tables, and no options for avoiding
+waiting on locks. Using `FOR UPDATE` will result in locking the rows touched by the `SELECT` query
+with exclusive locks. As described above, this feature alone is useful because it helps
+maintain correctness when running CockroachDB in `SNAPSHOT` mode (avoiding write skew), and serves
+as a tool for optimization (avoiding deadlocks) when running in `SERIALIZABLE` mode.
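
The write skew scenario from the Motivation section, and the way up-front locking avoids it, can be illustrated with a small single-threaded simulation. This is only a sketch under loud assumptions: the `store` and `txn` types and the `selectForUpdate` method are hypothetical names invented for this example, not CockroachDB code, and a transaction that would block on a lock is modelled as restarting with a fresh snapshot once the lock holder commits.

```go
package main

import "fmt"

// store is a toy key-value store (hypothetical, for illustration only).
// locks maps a key to the id of the transaction holding an exclusive lock.
type store struct {
	data  map[string]int
	locks map[string]int
}

// txn reads from a private snapshot taken at begin and buffers its writes.
type txn struct {
	id       int
	s        *store
	snapshot map[string]int
	writes   map[string]int
}

func newStore() *store {
	return &store{
		data:  map[string]int{"V1": 100, "V2": 100},
		locks: map[string]int{},
	}
}

func (s *store) begin(id int) *txn {
	snap := make(map[string]int, len(s.data))
	for k, v := range s.data {
		snap[k] = v
	}
	return &txn{id: id, s: s, snapshot: snap, writes: map[string]int{}}
}

// selectForUpdate locks every key or none. It reports false where a real
// system would block because another transaction holds one of the locks.
func (t *txn) selectForUpdate(keys ...string) bool {
	for _, k := range keys {
		if owner, held := t.s.locks[k]; held && owner != t.id {
			return false
		}
	}
	for _, k := range keys {
		t.s.locks[k] = t.id
	}
	return true
}

// withdraw buffers the write only if the snapshot total stays non-negative.
func (t *txn) withdraw(from, other string, amount int) bool {
	if t.snapshot[from]-amount+t.snapshot[other] < 0 {
		return false
	}
	t.writes[from] = t.snapshot[from] - amount
	return true
}

// commit applies buffered writes and releases the transaction's locks.
func (t *txn) commit() {
	for k, v := range t.writes {
		t.s.data[k] = v
	}
	for k, owner := range t.s.locks {
		if owner == t.id {
			delete(t.s.locks, k)
		}
	}
}

// totalWithoutLocks runs T1 and T2 on overlapping snapshots with no locking.
func totalWithoutLocks() int {
	s := newStore()
	t1, t2 := s.begin(1), s.begin(2)
	t1.withdraw("V1", "V2", 200)
	t2.withdraw("V2", "V1", 200)
	t1.commit()
	t2.commit()
	return s.data["V1"] + s.data["V2"]
}

// totalWithLocks has each transaction lock both rows before reading them.
func totalWithLocks() int {
	s := newStore()
	t1, t2 := s.begin(1), s.begin(2)
	t1.selectForUpdate("V1", "V2")
	t1.withdraw("V1", "V2", 200)
	if !t2.selectForUpdate("V1", "V2") {
		// T2 would block here; model that as a restart after T1 commits.
		t1.commit()
		t2 = s.begin(2)
		t2.selectForUpdate("V1", "V2")
	}
	t2.withdraw("V2", "V1", 200) // refused: the fresh snapshot shows V1 = -100
	t2.commit()
	return s.data["V1"] + s.data["V2"]
}

func main() {
	fmt.Println("without FOR UPDATE:", totalWithoutLocks())
	fmt.Println("with FOR UPDATE:   ", totalWithLocks())
}
```

Without locking, both withdrawals pass their snapshot check and the combined balance ends at -200. When both rows are locked before they are read, the second withdrawal is evaluated against the committed state and is refused, so the total stays at 0.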
+
+For example, the following transaction:
+
+```
+BEGIN;
+
+SELECT * FROM employees WHERE name = 'John Smith' FOR UPDATE;
+
+...
+
+UPDATE employees SET salary = 50000 WHERE name = 'John Smith';
+
+COMMIT;
+```
+
+will lock the rows of all employees named John Smith at the beginning of the transaction.
+In the context of CockroachDB, "lock" corresponds to setting a "write intent" on a row,
+preventing other concurrent transactions from simultaneously updating that row.
+As a result, the `UPDATE employees ...` statement at the end of the transaction will not need
+to acquire any additional locks (i.e., will not need to set additional row-level write intents).
+Note that `FOR UPDATE` will have no effect if it is used in a stand-alone query that is not
+part of any transaction.
+
+In contrast to the above example, if the first `SELECT` statement in the transaction is
+`SELECT * FROM employees WHERE name LIKE '%Smith' FOR UPDATE;`, CockroachDB will lock
+all of the rows in the `employees` table because it's not possible to determine from the predicate
+which key spans are affected. This lack of precision will be an issue for any predicate that
+does not directly translate to particular key spans. This is a departure from Postgres, since Postgres
+generally locks exactly the rows returned by the query.
+
+# Reference-level explanation
+
+This section provides more detail about how and why the CockroachDB implementation of
+the locking clause will differ from Postgres.
+
+With the current model of CockroachDB, it is not possible to support the locking strengths
+`NO KEY UPDATE` or `KEY SHARE` because
+these options require locking at a sub-row granularity. It is also not clear that CockroachDB can support
+`SHARE`, because there is currently no such thing as a "read intent". `UPDATE` can be supported by
+marking the affected rows with dummy write intents.
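
The difference between a predicate that maps to a key span and one that does not can be sketched with a toy in-memory table. Everything here is an assumption made for illustration: the `table` and `row` types and the `scanForUpdate`, `lockByName`, and `lockBySuffix` helpers are hypothetical, the table is assumed to be keyed by employee name, and a conflicting intent is reported as an error where a real system would wait.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// row is a toy stand-in for a KV pair; intent holds the id of the
// transaction that has a write intent on it (0 means none).
type row struct {
	value  int
	intent int
}

// table maps a key (here, the employee name) to its row.
type table map[string]*row

func newTable() table {
	return table{
		"Alice Jones":  {value: 40000},
		"Bob Smith":    {value: 45000},
		"John Smith":   {value: 48000},
		"Zoe Smithers": {value: 52000},
	}
}

// scanForUpdate returns the keys in [start, end) in sorted order and leaves a
// dummy write intent on each one. It fails if another transaction already has
// an intent in the span.
func (t table) scanForUpdate(txnID int, start, end string) ([]string, error) {
	var keys []string
	for k, r := range t {
		if k < start || k >= end {
			continue
		}
		if r.intent != 0 && r.intent != txnID {
			return nil, fmt.Errorf("key %q has an intent from txn %d", k, r.intent)
		}
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		t[k].intent = txnID
	}
	return keys, nil
}

// put writes a value unless another transaction holds an intent on the key.
func (t table) put(txnID int, key string, value int) error {
	r := t[key]
	if r == nil {
		return fmt.Errorf("no such key %q", key)
	}
	if r.intent != 0 && r.intent != txnID {
		return fmt.Errorf("key %q has an intent from txn %d", key, r.intent)
	}
	r.value, r.intent = value, txnID
	return nil
}

// intentCount reports how many rows currently carry an intent.
func (t table) intentCount() int {
	n := 0
	for _, r := range t {
		if r.intent != 0 {
			n++
		}
	}
	return n
}

// lockByName models WHERE name = '<name>': the predicate maps to a one-key span.
func lockByName(t table, txnID int, name string) []string {
	keys, _ := t.scanForUpdate(txnID, name, name+"\x00")
	return keys
}

// lockBySuffix models WHERE name LIKE '%<suffix>': no span can be derived, so
// the whole table is scanned (and locked) before the predicate is applied.
func lockBySuffix(t table, txnID int, suffix string) []string {
	keys, _ := t.scanForUpdate(txnID, "", "\xff")
	var matches []string
	for _, k := range keys {
		if strings.HasSuffix(k, suffix) {
			matches = append(matches, k)
		}
	}
	return matches
}

func main() {
	t := newTable()
	rows := lockByName(t, 1, "John Smith")
	fmt.Println(len(rows), "returned,", t.intentCount(), "locked")

	t = newTable()
	rows = lockBySuffix(t, 1, "Smith")
	fmt.Println(len(rows), "returned,", t.intentCount(), "locked")

	// A second transaction cannot update a row that was locked but not returned.
	fmt.Println(t.put(2, "Alice Jones", 41000))
}
```

In this sketch the equality predicate leaves an intent on one row, while the suffix predicate returns two rows but leaves intents on all four, so an unrelated update to "Alice Jones" from another transaction is rejected. That is the performance cost, rather than a correctness problem, that imprecise locking introduces.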
+
+By default, if `FOR UPDATE` is used in Postgres without specifying tables
+(without the `OF table_name [, ...]` clause),
+Postgres will lock all rows returned by the `SELECT` query. The `OF table_name [, ...]` clause
+enables locking only the rows in the specified tables. To lock different tables with different
+strengths or different options, Postgres users can string multiple locking clauses together.
+For example,
+
+```
+SELECT * FROM employees e, departments d, companies c
+WHERE e.did = d.id AND d.cid = c.id
+AND c.name = 'Cockroach Labs'
+FOR UPDATE OF e SKIP LOCKED
+FOR SHARE OF d NOWAIT
+```
+locks rows in the `employees` table that satisfy the join condition with an exclusive lock,
+and skips over rows that are already locked by another transaction.
+It also locks rows in the `departments` table that satisfy the join condition with a shared lock,
+and returns an error if it cannot lock all of the rows immediately. It does not lock
+the `companies` table.
+
+Implementing this flexibility in CockroachDB for use of different tables and different options
+may be excessively complicated, and it's not clear that our customers actually need
+it. To avoid spending too much time on this, as mentioned above, we will probably just implement the most
+basic functionality in which clients use `FOR UPDATE` to lock the rows touched by the query.
+Initially we won't include the `SKIP LOCKED` or `NOWAIT` options, but it may be worth implementing
+these at some point.
+
+At the moment it is not possible to use `FOR UPDATE` in views (there will not be an error, but it will
+be ignored). This is similar to the way `ORDER BY` and `LIMIT` are handled in views. See the comment from
+@a-robinson in [data_source.go:getViewPlan()](https://github.com/cockroachdb/cockroach/blob/5a6b4312a972b74b0af5de53dfdfb204dc0fd6d7/pkg/sql/data_source.go#L680). If `ORDER BY` and `LIMIT` are supported later, `FOR UPDATE` would come for free.
+Postgres supports all of these options in views, since it supports any `SELECT` query, and re-runs
+the query each time the view is used.
+
+Another potential difference between the Postgres and CockroachDB implementations
+relates to the precision of locking. In general, Postgres locks exactly the rows returned
+by the query, and no more. There are a few examples given in the
+[documentation](https://www.postgresql.org/docs/current/static/sql-select.html#sql-for-update-share)
+where that's not the case. For example, `SELECT ... LIMIT 5 OFFSET 5 FOR UPDATE` may
+lock up to 10 rows even though only 5 rows are returned. It may be more difficult to
+be precise with locking in CockroachDB, since locking happens at the KV layer, and
+predicates may be applied later. This should not affect correctness in terms of consistency or
+isolation, but could affect performance if there is high contention. As described above,
+when running in `SNAPSHOT` mode, `SELECT ... FOR UPDATE` will prevent write skew anomalies,
+and in `SERIALIZABLE` mode, `SELECT ... FOR UPDATE` can help prevent deadlocks. These semantics
+are unchanged if extra rows are locked.
+
+## Detailed design
+
+There are a number of changes that will need to be implemented in order to
+support `FOR UPDATE`.
+
+- Update the parser to support the syntax in `SELECT` statements.
+- Update the KV API to include new messages ScanForUpdate and ReverseScanForUpdate.
+- Update the KV and storage layers to mimic processing of Scan and ReverseScan
+  and set dummy write intents on every row touched.
+
+As described by Ben in [issue #6583](https://github.com/cockroachdb/cockroach/issues/6583),
+`git grep -i reversescan` provides an idea of the scope of the change. The bulk of the changes
+would consist of implementing new ScanForUpdate and ReverseScanForUpdate calls in the KV API.
+These would work similarly to regular scans, but A) would be flagged as read/write commands
+instead of read-only and B) after performing the scan, they'd use MVCCPut to write back the
+values that were just read (that's not the most efficient way to do things, but Ben thinks it's
+the right way to start since it will have the right semantics without complicating the
+backwards-compatibility story). Then the SQL layer would use these instead of plain
+Scan/ReverseScan when FOR UPDATE has been requested.
+
+There was some discussion about whether we really needed new API calls, but Ben pointed out
+that making it possible to write on `Scan` requests would make debugging a nightmare.
+
+## Drawbacks
+
+It seems that there is sufficient demand for some form of the `FOR UPDATE` syntax. However,
+there are pros and cons to implementing or not implementing certain features.
+
+- It will be a lot of work to implement all of the features supported by Postgres.
+  This is probably not worth our time since it's not clear these features will actually get used.
+  We can easily add them later if needed.
+- If we use the approach of setting intents on every row touched by the `SELECT`, it
+  could hurt performance if we are selecting a large range. But setting intents on every
+  row will be easier to implement than the alternatives discussed below.
+
+## Rationale and Alternatives
+
+The proposed solution is to lock rows by writing dummy write intents on each row.
+However, another alternative is to lock an entire Range if the `SELECT` statement
+would return the majority of rows in the range. This is similar to the approach
+suggested by the [Revert Command RFC](https://github.com/cockroachdb/cockroach/pull/16294).
+
+The advantage of locking entire ranges is that it would significantly improve the performance
+for large scans compared to setting intents on individual rows. The downside is that
+this feature is not yet implemented, so it would be significantly more effort than using
+simple row-level intents. It's also not clear that customers would use `FOR UPDATE` with
+large ranges, so this may be an unneeded performance optimization. Furthermore,
+locking the entire range based on the predicate could result in locking rows that should
+not be observable by the `SELECT`. For instance, if the transaction performing the
+`SELECT FOR UPDATE` query over some range is at a lower timestamp than a later `INSERT`
+within that range, the `FOR UPDATE` lock should not apply to the newly written row.
+This issue is probably not any worse than the other problems with locking precision described above,
+though.
+
+One advantage of implementing range-level locking is that we could reuse this feature
+for other applications such as point-in-time recovery. The details of the proposed implementation
+as well as other possible applications are described in the [Revert Command RFC](https://github.com/cockroachdb/cockroach/pull/16294).
+However, in the interest of getting something working sooner rather than later,
+I believe row-level intents make more sense at this time.
+
+## Unresolved questions
+
+- Should we move forward with implementing only `FOR UPDATE` at first,
+  or are some of the other features essential as well?
+- Should we stick with row-level intents or implement range-level locking?
+- Should we be concerned about locking precision (i.e., lock exactly the
+  rows returned by query and no more)?