Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

distsql: TxnCoordSender doesn't allow distribution if transaction changed data #13376

Closed
RaduBerinde opened this issue Feb 2, 2017 · 8 comments
Assignees
Milestone

Comments

@RaduBerinde
Copy link
Member

The TxnCoordSender assumes everything happens on a node, except for read-only transactions. DistSQL assumes that read-only queries can be distributed. The problem is in the difference between "transaction" and "query": if the transaction earlier modified data, a read-only query in that transaction apparently can't be distributed - i saw the error below while running TestLogicDistSQL on a 3 node cluster with the fake span resolver.

statement ok
BEGIN TRANSACTION

statement ok
CREATE TABLE kv (
  k CHAR PRIMARY KEY,
  v TIMESTAMP WITH TIME ZONE
)

statement ok
INSERT INTO kv (k,v) VALUES ('a', transaction_timestamp())

query T
SELECT k FROM kv      <----------
E170202 12:48:14.328709 17874 sql/distsqlrun/tablereader.go:159  [n3,fe9cf6117,TableReader=53] scan error: writing transaction timed out or ran on multiple coordinators
E170202 12:48:14.329505 17855 sql/distsqlrun/server.go:181  [n1] writing transaction timed out or ran on multiple coordinators

The select was simply running the scan on another node:
image

@tamird
Copy link
Contributor

tamird commented Feb 2, 2017 via email

@RaduBerinde
Copy link
Member Author

Thanks, I meant to reference that issue but I forgot.

@bdarnell
Copy link
Contributor

bdarnell commented Feb 2, 2017

The fact that read-only transactions can be distributed like this is an accident. If we decide it's a feature we want to support (and I don't see why we wouldn't), we could simply skip the "multiple coordinator" check for read-only batches. No need for anything as intrusive as #10511.

@RaduBerinde
Copy link
Member Author

It used to be like that; the check was made more strict to handle a subtle condition, details in 5fec899

RaduBerinde added a commit to RaduBerinde/cockroach that referenced this issue Feb 2, 2017
RaduBerinde added a commit to RaduBerinde/cockroach that referenced this issue Feb 7, 2017
@RaduBerinde
Copy link
Member Author

The scenario addressed by the "multiple coordinator" check (5fec899) is:

  • we have a transaction in which we wrote something
  • the txn gets abandoned (timed out)
  • a subsequent read-only query in that txn reads something that we wrote earlier

The read at the end should error out; without the check it would just run and it won't see anything that was written earlier in that txn (traces of that txn are gone).

One way to check for this wile still allowing distribution of the read-only query is to move up the check into sql. If the txn has an anchor key, the sql layer could call into the TxnCoordSender to verify that it still has the txn on file. It can check this after every statement it runs (when it still has a chance to return an error for the statement).

Note that this may become problematic if we ever stream results directly to the client (we would stream incorrect results before the error comes).

@andreimatei
Copy link
Contributor

FWIW this change that introduced the check in TxnCoordSender was merged last March, a few weeks before the AbortCache was introduced to prevent similar problems (in 9fea9f7b0b3183987d32fe93548560460a1b8cb5).

@bdarnell
Copy link
Contributor

bdarnell commented Feb 8, 2017

The abort cache (like the sequence cache that did the same thing and preceded 5fec899) cannot (reliably) help with timed-out transactions, because the abort cache entry becomes eligible for GC at the same time the transaction times out.

So to reliably avoid the anomaly fixed in #5323, we need to either lift the transaction liveness check up to the sql/distsql level like @RaduBerinde suggested, or push it down to each node that handles part of the transaction (basically, allow multiple coordinators as long as only one writes, but each coordinator would be responsible for heartbeating or polling the transaction status).

RaduBerinde added a commit to RaduBerinde/cockroach that referenced this issue Feb 8, 2017
@cuongdo cuongdo added this to the Q2 2017 milestone Feb 22, 2017
@dianasaur323 dianasaur323 modified the milestones: 1.1, Q2 2017 Apr 20, 2017
@andreimatei
Copy link
Contributor

Note to self: currently the "writing transaction timed out or ran on multiple coordinators" check is in the wrong place. It's done in TxnCoordSender.Send, on the request's way out to the DistSender. But it should (also?) be done on the way in. We want to also check that the transaction has not been abandoned while the respective read was in progress.

@andreimatei andreimatei modified the milestones: 1.2, 1.1 Aug 11, 2017
spencerkimball added a commit to spencerkimball/cockroach that referenced this issue Jan 15, 2018
…ashion

The key idea here is to stop using a single TxnCoordSender but instead
provide a factory for stateful instances of TxnCoordSender to be used
with each client.DB transaction. The state stored by the TxnCoordSender
can be fetched, collected, passed, and then used to augment a parent
or sibling TxnCoordSender.

This allows DistSQL to use the TxnCoordSender just like normal SQL and
paves the way for mutating distributed SQL transactions. With this change,
we get the added benefit of observed timestamps being correct returned
to the root transaction through distributed sql flows, which should
prevent unnecessary restarts from uncertainty interval errors.

Fixes cockroachdb#10511
Fixes cockroachdb#13376

Release note: None
spencerkimball added a commit to spencerkimball/cockroach that referenced this issue Jan 16, 2018
…ashion

The key idea here is to stop using a single TxnCoordSender but instead
provide a factory for stateful instances of TxnCoordSender to be used
with each client.DB transaction. The state stored by the TxnCoordSender
can be fetched, collected, passed, and then used to augment a parent
or sibling TxnCoordSender.

This allows DistSQL to use the TxnCoordSender just like normal SQL and
paves the way for mutating distributed SQL transactions. With this change,
we get the added benefit of observed timestamps being correct returned
to the root transaction through distributed sql flows, which should
prevent unnecessary restarts from uncertainty interval errors.

Fixes cockroachdb#10511
Fixes cockroachdb#13376

Release note (dist sql): Removes the limitation that distributed SQL
is not used once a transaction has had writes.
spencerkimball added a commit to spencerkimball/cockroach that referenced this issue Jan 16, 2018
…ashion

The key idea here is to stop using a single TxnCoordSender but instead
provide a factory for stateful instances of TxnCoordSender to be used
with each client.DB transaction. The state stored by the TxnCoordSender
can be fetched, collected, passed, and then used to augment a parent
or sibling TxnCoordSender.

This allows DistSQL to use the TxnCoordSender just like normal SQL and
paves the way for mutating distributed SQL transactions. With this change,
we get the added benefit of observed timestamps being correct returned
to the root transaction through distributed sql flows, which should
prevent unnecessary restarts from uncertainty interval errors.

Fixes cockroachdb#10511
Fixes cockroachdb#13376

Release note (dist sql): Removes the limitation that distributed SQL
is not used once a transaction has had writes.
spencerkimball added a commit to spencerkimball/cockroach that referenced this issue Jan 20, 2018
…ashion

The key idea here is to stop using a single TxnCoordSender but instead
provide a factory for stateful instances of TxnCoordSender to be used
with each client.DB transaction. The state stored by the TxnCoordSender
can be fetched, collected, passed, and then used to augment a parent
or sibling TxnCoordSender.

This allows DistSQL to use the TxnCoordSender just like normal SQL and
paves the way for mutating distributed SQL transactions. With this change,
we get the added benefit of observed timestamps being correct returned
to the root transaction through distributed sql flows, which should
prevent unnecessary restarts from uncertainty interval errors.

Fixes cockroachdb#10511
Fixes cockroachdb#13376

Release note (dist sql): Removes the limitation that distributed SQL
is not used once a transaction has had writes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants