Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: introduce connExecutor, the query execution orchestrator #22277

Conversation

andreimatei
Copy link
Contributor

Release note: None

The connExecutor is the top dog that interfaces with a pgwire connection
(through the clientComm and CommandResult interfaces), consumes a stream
of queries and produces a stream of results. It encapsulates the
connection state machine for which it produces events. It interfaces
with the two SQL execution engines for actually running queries. Its
main responsibility is to dispatch statements based on the current state
(in txn, not in txn, in aborted txn, etc) and to handle execution in all
states but Open (in Open it talks to an execution engine). It also
handles other commands than executiong queries: preparing, binding, etc
and it maintains the session state associated with prepares statements.

@andreimatei andreimatei requested review from a team January 31, 2018 22:08
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@andreimatei
Copy link
Contributor Author

First commit is #21900

This PR will eventually contain some more things:

  • right now, "commands" other than executing queries are not implemented (i.e. everything around preparing statements)
  • the pgwire code for handling result commands is missing

But still, I think that getting some review would be very useful. One can start from connExecutor.run() and drill down, or come sit with me at leisure. Thanks!

@andreimatei andreimatei force-pushed the executor-state-machine-conn-executor branch 2 times, most recently from 4634524 to b4ec5f2 Compare February 1, 2018 22:44
@knz
Copy link
Contributor

knz commented Feb 2, 2018

Ok so I have reviewed this. This is mostly good, and I have lots of respect for the herculean effort that went into this.

Yet I also think the overall structure of the patch does not live up to our usual engineering standards.

Before we go further with the review I recommend you merge #21900 and rebase on top of that. I am comfortable with the first commit.

Then for the remainder I notice at least 3 unrelated changes:

  • the new hidden field in the query struct
  • changes to the txn FSM
  • introducing the connection executor

I recommend you split the change into 3 commits, in that order. In particular I'd like to see the FSM changes isolated from the rest, so I can check what's going on in there. Possibly the first two can be handled in another PR (prefix to the last, largish change)

Finally, there are too many "WIPs" and "panics" remaining in the code. If you think the change is too overwhelming to make a single commit/PR for, we can talk it through.

Thanks, you're just the best.


Reviewed 18 of 18 files at r1, 22 of 22 files at r2.
Review status: all files reviewed at latest revision, all discussions resolved, some commit checks failed.


pkg/sql/conn_executor.go, line 187 at r2 (raw file):

// (resultWithStoredErr), which provides access to the query error for purposes
// of building the correct state machine event.
//

Super comment. 👍 🥇


pkg/sql/conn_executor.go, line 224 at r2 (raw file):

	// pool is the parent monitor for all session monitors
	// WIP(andrei): this needs to be pgwire.Server.sqlMemoryPool.

We're not merging this into master until you connect this, right?


pkg/sql/conn_executor.go, line 246 at r2 (raw file):

// is used.
func NewServer(cfg *ExecutorConfig) *Server {
	// WIP(andrei): take the stopper and listen to Gossip like the Executor does.

So if we merge this now, we can't shut down a node that has active SQL connections any more?


pkg/sql/conn_executor.go, line 311 at r2 (raw file):

	memMetrics *MemoryMetrics,
) *connExecutor {

stray empty line


pkg/sql/conn_executor.go, line 398 at r2 (raw file):

}

// WIP(andrei)

wup?


pkg/sql/conn_executor.go, line 1519 at r2 (raw file):

//
// TODO(nvanbenschoten): We do not currently support parallelizing distributed SQL
// queries, so this method can only be used with classical SQL.

s/classical/local


pkg/sql/conn_executor.go, line 1712 at r2 (raw file):

	default:
		// WIP(andrei): The Executor is ignoring other stmt types. Was calling
		// startPlan() sufficient for them?

I believe so.


pkg/sql/conn_executor.go, line 1766 at r2 (raw file):

	stmt Statement, res RestrictedCommandResult,
) (fsm.Event, fsm.EventPayload) {
	panic("!!! unimplemented")

What's up with this?


pkg/sql/results_writer.go, line 319 at r2 (raw file):

// CloseResultWithError implements the StatementResult interface.
func (b *bufferedWriter) CloseResultWithError(err error) error {
	panic("!!! unimplemented")

??


pkg/sql/session.go, line 134 at r2 (raw file):

	// If set, this query will not be reported as part of SHOW QUERIES.
	hidden bool

What's up with that? It needs to be mentioned+explained in the commit message.


pkg/sql/pgwire/v3.go, line 1114 at r2 (raw file):

// CloseResultWithError implements the StatementResult interface.
func (c *v3Conn) CloseResultWithError(err error) error {
	panic("!!! not implemented")

What's this?


Comments from Reviewable

@andreimatei andreimatei force-pushed the executor-state-machine-conn-executor branch 6 times, most recently from 7bf09ee to dc4947d Compare February 4, 2018 03:07
@andreimatei andreimatei force-pushed the executor-state-machine-conn-executor branch 2 times, most recently from 2666ded to 35e91c0 Compare February 4, 2018 20:17
@andreimatei
Copy link
Contributor Author

I've extracted the conn_fsm changes in another PR (all but the last commit are elsewhere). I've explained what's going on with that hidden field in a comment on the source.

I've added code for dealing with prepared statements and portals and all that jazz. If you could give that a look, that'd be fantastic (diffing in reviewable since what you've seen last is probably the way to go). Thanks Rafa!


Review status: 1 of 34 files reviewed at latest revision, 10 unresolved discussions.


pkg/sql/conn_executor.go, line 224 at r2 (raw file):

Previously, knz (kena) wrote…

We're not merging this into master until you connect this, right?

correct


pkg/sql/conn_executor.go, line 246 at r2 (raw file):

Previously, knz (kena) wrote…

So if we merge this now, we can't shut down a node that has active SQL connections any more?

if we merge this now, nobody's creating one of these Servers. In any case, this one had actually been done.


pkg/sql/conn_executor.go, line 311 at r2 (raw file):

Previously, knz (kena) wrote…

stray empty line

Done.


pkg/sql/conn_executor.go, line 398 at r2 (raw file):

Previously, knz (kena) wrote…

wup?

nobody's using the Server yet


pkg/sql/conn_executor.go, line 1519 at r2 (raw file):

Previously, knz (kena) wrote…

s/classical/local

Done.


pkg/sql/conn_executor.go, line 1712 at r2 (raw file):

Previously, knz (kena) wrote…

I believe so.

done, thanks


pkg/sql/conn_executor.go, line 1766 at r2 (raw file):

Previously, knz (kena) wrote…

What's up with this?

will come


pkg/sql/results_writer.go, line 319 at r2 (raw file):

Previously, knz (kena) wrote…

??

no particular need to call out things with "!!!" and "wip(andrei)". I'm aware of them :)
This PR is not ready to merge.


pkg/sql/session.go, line 134 at r2 (raw file):

Previously, knz (kena) wrote…

What's up with that? It needs to be mentioned+explained in the commit message.

I don't think there's much to be mentioned particularly in the commit message, but I'm happy to explain here. This is a case of something working differently in the new world than the old world.
This field itself is a minor thing; I can remove it if you like. Here's the broader story:
In the current world, a Session maintains a list of "active queries"; queries implementing tree.HiddenFromShowQueries are not present in that collection. Maintaining the collection was done in really contorted ways: there was a path for removing queries from it that ran through the parallel queue different from the path for regular queries. The queries that are not run through the parallel queue were duplicate in two collections for extra head scratching. To top it off, the moment when a query was no longer reported didn't take into account whether its results had been delivered to the client or not. It was just all very ugly (IMHO).

In the new world, the connExecutor also maintains a collection of active queries. All queries are part of it, but the hidden ones have this field set based on tree.HiddenFromShowQueries. The method that adds a query to that collection (and creates one of these queryMetas) also returns a cleanup function that removes the query from the collection. This cleanup function will be called once the results are delivered. For parallel queries (which have no results), the cleanup function is called once execution finishes.

This hidden field will be used by yet-to-be-written code that inspects the collection, in lieu of looking at whether the statement implements HiddenFromShowQueries. I've added a comment to it stating that it's set based on that interface.


pkg/sql/pgwire/v3.go, line 1114 at r2 (raw file):

Previously, knz (kena) wrote…

What's this?

removed


Comments from Reviewable

@andreimatei andreimatei force-pushed the executor-state-machine-conn-executor branch 3 times, most recently from 499b681 to 31099de Compare February 12, 2018 15:21
@andreimatei andreimatei requested a review from a team as a code owner February 12, 2018 15:21
@andreimatei andreimatei force-pushed the executor-state-machine-conn-executor branch 2 times, most recently from fa68ecf to 65cf485 Compare February 13, 2018 02:17
@andreimatei andreimatei force-pushed the executor-state-machine-conn-executor branch 2 times, most recently from 42d9f08 to c74c96a Compare February 16, 2018 22:02
@andreimatei
Copy link
Contributor Author

All tests pass! I'll address everything in the review over the weekend.


Review status: 21 of 56 files reviewed at latest revision, 75 unresolved discussions.


Comments from Reviewable

@andreimatei andreimatei force-pushed the executor-state-machine-conn-executor branch 3 times, most recently from 8f94b96 to 7c9b443 Compare February 17, 2018 22:43
@andreimatei
Copy link
Contributor Author

Review status: 11 of 50 files reviewed at latest revision, 75 unresolved discussions, some commit checks pending.


pkg/server/server.go, line 535 at r3 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Per our discussion in person, let's make this an environment variable.

Done.


pkg/sql/conn_executor.go, line 173 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

s/don't/doesn't/

Done.


pkg/sql/conn_executor.go, line 221 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

s/connExecutor's/connExecutors/

Done.


pkg/sql/conn_executor.go, line 307 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Add a TODO to move these all back into app_stats.go

I don't think they ever belonged there...


pkg/sql/conn_executor.go, line 487 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Can we give this type a better name?

got rid of the type instead


pkg/sql/conn_executor.go, line 495 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Why not put this in extraTxnState

done


pkg/sql/conn_executor.go, line 540 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Why not put this in extraTxnState?

I don't think this belongs there; it has nothing to do with a transaction.


pkg/sql/conn_executor.go, line 553 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Group this with sessionData

I don't think this has anything to do with sessionData. I've grouped it with the planner.


pkg/sql/conn_executor.go, line 555 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Why is this split from sessionData?

removed ApplicationName. What remains is not related to sessionData.


pkg/sql/conn_executor.go, line 586 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

It manages leasing tables as well. Why kill it?

I've removed the TODO.
The point is that the management of leases and of cached descriptors in general should not have anything to do with sessions and transactions.


pkg/sql/conn_executor.go, line 619 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

You've described this sufficiently elsewhere, but add a short comment here as well.

Done.


pkg/sql/conn_executor.go, line 653 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

ex.Ctx().Err()?

I think ctx.Err is better. We care whether run()'s ctx has been cancelled specifically, not some other (child) context.


pkg/sql/conn_executor.go, line 670 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: re-ordering these cases would make this easier to understand. For instance, it's strange that Exec is before Prepare and Bind. Look at how v3Conn.serve orders its cases.

I think Exec and ExecPortal should be together. And I like them as the first ones, cause they're the important ones.


pkg/sql/conn_executor.go, line 735 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: set res either before or after ev in all cases. Same with payload. Having an order here allows the reader to create mental patterns about how each case works.

Done.


pkg/sql/conn_executor.go, line 755 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Talk about when it wouldn't be.

Done.


pkg/sql/conn_executor.go, line 809 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Case is impossible.

Done.


pkg/sql/conn_executor.go, line 814 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Again, not possible.

Done.


pkg/sql/conn_executor.go, line 853 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

move this with all the other execXYZ methods and keep them in the same order as the switch statement above.

moved


pkg/sql/conn_executor.go, line 976 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Is this tested?

removed


pkg/sql/conn_executor.go, line 1130 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

This logic is pretty hard to parse. Why stray from what we do in v3conn.handleBind? Also, where'd the comment go?

well so some amount of straying is necessary: it used to be that v3Conn had access to the portal, and so it could see how many arguments it has. In the new world, the conn doesn't have access to the portal, so it enqueues all the information that it has for further processing in connEx.
I've added a comment to bindCmd.ArgFormatCodes and one here. Does this address?


pkg/sql/conn_executor.go, line 1170 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Same, why change the structure?

same, but let's talk if you're not happy
Switched to the conn going for 0 to 1 OutFormats, to mirror what we do for ArgFormatCodes.


pkg/sql/conn_executor.go, line 1208 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

No need for second period.

Done.


pkg/sql/conn_executor.go, line 1368 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Why return here?

Done.


pkg/sql/conn_executor.go, line 1371 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Why do some of these execStmtIn... methods take contexts and some don't?

now they all take the ctx


pkg/sql/conn_executor.go, line 1384 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

s/runShowTransactionState/execShowTransactionState/

exec is used for higher-level things... What's wrong with run?


pkg/sql/conn_executor.go, line 1533 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

This logic can be simplified now that we can just return err. See #22721.

Simplified how? Isn't the logic now as clean as it gets?


pkg/sql/conn_executor.go, line 1535 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Are these filters? knobs? Let's be consistent.

I don't know what to do... I'll leave it :)


pkg/sql/conn_executor.go, line 1701 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
case *tree.CommitTransaction, *tree.ReleaseSavepoint, ...:
    return ex.makeErrEvent(errNoTransactionInProgress, stmt.AST)

Done.


pkg/sql/conn_executor.go, line 1747 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

You name this different things in different places. os? so?

Done.


pkg/sql/conn_executor.go, line 1822 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

We don't need this for the regular Savepoint case?

no, it's only needed when we want to change the tag of the result. If we created the result by passing tree.Savepoint, it's not needed.


pkg/sql/conn_executor.go, line 1825 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Make a note about why this is different than execPrepare.

Done.


pkg/sql/conn_executor.go, line 1868 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

stmtTS

Done.


pkg/sql/conn_executor.go, line 1909 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

How does this interact with parallel statements?

par stmts use a dedicated planner. It doesn't share this.


pkg/sql/conn_executor.go, line 1940 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Make a note of why we don't return anything. It might even be worth creating

var noEvent fsm.Event
var noPayload fsm.EventPayload

and using these throughout the functions instead of returning a naked nil.

sprinkled some comments here, above, and at the method definition.


pkg/sql/conn_executor.go, line 2043 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

if ex.implicitTxn {

Done.


pkg/sql/conn_executor.go, line 2052 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

New line above.

Done.


pkg/sql/conn_executor.go, line 2093 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Why is this in the middle of all these related methods?

moved away


pkg/sql/conn_executor.go, line 2235 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Shouldn't this be called execStmtSerially or something to parallel execStmtInParallel?

I wouldn't say so. First of all, when talking about a single statement, the word "serially" wouldn't make sense.
distpatchToExecutionEngine is not on the same plane as execStmtInParallel. Ideally execInParallel would call this, modulo problems with DistSQL.


pkg/sql/conn_executor.go, line 2311 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

execStmtWithLocalEngine

meh


pkg/sql/conn_executor.go, line 2367 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

execStmtWithDistSQLEngine

meh


pkg/sql/conn_executor.go, line 2477 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

rwMode := tree.ReadWrite
if ex.state.readOnly {
rwMode = tree.ReadOnly
}

We do this in a few places. Worth making it a method?

meh


pkg/sql/conn_executor.go, line 2508 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

These assertions are noise which is starting to add up across this file, and I don't think they get us anything. For instance, you only call this method in the stateCommitWait case of execStmt, so what's the point of the assertion. My vote is to remove them all.

removed


pkg/sql/conn_executor.go, line 2560 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

no need for these breaks.

Done.m


pkg/sql/conn_executor.go, line 2588 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

case txnRestart, txnAborted:

is preferable

Done.


pkg/sql/conn_executor.go, line 2633 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

What do you think about separate files for different connExecutor functionality (similar to what we do for Replica and what we do in the batcheval package). We could split it into:

conn_executor.go
conn_executor_prepare.go
conn_executor_bind.go
conn_executor_exec.go
...

There's enough logic in each of these components that I think the split is warranted. It would also help give some structure to this 3000 line file.

I've created an conn_executor_prepare.go and moved execPrepare(), execBind() and other methods having to do with prepared statements and portals there.
I don't think more files beyond this would be an improvement.


pkg/sql/conn_executor.go, line 2640 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Why is this exported? Why is it split from execPrepare?

not exported any more
You mean why is it split from addPreparedStmt()? I think it makes sense to be split - ththat one modifies state on the ex, this one doesn't. It's pretty large too.


pkg/sql/conn_io.go, line 321 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

ctx isn't used.

Done.


pkg/sql/conn_io.go, line 347 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Is this used?

removed


pkg/sql/conn_io.go, line 389 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nil out buf.mu.data[0] so that we're not holding on to memory.

Done.


pkg/sql/conn_io.go, line 520 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

You're going to remove all of these !!! parts, right?

Done.


pkg/sql/conn_io.go, line 570 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

It also extends ResultBase

Done. Shuffled interfaces around, PTAL.


pkg/sql/executor.go, line 2250 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Comment.

Done.


pkg/sql/results_writer.go, line 259 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Is this used?

removed. This was from when I wanted to make this bufferedWriter implement the ncessary interfaces for collecting results, but I abandoned that. Something will be done when I figure out how to use the new code for the InternalExecutor.


pkg/sql/run_control_test.go, line 91 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

This test less than great

So is this grammar 😉

Did you mean "so this grammar"?


pkg/sql/txn_state_test.go, line 775 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

The name is not in line with the other ones now.

added spaces if that was what you wanted. Names here follow a couple of patterns.


pkg/sql/pgwire/command_result.go, line 53 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit across the entire change that has begun to really sticks out to me over the course of this review: structs where fields are logically ordered and grouped make them significantly easier to understand and compartmentalize.

Take Store as an example. The struct has over 50 fields, but they're grouped in a way where any individual change only needs to consider one group. For instance, all of the queues are grouped together. Below there, all of the replica state is grouped together.

Comparing that to some of the structs here, I find that these end up looking like grab bags of random chunks of unrelated data. It can be really hard to pick out what's important and what the primary responsibilities of the types are when there is minimal field grouping, no discernible ordering across fields, and what can seem like randomly varying levels of detail put into each fields' comment. The same could be said about the grouping and ordering of methods within these files, like I think I pointed out in conn_executor.go.

This may just be me, but I find this makes remembering how each of these pieces fit together substantially harder, especially for anyone who isn't the author. This is only going to get less cohesive from here, so it's really important that we make the object hierarchy and general structure of this new chunk of code understandable to start with.

Hopefully I've improved things in the connExectur.
Rearranged fields here too. This struct is unfortunately a bit messy because it represents multiple different types of results. I don't think it's worth improving though; most fields here are isolated from each other and the whole thing is not too complicated. Speak if you feel otherwise.


pkg/sql/pgwire/command_result.go, line 137 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Why not just put the switch in a if r.err == nil condition? There's no reason to start using gotos in our code. We've survived for 3 years without them :)

see nowenv


pkg/sql/pgwire/command_result.go, line 188 at r6 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Pull this callback interation into some other method and call wherever it's needed.

meh, I don't think it's worth it


pkg/sql/pgwire/conn.go, line 384 at r6 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Are you going to address this in this change? If not, let's open an issue for it.

Referenced #22630


pkg/sql/pgwire/conn.go, line 660 at r6 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Is this logic duplicated in conn_executor.go?

Not really - this code only decodes what the client sent. The connExecutor than meshes it with the columns that actually exist in the results of the prepared stmt.


pkg/sql/pgwire/conn.go, line 743 at r6 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

util.FastIntMap?

FastIntMap seems to be about keys being small - which here they generally aren't. CmdPos keeps increasing for the life of a connection.
Sent out #22792 to clarify comment on FastIntMap.


Comments from Reviewable

@andreimatei andreimatei force-pushed the executor-state-machine-conn-executor branch from 7c9b443 to e0686bc Compare February 18, 2018 17:15
@andreimatei
Copy link
Contributor Author

Review status: 11 of 50 files reviewed at latest revision, 75 unresolved discussions, all commit checks successful.


pkg/sql/pgwire/conn.go, line 384 at r6 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Referenced #22630

Did something here.


Comments from Reviewable

@nvanbenschoten
Copy link
Member

I left a few more comments that need to be addressed before this can be merged, but besides them the rest :lgtm:. Let's get this in and cherry-picked!


Reviewed 17 of 31 files at r7, 25 of 33 files at r8.
Review status: all files reviewed at latest revision, 19 unresolved discussions, all commit checks successful.


pkg/sql/conn_executor.go, line 670 at r4 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

I think Exec and ExecPortal should be together. And I like them as the first ones, cause they're the important ones.

Whatever order you think is best is fine, but the order we consume Commands here and the order in which the switch statement in conn.serveImpl generates them should be the same.


pkg/sql/conn_executor.go, line 1533 at r4 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Simplified how? Isn't the logic now as clean as it gets?

Get rid of skipCommit and err. Since we don't care about the VEventf for injected errors we can just return whenever err != nil. See https://github.com/cockroachdb/cockroach/pull/22721/files#diff-d9528d5f2c652eaedc7f587fda1123bcR1122.


pkg/sql/conn_executor.go, line 1701 at r4 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Done.

So that you can get rid of the goto!


pkg/sql/conn_executor.go, line 1822 at r4 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

no, it's only needed when we want to change the tag of the result. If we created the result by passing tree.Savepoint, it's not needed.

Please put this in a comment here.


pkg/sql/conn_executor.go, line 432 at r8 (raw file):

		dbCacheSubscriber: s.dbCache,
	}
	ex.extraTxnState.schemaChangers = schemaChangerCollection{}

Is this needed? Isn't it the zero value?


pkg/sql/conn_executor.go, line 454 at r8 (raw file):

	defer func() {
		if r := recover(); r != nil {
r := recover()
ex.closeWrapper(ctx, r)

pkg/sql/conn_executor.go, line 1231 at r8 (raw file):

// 	 then the statement cannot have any placeholder.
// pos: The position of stmt.
func (ex *connExecutor) execStmt(

I still think we should move all methods related to execStmt (all of the runXYZ methods) to a conn_executor_exec.go file, but that doesn't need to block this PR.


pkg/sql/conn_io.go, line 259 at r8 (raw file):

// Flush is a Command asking for the results of all previous commands to be
// delivered to the client.
type Flush struct{}

Also, the order in which each of these Commands/CommandResults is defined should also match the order in which the switch statements generate/consume them.


pkg/sql/pgwire/command_result.go, line 53 at r4 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Hopefully I've improved things in the connExectur.
Rearranged fields here too. This struct is unfortunately a bit messy because it represents multiple different types of results. I don't think it's worth improving though; most fields here are isolated from each other and the whole thing is not too complicated. Speak if you feel otherwise.

This is much better. Thanks!


Comments from Reviewable

Release note: None

The connExecutor is the top dog that interfaces with a pgwire connection
(through the clientComm and CommandResult interfaces), consumes a stream
of queries and produces a stream of results. It encapsulates the
connection state machine for which it produces events. It interfaces
with the two SQL execution engines for actually running queries. Its
main responsibility is to dispatch statements based on the current state
(in txn, not in txn, in aborted txn, etc) and to handle execution in all
states but Open (in Open it talks to an execution engine). It also
handles other commands than executiong queries: preparing, binding, etc
and it maintains the session state associated with prepares statements.
@andreimatei andreimatei force-pushed the executor-state-machine-conn-executor branch from e0686bc to 7e38855 Compare February 19, 2018 15:38
@andreimatei
Copy link
Contributor Author

Review status: 48 of 52 files reviewed at latest revision, 11 unresolved discussions.


pkg/sql/conn_executor.go, line 670 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Whatever order you think is best is fine, but the order we consume Commands here and the order in which the switch statement in conn.serveImpl generates them should be the same.

Done.


pkg/sql/conn_executor.go, line 1533 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Get rid of skipCommit and err. Since we don't care about the VEventf for injected errors we can just return whenever err != nil. See https://github.com/cockroachdb/cockroach/pull/22721/files#diff-d9528d5f2c652eaedc7f587fda1123bcR1122.

Done.


pkg/sql/conn_executor.go, line 1701 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

So that you can get rid of the goto!

Done.


pkg/sql/conn_executor.go, line 1822 at r4 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Please put this in a comment here.

Well I don't want to put a comment whenever this method is called. It thing is explained on the method comment, and also the name having "Reset" in it is suggestive enough I think.


pkg/sql/conn_executor.go, line 432 at r8 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Is this needed? Isn't it the zero value?

Done.


pkg/sql/conn_executor.go, line 454 at r8 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
r := recover()
ex.closeWrapper(ctx, r)

Done.


pkg/sql/conn_executor.go, line 1231 at r8 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

I still think we should move all methods related to execStmt (all of the runXYZ methods) to a conn_executor_exec.go file, but that doesn't need to block this PR.

Done.


pkg/sql/conn_io.go, line 259 at r8 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Also, the order in which each of these Commands/CommandResults is defined should also match the order in which the switch statements generate/consume them.

Done.


Comments from Reviewable

@knz
Copy link
Contributor

knz commented Feb 19, 2018

:lgtm: modulo resolution of nathan's comments.
I have a few comments too but these can be addressed in a follow-up PR.


Reviewed 10 of 56 files at r3, 2 of 12 files at r5, 15 of 31 files at r7, 32 of 33 files at r8, 2 of 4 files at r9.
Review status: 50 of 52 files reviewed at latest revision, 18 unresolved discussions, some commit checks pending.


pkg/sql/app_stats.go, line 291 at r8 (raw file):

) []roachpb.CollectedStatementStatistics {
	s.Lock()
	defer s.Unlock()

The locking did not need to change in this PR.


pkg/sql/conn_executor.go, line 628 at r8 (raw file):

		// prep stmts as they were. We do this by taking a snapshot every time
		// txnRewindPos is advanced. Prepared statements are shared between the two
		// collections, but these collections are periodically reconciled.

Define "periodically" here.


pkg/sql/conn_executor.go, line 1055 at r8 (raw file):

		if advInfo.code != advanceOne {
			panic(fmt.Sprintf("unexpected advanceCode: %s", advInfo.code))

I'd rather you have this log/report an error and close the connection instead of shutting down the entire node.

Otherwise still, log.Fatalf is slightly better than panic.


pkg/sql/conn_executor.go, line 1447 at r8 (raw file):

	if retriable {
		if _, inOpen := ex.machine.CurState().(stateOpen); !inOpen {
			panic(fmt.Sprintf("retriable error in unexpected state: %#v",

ditto log.Fatalf


pkg/sql/conn_executor.go, line 2259 at r8 (raw file):

		})
		if commErr != nil {
			return commErr

SetError(commErr) here - the exec logs must show this information.


pkg/sql/results_writer.go, line 310 at r8 (raw file):

func (b *bufferedWriter) CloseResult() error {
	if !b.resultInProgress {
		panic("no result in progress")

return pgerror.NewErrorf(pgerror.CodeInternalError, "programming error: no result in progress") ?


pkg/sql/txn_restart_test.go, line 362 at r8 (raw file):

	if shouldAbort {
		if err := ta.abortTxn(ri.key); err != nil {
			panic(fmt.Sprintf("TxnAborter failed to abort: %s", err))

return an error + fail the connection, do not panic


pkg/sql/pgwire/command_result.go, line 212 at r8 (raw file):

	}
	if r.err != nil {
		panic("can't send row after error")

log / report error but avoid shutting down node


Comments from Reviewable

@knz
Copy link
Contributor

knz commented Feb 19, 2018

In general I think we have a general panic problem and this PR is making it worse overall. This will need to be addressed / improved upon. But do not let this block this PR.


Review status: 50 of 52 files reviewed at latest revision, 18 unresolved discussions, some commit checks pending.


Comments from Reviewable

@andreimatei
Copy link
Contributor Author

I thought in SQL we like panics over Fatalf because of that panic handler that anonymizes the query and sends a crash report, no?
Generally this PR reduces the number of fatals/panics in this module, because a lot of things return errors and close the connection (which before wasn't an option; most code did not have a way to close the connection). But there are still some; sometimes it was a judgement call.


Review status: 50 of 52 files reviewed at latest revision, 18 unresolved discussions, some commit checks pending.


Comments from Reviewable

@knz
Copy link
Contributor

knz commented Feb 19, 2018

I do not have a clean answer to the question of panic vs fatalf. I do believe however that both/either should be avoided and replaced by an error log + reported event + conn close, without crashing the node. This PR does not move us in that direction, and actually moves us a bit further because the code path needed to make this happen now becomes more complex. I suggest we work on this in the coming two months.


Review status: 50 of 52 files reviewed at latest revision, 18 unresolved discussions, some commit checks pending.


Comments from Reviewable

@andreimatei
Copy link
Contributor Author

Review status: 50 of 52 files reviewed at latest revision, 18 unresolved discussions, all commit checks successful.


pkg/sql/app_stats.go, line 291 at r8 (raw file):

Previously, knz (kena) wrote…

The locking did not need to change in this PR.

how do you mean? It didn't really change...


pkg/sql/conn_executor.go, line 2259 at r8 (raw file):

Previously, knz (kena) wrote…

SetError(commErr) here - the exec logs must show this information.

this is done at a higher layer, in dispatchToExecutionEngine()


pkg/sql/txn_restart_test.go, line 362 at r8 (raw file):

Previously, knz (kena) wrote…

return an error + fail the connection, do not panic

testing code, doesn't really matter


Comments from Reviewable

@andreimatei
Copy link
Contributor Author

This PR does not move us in that direction, and actually moves us a bit further because the code path needed to make this happen now becomes more complex.

Not sure what you're referring to :). A lot of panics have been transformed to errors, like you say. I think it most definitely takes us in the direction you want.
OTOH, I think we should exaggerate with that. I think generally most "unexpected" things should be panics. Specific unexpected cases that seem unlikely to indicate corruption beyond one connection can be tolerated more.
But even then it's a judgement call and not everybody chooses the same - for example, Nathan felt pretty strongly that an unexpected event received by the state machine should result in a panic.


Review status: 50 of 52 files reviewed at latest revision, 18 unresolved discussions, all commit checks successful.


Comments from Reviewable

@knz
Copy link
Contributor

knz commented Feb 19, 2018

Nathan felt pretty strongly that an unexpected event received by the state machine should result in a panic.

I fail to agree as long as the connection is closed immediately and any currently open kv txn is not given a chance to commit.


Reviewed 1 of 4 files at r9.
Review status: 51 of 52 files reviewed at latest revision, 17 unresolved discussions, all commit checks successful.


pkg/sql/app_stats.go, line 291 at r8 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

how do you mean? It didn't really change...

The party line is to avoid defers if there's a single control path out of the critical section. There was no need to change that here.


pkg/sql/conn_executor.go, line 2259 at r8 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

this is done at a higher layer, in dispatchToExecutionEngine()

Thanks.


pkg/sql/txn_restart_test.go, line 362 at r8 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

testing code, doesn't really matter

'doh. Carry on.


Comments from Reviewable

@petermattis
Copy link
Collaborator

@andreimatei, @knz See #16479 for a reason to prefer log.Fatal over panic. If only panics include anonymized queries, we could investigate ways to alleviate that. This comment is not intended to motivate changing anything in this PR.


Review status: 51 of 52 files reviewed at latest revision, 15 unresolved discussions, all commit checks successful.


Comments from Reviewable

@andreimatei
Copy link
Contributor Author

Merging this and the cherry-pick. Will address the remaining comments here in another PR.


Review status: 51 of 52 files reviewed at latest revision, 15 unresolved discussions, all commit checks successful.


Comments from Reviewable

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants