sql: sending the wrong drain signal #22630
Comments
I think that's fine. The postgres docs also say that:
So it seems that both are accepted. The decision to send back a I think it might be good to support sending back

Re the balancer issues, the issue isn't our implementation of the postgres protocol, but a mix of performing a rolling upgrade too quickly and not transferring raft leadership. There definitely might be issues down the line, but I don't think this is one of them.
Neither of these two sources suggests to me that sending an error to an unsuspecting client is fine.
In the
This I don't want to talk too much about pgpool, because I don't think that it is a load balancer that we should be focusing on (I'm skeptical about whether it is actually used in production instead of e.g. AWS' ELB or haproxy) but when
What we're talking about is not whether the
I've traced a Postgres 10 server and it would appear that it always sends

What we do when we're in the middle of a query is a bit different - we send a "context canceled" error with the internal error code and then we send the

It's interesting that Postgres doesn't send squat when the operator does
@andreimatei writes:
I believe @asubiotto you're looking at some draining issues where the integration with some balancer isn't working right; I've stumbled upon the signals we're sending clients on draining, and I think they're wrong. Maybe this explains why the balancer isn't playing nice.
TL;DR I think we're sending the wrong message type to notify clients of the draining.
The protocol docs say this:
It is possible for NoticeResponse messages to be generated due to outside activity; for example, if the database administrator commands a “fast” database shutdown, the backend will send a NoticeResponse indicating this fact before closing the connection. Accordingly, frontends should always be prepared to accept and display NoticeResponse messages, even when the connection is nominally idle.
However, our code is this:
cockroach/pkg/sql/pgwire/server.go
Line 429 in 679c565
We're not sending a "NoticeResponse"; we're sending a plain error packet. I believe they're different.
Besides potentially not talking the language that balancers and drivers understand, I think this might also break clients - in my testing of other stuff I've noticed that lib/pq, for one, freaks out when it receives an unexpected error (i.e. when it gets an error but it's not in the context of running the specific types of commands that permit errors). On the other hand, clients apparently need to accept NoticeResponses at all times.
Does this sound right?