Large / slow transaction terminates process without throwing exception #658
That error in your db logs appears if the client connection is severed with no more information. That should result in a connection error from your |
I wrapped nearly everything in try/catch as much as I could, and none of the exception handlers get triggered. |
I also use multiple transactions concurrently. Forgot to mention that.
On Fri, Aug 18, 2023 at 9:05 PM, David Danialy wrote:
I am experiencing something which may be related. I have an application which calls a transaction-opening function at high volume. I have observed the following behavior multiple times:

1. Transaction A begins
2. Transaction A begins an upsert query
3. Transaction B begins immediately after
4. Transaction B begins the same upsert query on a different row
5. Transaction A returns from the query
6. Transaction A proceeds normally, executing another insert and committing
7. Transaction B never resolves the initial query, and the transaction remains open in the database

I can see this expressed in AWS RDS Performance Insights, where the query is shown as load on Client:ClientRead, meaning that the transaction remains open, waiting for the client to read the data:

[image: Performance Insights screenshot]
<https://user-images.githubusercontent.com/35443754/261684990-13d80b4f-c6ab-4f43-b6f3-29ed5c55eb57.png>

It seems like the promise never resolves, or the process dies and doesn't throw an error / resolve the promise of the transaction function to close it out.

I have mitigated this for the time being by using a mutex lock on that function (a sketch of such a lock follows the code below), but the ability for concurrency on that function is necessary for scaling.
Here is the flow of the code for that function:

    await this.sql.begin(async (sql) => {
      // the upsert query
      variable = await sql`INSERT ...`
      if (!variable) { throw new Error(`upsert returned null`) }
      if (boolean condition) { variable2.property = variable.property }
      // the insert query
      variable3 = await sql`INSERT INTO "table2" ...`
      if (!variable3) { throw new Error(`insert returned null`) }
      return
    })
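For illustration only: the mutex-based mitigation mentioned above might look roughly like the sketch below. The async-mutex package and all names in it are assumptions; the reporter did not say how their lock is implemented.

```js
// Sketch only (assumed async-mutex package): serialize the transaction-opening
// function so at most one upsert transaction is in flight at a time.
import { Mutex } from 'async-mutex'

const upsertLock = new Mutex()

async function runUpsertTransaction(sql, row) {
  return upsertLock.runExclusive(() =>
    sql.begin(async (tx) => {
      // Placeholder table/columns; the real upsert is in the snippet above.
      const [result] = await tx`
        insert into "table1" (id, value)
        values (${row.id}, ${row.value})
        on conflict (id) do update set value = excluded.value
        returning *`
      if (!result) throw new Error('upsert returned null')
      return result
    })
  )
}
```

This matches the described mitigation by serializing the function, at the cost of the concurrency the reporter says they need for scaling.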
|
Do you know if your issue happens only during concurrent transactions? |
Is it possible that the underlying TCP connection got silently dropped? AWS does that after a few minutes of inactivity if you didn't set up TCP keepalives. |
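For context, this is what TCP keep-alive means at the Node level, shown on a bare socket with the built-in net module. This is illustrative only; whether and how postgres.js exposes a keep-alive setting depends on the version, so check its documentation. Host and port are placeholders.

```js
// Illustrative only: enable TCP keep-alive probes on a raw Node socket so an
// idle connection isn't silently dropped by NAT gateways / load balancers.
import net from 'node:net'

const socket = net.connect({ host: 'db.example.com', port: 5432 }, () => {
  // Start sending keep-alive probes after 30 seconds of inactivity.
  socket.setKeepAlive(true, 30_000)
})
```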
I don't see how that could be the root cause of this issue. It could be an adjacent issue, but in any case that shouldn't cause the client process to exit with return code 0, completely outside of JavaScript control flow. |
I just swapped the query to execute through a different library and it works fine. There is almost certainly some bug in this library causing connection drops. |
Yesterday I had a similar issue. I was using Postgres.js via Drizzle ORM (over SSL) against a Postgres 16 server (Neon) located very far from my local development server (for testing purposes). |
I am currently debugging a similar issue. It is fairly consistent to reproduce in my case, but it seems to be related to slow queries/connections/servers. I'll explain: I have a local test server in Denmark, plus a production server and a Postgres server in the Netherlands. The production server and the Postgres server run within the same network and therefore have a fast connection. Due to the distance between the test server in Denmark and the Postgres server in the Netherlands, the connection between those can be rather slow.

What I noticed is that after a while the test server gets into a state where queries no longer resolve. All queries stay pending, and they also never throw any errors, which causes the services to time out. Debugging from the Postgres server indicates that the connection is closed and that the server doesn't receive anything from the client. Restarting the app resolves the issue until it happens again. Exactly why this happens I don't know yet, but it seems to be related to the closing and opening of connections. Lowering idle_timeout to 30 makes the test server get into the broken state faster (about 15 minutes). However, the exact same code running on the production server hasn't experienced these issues.

I'll be investigating a bit more; feel free to reach out if I can be of any assistance. I guess the first thing to look into would be to stop the queries from failing silently, i.e. to ensure that they throw an error if they can't succeed :-) |
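As a debugging aid for the silent hangs described above (not a fix), a query can be raced against a timer so a stuck promise surfaces as an error instead of pending forever. The timeout value and query below are placeholders.

```js
// Debugging aid, not a fix: surface silently hung queries as errors.
function withTimeout(promise, ms, label) {
  let timer
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms)
  })
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer))
}

// Usage with a placeholder query:
// const rows = await withTimeout(sql`select 1`, 10_000, 'select 1')
```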
I finally found a little time to dig into this, and there is indeed no handling in transactions currently if the connection drops while no query is running. If the connection drops, the transaction is broken, so we should of course throw in such a scenario. I'm working on a fix now. |
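A rough way to reproduce the scenario described above, i.e. the connection dropping while a transaction is open but no query is in flight. The connection string, the timing, and the manual pg_terminate_backend step are placeholders, not a test from the repository.

```js
// Sketch of a manual reproduction: kill the backend while the transaction is
// idle; the expectation is a thrown connection error rather than a hang.
import postgres from 'postgres'

const sql = postgres('postgres://localhost:5432/test', { max: 1 })

await sql.begin(async (tx) => {
  const [{ pid }] = await tx`select pg_backend_pid() as pid`
  console.log(`From another session, run: select pg_terminate_backend(${pid})`)
  // Idle inside the transaction so no query is running when the connection drops.
  await new Promise((resolve) => setTimeout(resolve, 30_000))
  // Before the fix described above, this next query could hang instead of rejecting.
  await tx`select 1`
})

await sql.end()
```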
I don't have a minimal reproduction yet, but I wanted to report this issue anyway.
Basically, what I'm trying to do is insert 100,000 entries into a database in a single transaction, with rollback in case I hit a data-mapping issue partway through.
My code is roughly:
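The original snippet did not survive in this transcript. Purely as a sketch of the pattern described above (table, columns, data, and the mapping step are hypothetical, and sql.begin is assumed as the transaction API):

```js
// Hypothetical sketch, not the reporter's code: insert many mapped entries in
// one transaction so a mapping error rolls the whole batch back.
import postgres from 'postgres'

const sql = postgres(process.env.DATABASE_URL)

// Placeholder data and mapping; in the real case this is ~100,000 entries.
const entries = [{ id: 1, payload: 'a' }, { id: 2, payload: 'b' }]
const mapEntry = (e) => ({ id: e.id, payload: e.payload })

await sql.begin(async (tx) => {
  for (const entry of entries) {
    const row = mapEntry(entry) // a thrown error here aborts and rolls back
    await tx`insert into "entries" (id, payload) values (${row.id}, ${row.payload})`
  }
})

await sql.end()
```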
The output I get is:
I started looking a bit inside the begin implementation; inside scope, this promise never resolves or rejects.
On my postgres server I get the following error in the log:
The interesting part here is that the promise seemingly just stops executing and then the process exits. I wonder if this is a load-balancing issue on my Postgres server; I will need to investigate more. In any case, I would expect at least a client-side error message.