storcon: switch to diesel-async and tokio-postgres #10280
7429 tests run: 7043 passed, 0 failed, 386 skipped (full report)
Flaky tests (5): Postgres 17, Postgres 15, Postgres 14
Code coverage* (full report)
* collected from Rust tests only
The comment gets automatically updated with the latest test results
84230b8 at 2025-01-27T14:10:07.668Z
Not sure what the cause could be. I did a 1:1 translation, except that we don't start a transaction for running the migrations because of weiznich/diesel_async#115. But that happens on an entirely separate connection from the reconciler, where the issue occurs.
I was curious so I tried with 1.85 with AsyncFn. It almost works, but there seems to be a limitation in proving |
I had |
9775558 to 2f14a6e
Made it use r2d2 instead, but to no avail; now it times out in the |
2f14a6e to a661209
Standing on 94ab6c5, with debug logging added, I can see the following when test_check_visibility_map fails:
in storage_controller_db/postgres.log:
So it looks like we can't recover after SerializationFailure. |
Probably something like this:
could fix the issue. At least it makes the test pass reliably for me, even when "ERROR: could not serialize access due to read/write dependencies among transactions" is found in storage_controller_db/postgres.log.
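For readers unfamiliar with the failure mode: under serializable isolation, Postgres reports SQLSTATE 40001 (serialization_failure), and the usual remedy is to re-run the whole transaction. The sketch below illustrates only that general retry pattern; it is not the change proposed in this thread. `DbError` and `retry_serializable` are hypothetical names, and no diesel or diesel-async API is used.

```rust
/// Hypothetical error type standing in for a database driver's error enum.
#[derive(Debug, PartialEq)]
pub enum DbError {
    /// Models SQLSTATE 40001 (serialization_failure); safe to retry.
    SerializationFailure,
    /// Any other failure; returned to the caller without retrying.
    Other(String),
}

/// Runs `txn` at least once and at most `max_attempts` times (when that is
/// greater than one), retrying only when it reports a serialization failure.
/// The closure receives the 1-based attempt number and is expected to run
/// the *entire* transaction again on each call.
pub fn retry_serializable<T, F>(max_attempts: u32, mut txn: F) -> Result<T, DbError>
where
    F: FnMut(u32) -> Result<T, DbError>,
{
    let mut attempt = 1;
    loop {
        match txn(attempt) {
            // Retry only this one error kind, and only while attempts remain.
            Err(DbError::SerializationFailure) if attempt < max_attempts => {
                attempt += 1;
            }
            // Success, a non-retryable error, or retries exhausted.
            other => return other,
        }
    }
}
```

A real caller would typically also make sure the connection's transaction state is reset (or a fresh connection is taken from the pool) before each retry, since a failed transaction must be rolled back before new statements are accepted.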
a661209 to 8e7ce7a
Yup, the test works now when I apply the change. Good, then we don't need r2d2 and can use bb8 instead (bb8 is properly async; r2d2, which we use right now, only works via blocking wrappers). I think this means that once tests pass, we can merge this PR. Thank you so much @alexanderlaw for identifying where the issue was.
By the way, maybe you would also like to get rid of lines 67 to 68 in 9f1408f within this PR, as PQ_LIB_DIR is specified there only for pq-sys, which should not be used after this change.
Looks good. Nice to have one less spawn_blocking!
I didn't review the modified Persistence functions line-by-line, presuming they are all just wrapping in box-futures.
I took a look at our new dependencies' repos & they all seem to be actively maintained projects.
Should we remove
|
@bayandin I agree we should remove it from the other places too, I'll file a followup. |
…10592) There was a regression of #10280, tracked in [#23583](neondatabase/cloud#23583). I have ideas how to fix the issue, but we are too close to the release cutoff, so revert #10280 for now. We can revert the revert later :).
Update to a rebased version of our rust-postgres patches, rebased on [this](sfackler/rust-postgres@98f5a11) commit this time. With #10280 reapplied, this means that the rust-postgres crates will be deduplicated, as the new crate versions are finally compatible with the requirements of diesel-async. Earlier update: #10561 rust-postgres PR: neondatabase/rust-postgres#39
Successor of #10280 after it was reverted in #10592. Re-introduce the usage of diesel-async again, but now also add TLS support so that we connect to the storcon database using TLS. By default, diesel-async doesn't support TLS, so add some code to make us explicitly request TLS. cc neondatabase/cloud#23583
Switches the storcon away from using diesel's synchronous APIs in favour of `diesel-async`.

Advantages:

* Fewer C dependencies, especially no openssl, which might be behind the bug: https://github.com/neondatabase/cloud/issues/21010
* Better to only have async than a mix of async plus `spawn_blocking`

We had to turn off usage of the connection pool for migrations, as diesel migrations don't support async APIs. Thus we still use `spawn_blocking` in that one place. But this is explicitly done in one of the `diesel-async` examples.
We now don't need libpq any more for the build of the storage controller, as we use `diesel-async` since neondatabase#10280. Therefore, we remove the env var that gave cargo/rustc the location for libpq. Follow-up of neondatabase#10280