Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[#24317] docdb: Fix race between commit/abort and old transaction cle…
…anup Summary: There exists a race condition between commit/abort path and old transaction cleanup for promoted transactions, where commit/abort path observes that old transaction cleanup is still ongoing and sets `cleanup_waiter_`, but old transaction cleanup finishes before `cleanup_waiter_` is set, resulting in the waiter never getting called. This is the cause of occasional failures of GeoTransactionsPromotionTest.TestPromotionReturningToAbortedState and GeoTransactionsPromotionRF1Test.TestTwoTabletPromotionFailure with the following stack: ``` F20241007 19:23:50 ../../src/yb/rpc/rpc.cc:339] Check failed: calls_.empty() Calls: [0x000035bd35b49d60 -> AbortTransaction: tablet_id: "36df9b80658448848075dd10894f489c" transaction_id: "`\2051\'S&K\333\267[<\276\373P\302\207" propagated_hybrid_time: 7079338864676978688, retrier: { task_id: -1 state: kFinished deadline: 314126.220s }] *** Check failure stack trace: *** @ 0x7fa6097965c0 google::LogMessage::SendToLog() @ 0x7fa609796c00 google::LogMessage::Flush() @ 0x7fa609799979 google::LogMessageFatal::~LogMessageFatal() @ 0x7fa60990b545 yb::rpc::Rpcs::Shutdown() @ 0x7fa60b40216d yb::client::TransactionManager::Impl::~Impl() @ 0x7fa60b3fdfad yb::client::TransactionManager::~TransactionManager() @ 0x7fa60cfe6f6e yb::tserver::DbServerBase::~DbServerBase() @ 0x7fa60d0b702e yb::tserver::TabletServer::~TabletServer() @ 0x7fa60e3e3393 yb::tserver::MiniTabletServer::Shutdown() @ 0x7fa60e561a05 yb::MiniCluster::Shutdown() @ 0x7fa60e595aca yb::YBMiniClusterTestBase<>::DoTearDown() @ 0x7fa60de00c1d testing::internal::HandleExceptionsInMethodIfSupported<>() @ 0x7fa60dde78c7 testing::TestInfo::Run() @ 0x7fa60dde8575 testing::TestSuite::Run() @ 0x7fa60ddf7e4e testing::internal::UnitTestImpl::RunAllTests() @ 0x7fa60de0190d testing::internal::HandleExceptionsInMethodIfSupported<>() @ 0x7fa60ddf795f testing::UnitTest::Run() @ 0x7fa60de81177 main @ 0x7fa607a29d90 (unknown) @ 0x7fa607a29e40 __libc_start_main @ 0x55f94fc7e325 _start ``` Jira: DB-13207 Test Plan: Jenkins. Also ran GeoTransactionsPromotionTest.TestPromotionReturningToAbortedState 500x and ensured above stack did not appear. Reviewers: sergei Reviewed By: sergei Subscribers: rthallam, ybase Differential Revision: https://phorge.dev.yugabyte.com/D38789
- Loading branch information