RPC server: thread safety (+ small fix to on_getblockhash) #7936
- grab an lmdb read transaction guard to ensure the thread executing on_get_info reads consistent data from the db + other low hanging fruit
If a readtxn is being held, then that's not possible.
Right and I was thinking it seems like you would want it to be possible, which is why I didn't use a readtxn there, or in other places that behaved similarly. Edit: sorry, poorly worded on my part.
No, you wouldn't want that. You should want all state changes to be all-or-nothing, with no visible in-between conditions. Meanwhile, for the theoretical case of a key image appearing after the client checked - that could happen anyway, just depending on luck and the exact times a txn arrives vs a client querying.
- fixed on_getblockhash error resp return when requested height >= blockchain height
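The behavior this commit describes can be pictured roughly as follows. This is a minimal, self-contained sketch and not the actual `core_rpc_server` code: `FakeChain`, `RpcError`, and `get_block_hash_checked` are hypothetical names, the error code is illustrative, and hashes are plain strings.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the JSON-RPC error object.
struct RpcError {
    int code = 0;
    std::string message;
};

// Hypothetical stand-in for blockchain storage: vector index == block height.
struct FakeChain {
    std::vector<std::string> hashes;
    std::uint64_t height() const { return hashes.size(); }
};

// Returns the hash at height `h`. When `h` is at or beyond the current
// blockchain height, fills `err` and returns nullopt instead of handing
// back an all-zero hash.
std::optional<std::string> get_block_hash_checked(const FakeChain& chain,
                                                  std::uint64_t h,
                                                  RpcError& err)
{
    const std::uint64_t top = chain.height();
    if (h >= top) {
        err.code = -2; // illustrative value only
        err.message = "Requested height " + std::to_string(h) +
                      " is not below current blockchain height " + std::to_string(top);
        return std::nullopt;
    }
    return chain.hashes[h];
}
```

The point of the sketch is only the early return: the caller sees an explicit error for an out-of-range height rather than a value that looks like a valid hash.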
12b885f to 38e1ec1
I have two questions regarding this PR: First, is this still really waiting for a review, is there still a goal to get this (or its "master" branch equivalent) merged eventually, or did some reason surface for not pursuing this further? If it's still waiting for review I may try my luck and hopefully help towards a goal of getting it merged. Second, out of curiosity: Does this aim to work towards a state where several clients can share a single RPC daemon instance without problems? If yes, does this already come close, or is there still a lot of work waiting for later to achieve that?
It's simply missing a review :)
Looks good.
I checked whether there are "on_" procedures that set a lock now thanks to these changes and call other "on_" procedures that in turn try to set their own lock which then would probably result in a deadlock, but did not find any.
I approve already because I did not find any really important changes to propose.
src/rpc/core_rpc_server.cpp
Outdated
@@ -2117,6 +2127,7 @@ namespace cryptonote
        return r;

      CHECK_PAYMENT_MIN1(req, res, COST_PER_BLOCK_HEADER, false);
      db_rtxn_guard rtxn_guard(&m_core.get_blockchain_storage().get_db());
This could be moved a bit further down, about 8 lines, to after the test about restriction.
src/rpc/core_rpc_server.cpp
Outdated
@@ -2185,6 +2196,7 @@ namespace cryptonote
      if (use_bootstrap_daemon_if_necessary<COMMAND_RPC_GET_BLOCK_HEADERS_RANGE>(invoke_http_mode::JON_RPC, "getblockheadersrange", req, res, r))
        return r;

      db_rtxn_guard rtxn_guard(&m_core.get_blockchain_storage().get_db());
Of course not terribly important, but if we are at it anyway already, the test about restriction could come first, then the lock, and only then the check against current blockchain height.
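The ordering suggested here (restriction test, then the guard, then the height check) could look roughly like this. It is a sketch under stated assumptions, not the real handler: `FakeDb`, `ReadGuard`, `RangeRequest`, and `handle_headers_range` are hypothetical names, and the guard only counts how often it is opened and closed.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for the database; records guard activity.
struct FakeDb {
    std::uint64_t height = 0;
    int guards_opened = 0;
    int guards_closed = 0;
};

// Hypothetical RAII guard, loosely modelled on the idea behind db_rtxn_guard.
class ReadGuard {
public:
    explicit ReadGuard(FakeDb& db) : db_(db) { ++db_.guards_opened; }
    ~ReadGuard() { ++db_.guards_closed; }
    ReadGuard(const ReadGuard&) = delete;
    ReadGuard& operator=(const ReadGuard&) = delete;
private:
    FakeDb& db_;
};

struct RangeRequest {
    std::uint64_t start_height;
    std::uint64_t end_height;
};

// Returns an empty string on success, otherwise an error message.
std::string handle_headers_range(FakeDb& db, const RangeRequest& req,
                                 bool restricted, std::uint64_t max_restricted_count)
{
    // 1. Checks that need no database access come first.
    if (req.end_height < req.start_height)
        return "Invalid start/end heights";
    // (end - start) >= max is equivalent to count > max and cannot overflow.
    if (restricted && req.end_height - req.start_height >= max_restricted_count)
        return "Too many block headers requested in restricted mode";

    // 2. Only now open the read guard.
    ReadGuard guard(db);

    // 3. Compare against a height that is read while the guard is held.
    if (req.end_height >= db.height)
        return "Invalid start/end heights";
    return "";
}
```

With this order, a request rejected by the restriction test never opens a read transaction at all, and the height used for validation is read under the same guard as the data that follows.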
@@ -2606,6 +2622,7 @@ namespace cryptonote
  bool core_rpc_server::on_get_coinbase_tx_sum(const COMMAND_RPC_GET_COINBASE_TX_SUM::request& req, COMMAND_RPC_GET_COINBASE_TX_SUM::response& res, epee::json_rpc::error& error_resp, const connection_context *ctx)
  {
    RPC_TRACKER(get_coinbase_tx_sum);
    db_rtxn_guard rtxn_guard(&m_core.get_blockchain_storage().get_db());
This could get the blockchain db locked for an awfully long time, which might hold up all other things quite a bit, but at least the sum would be correct to the piconero :)
It's using a read-only LMDB transaction to ensure consistent reads of the db, which doesn't lock out writers. From the LMDB docs:
as long as a transaction is open, a consistent view of the database is kept alive, which requires storage. A read-only transaction that no longer requires this consistent view should be terminated (committed or aborted) when the view is no longer needed (but see below for an optimization).
There can be multiple simultaneously active read-only transactions but only one that can write. Once a single read-write transaction is opened, all further attempts to begin one will block until the first one is committed or aborted. This has no effect on read-only transactions, however, and they may continue to be opened at any time.
I also tested behavior in this PR generally by using usleep + a local test env so I can mine blocks quickly. Can test behavior like this for example:
    bool core_rpc_server::on_get_coinbase_tx_sum(const COMMAND_RPC_GET_COINBASE_TX_SUM::request& req, COMMAND_RPC_GET_COINBASE_TX_SUM::response& res, epee::json_rpc::error& error_resp, const connection_context *ctx)
    {
      RPC_TRACKER(get_coinbase_tx_sum);
      db_rtxn_guard rtxn_guard(&m_core.get_blockchain_storage().get_db());
      const uint64_t bc_height = m_core.get_current_blockchain_height();
      printf("Height before sleep: %lu\n", bc_height);
      // sleep, then mine some blocks while asleep, then make sure height still reads the same after waking up
      usleep(1000 * 1000 * 30);
      // can even call status (or other RPC calls) from the daemon while asleep and see *those* would return latest height before this next line prints
      printf("Height after sleep: %lu\n", m_core.get_current_blockchain_height());
I could even make an RPC call to on_get_coinbase_tx_sum again and grab another read txn guard, and the daemon proceeds to fall asleep in that call too, while the first on_get_coinbase_tx_sum call was still asleep.
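The behavior observed in this test (a reader keeps its old view while blocks are added, and a second reader can start at any time) can be imitated with a toy model. This is not how LMDB is implemented and uses no LMDB API; `Snapshot`, `ToyStore`, and `ToyReadGuard` are hypothetical names, and whole-state copies stand in for LMDB's internal versioning.

```cpp
#include <cstdint>
#include <memory>
#include <mutex>

// One immutable version of the (toy) chain state.
struct Snapshot {
    std::uint64_t height = 0;
    std::uint64_t coinbase_sum = 0;
};

class ToyStore {
public:
    ToyStore() : current_(std::make_shared<const Snapshot>()) {}

    // Writer: build the next version off to the side, then publish it in one step.
    // The mutex is held only for the brief copy-and-swap, never for a reader's lifetime.
    void add_block(std::uint64_t reward) {
        std::lock_guard<std::mutex> lock(mutex_);
        Snapshot next = *current_;
        next.height += 1;
        next.coinbase_sum += reward;
        current_ = std::make_shared<const Snapshot>(next);
    }

    // Reader: pin whatever version is current right now.
    std::shared_ptr<const Snapshot> pin() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return current_;
    }

private:
    mutable std::mutex mutex_;
    std::shared_ptr<const Snapshot> current_;
};

// RAII read guard: every read through it sees the version pinned at construction.
class ToyReadGuard {
public:
    explicit ToyReadGuard(const ToyStore& store) : view_(store.pin()) {}
    const Snapshot& view() const { return *view_; }
private:
    std::shared_ptr<const Snapshot> view_;
};
```

A guard created before two `add_block` calls still reports the old height and sum, while a guard created afterwards sees the new ones. As the quoted LMDB documentation notes, keeping an old view alive costs storage, which is why a guard should not outlive the reads that need it.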
Interesting. So experience from other databases does not carry over 1-to-1 to LMDB, as it can do more in parallel than databases usually can.
And yeah, your test here really settles the issue.
Yes, LMDB is rather unique in its capabilities.
Thanks @rbrunner7 :)
Hmmmm, sort of. This ensures that a single client's request will have a consistent view of the db. For example in
Closing in favor of #7937. That PR is to master, this PR is to the release branch.
Overview
From @moneromooo-monero in IRC:
From @hyc :
RPC Functions made thread safe
I used an rtxnguard to make the following functions thread safe:
on_getblockhash would return a 0 hash when the height requested exceeds blockchain height.
Will make the following thread safe in separate PR(s)
on_get_transactions, will patch separately as well
EDIT: I updated this PR based on @hyc's comments below. When I first submitted the PR, I only included the db_rtxn_guard in get_info; now the PR includes it in more functions so that there are no visible in-between conditions on db reads. I will wait to squash commits until the PR looks good to go, so changes from my prior commit before @hyc's comments are accessible.
I originally included the following comment in the PR description (which is what @hyc was responding to below):
I understand how this logic was not ideal for a number of reasons.
I didn't include on_is_key_image_spent in this PR since I realized spent key images are read from memory when checking the pool, so it seems it'll need a bigger change to make thread safe.