
blockchain sync: reduce disk writes from 2 to 1 per tx #9135

Open: wants to merge 5 commits into master

Conversation

@jeffro256 (Contributor) commented Jan 24, 2024

Summary

Pros:

  • During sync, instead of performing one write, then one read, then another write for each tx in the chain, we write each tx only once. This increases the lifespan of the disk and speeds up badly buffered / unbuffered I/O. On a newer NVMe drive with a Ryzen 9 3900X, blockchain sync was around 3-4% faster. The difference will be more pronounced on systems bottlenecked by disk speed.
  • This PR remains backwards compatible with receiving NOTIFY_NEW_BLOCK commands, but the code paths between handle_notify_new_block and handle_notify_new_fluffy_block are merged for a smaller code surface and less review time.

Cons:

  • Complicated review

Hopefully this will move monerod towards being slightly more workable for hard drives in the future.

Design

New: cryptonote::ver_non_input_consensus()

I have created a function cryptonote::ver_non_input_consensus() in tx_verification_utils that checks all consensus rules for a group of transactions besides the checks in Blockchain::check_tx_inputs(). For Blockchain::handle_block_to_main_chain, this is the condition that txs must satisfy before their inputs are checked and they are added to blocks. This function is the most important component and MUST be correct, or otherwise chain splits / inflation could occur. To audit the correctness of this function, start at cryptonote::core::handle_incoming_txs() in the old code and step through all of the rules checked until the end of cryptonote::tx_memory_pool::add_tx(). cryptonote::ver_non_input_consensus() should cover all of those rules.
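
For a rough sense of its shape, here is an illustrative sketch only (the signature and the per-tx helper are assumptions, not the literal code in this PR):

  // Illustrative sketch; assumes cryptonote::transaction and
  // tx_verification_context from the existing codebase.
  bool ver_non_input_consensus(const std::vector<transaction>& txs,
                               tx_verification_context& tvc,
                               const std::uint8_t hf_version)
  {
    for (const transaction& tx : txs)
    {
      // Every consensus rule EXCEPT Blockchain::check_tx_inputs():
      // size/weight limits, allowed tx version and output types for
      // hf_version, amount overflow, and the semantic rules formerly run in
      // core::handle_incoming_txs() and tx_memory_pool::add_tx().
      if (!ver_non_input_consensus_single(tx, tvc, hf_version)) // assumed helper
        return false;
    }
    return true;
  }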

Modified: core::handle_incoming_tx[s]()

Before, cryptonote::core::handle_incoming_txs() was responsible for parsing all txs (inside blocks and for the pool), checking their semantics, passing those txs to the mempool, and notifying ZMQ. Now, cryptonote::core::handle_incoming_txs() is deleted and there is only cryptonote::core::handle_incoming_tx(), which is basically just a wrapper around tx_memory_pool::add_tx() that additionally triggers ZMQ events. It is only called for new transaction notifications from the protocol handler (not block downloads).
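
In other words, the post-PR flow is roughly the following (a hedged sketch; the parameter list and the notification hook name are assumptions):

  // Illustrative only: handle_incoming_tx() as a thin wrapper around add_tx().
  bool core::handle_incoming_tx(const blobdata& tx_blob, tx_verification_context& tvc,
                                relay_method tx_relay, bool relayed)
  {
    // add_tx() now runs all non-input consensus checks internally via
    // ver_non_input_consensus(), plus the usual relay checks.
    if (!m_mempool.add_tx(tx_blob, tvc, tx_relay, relayed))
      return false;
    // ZMQ is only notified here, i.e. for txs arriving from the protocol
    // handler, not for txs carried inside downloaded blocks.
    if (tvc.m_added_to_pool)
      notify_txpool_event_for(tx_blob); // assumed notification hook
    return true;
  }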

Modified: tx_memory_pool::add_tx()

All of the consensus checks besides Blockchain::check_tx_inputs() inside of add_tx() were removed and replaced with a call to cryptonote::ver_non_input_consensus(). The relay checks remain the same.

Modified: Blockchain::add_block()

add_block() now takes a structure called a "pool supplement", which is simply a map of TXIDs to their corresponding cryptonote::transaction and transaction blob. When handle_block_to_main_chain attempts to take a block's transactions from the transaction pool and that fails, it falls back on taking txs from the pool supplement. The pool supplement has all the non-input consensus rules checked after the PoW check is done. If the block ends up getting handled in Blockchain::handle_alternative_block, then the pool supplement transactions are added to the tx_memory_pool after their respective alt PoW checks.
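
Conceptually, the supplement and the fallback look something like this (type and helper names are illustrative assumptions):

  // Illustrative sketch of the "pool supplement" idea.
  using pool_supplement = std::unordered_map<crypto::hash,
      std::pair<cryptonote::transaction, cryptonote::blobdata>>;

  // Inside handle_block_to_main_chain, per tx hash in the block:
  bool take_block_tx(const crypto::hash& txid, const pool_supplement& supplement,
                     cryptonote::transaction& tx_out, cryptonote::blobdata& blob_out)
  {
    if (take_tx_from_pool(txid, tx_out, blob_out)) // assumed existing pool lookup
      return true;
    const auto it = supplement.find(txid);         // fallback: supplement shipped
    if (it == supplement.end())                    // alongside the block
      return false;                                // tx genuinely unavailable
    tx_out = it->second.first;
    blob_out = it->second.second;
    return true;
  }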

Modified: t_cryptonote_protocol_handler::handle_notify_new_fluffy_block()

The main difference with this function now is that we construct a pool supplement and pass that to core::handle_incoming_block() instead of calling core::handle_incoming_txs() to add everything to the mempool first.

Modified: t_cryptonote_protocol_handler::try_add_next_blocks()

The changes are very similar to the changes made to handle_notify_new_fluffy_block.

Modified: t_cryptonote_protocol_handler::handle_notify_new_block()

Before, this function had separate handling logic, but now we just convert the NOTIFY_NEW_BLOCK request into a NOTIFY_NEW_FLUFFY_BLOCK request and call handle_notify_new_fluffy_block with it. This saves us from having to make the same changes to both code paths.
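
The shim itself is conceptually tiny (a sketch; the exact request fields are assumptions based on the existing NOTIFY structures):

  // Illustrative: forward the legacy command through the fluffy-block path.
  int handle_notify_new_block(int command, NOTIFY_NEW_BLOCK::request& arg,
                              cryptonote_connection_context& context)
  {
    NOTIFY_NEW_FLUFFY_BLOCK::request fluffy_arg{};
    fluffy_arg.current_blockchain_height = arg.current_blockchain_height;
    fluffy_arg.b = std::move(arg.b); // the full tx blobs simply ride along
    return handle_notify_new_fluffy_block(command, fluffy_arg, context);
  }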

@jeffro256 (Contributor Author) commented Jan 25, 2024

I'm thinking about having core::handle_incoming_txs basically do nothing except pass the tx to tx_memory_pool::add_tx, which then passes the transaction through the verify_pool_supplement tests. This would make the code so much more robust against future discrepancy between changes to the pool rules and the verify_pool_supplement rules, but it would require some more refactoring.

@vtnerd (Contributor) left a comment

I stopped my review because I think I found a breaking change to ZMQ - this no longer broadcasts a transaction first seen in a new block. This case is explicitly mentioned in the docs. It looks like txes are still broadcast while chain sync is occurring, so this breaking change makes things really inconsistent.

I think you'll have to carry around a ZMQ object until handle_main_chain to get this working. This could arguably improve how alternate block txes are handled (by not broadcasting them), but then there is the reorg case where txes are seen for the first time on the reorg.

I'm not certain how hard this is to hack together, and I hope we don't have to revert the docs (and thereby make it hell on downstream projects).

fee = tx.rct_signatures.txnFee;
}

const uint64_t fee = get_tx_fee(tx);
Contributor:

This now throws on error instead of returning false. Worth a try/catch (or is this verified elsewhere before this) ?

Contributor Author:

It should be verified inside ver_non_input_consensus(), but it's worth double checking

Contributor:

I didn't see another check for the fee, just a check for overflow on inputs and outputs, done separately for each.

I'm not certain how an exception escaping this function alters the behavior of the code (you'd probably have a better idea than me at this point).

Contributor Author:

In core::check_tx_semantic, we check input overflow, output overflow, but also that inputs sum > outputs sum for v1 transactions.
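
For reference, the v1 rule being referred to is roughly the following (simplified; the amount-summing helpers are assumptions standing in for the codebase's own overflow-checked versions):

  // Simplified illustration of the v1 in/out balance rule mentioned above.
  uint64_t amount_in = 0, amount_out = 0;
  if (!sum_input_amounts(tx, amount_in))    // assumed helper; fails on overflow
    return false;
  if (!sum_output_amounts(tx, amount_out))  // assumed helper; fails on overflow
    return false;
  if (tx.version == 1 && amount_in <= amount_out)
    return false; // v1 txs must satisfy inputs sum > outputs sum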

@jeffro256 force-pushed the bc_sync_skip_mempool branch 2 times, most recently from 10b0c2b to fe370e9 on January 26, 2024 23:47
@vtnerd (Contributor) left a comment

A little more confident that this will work. But I will probably do a third pass after your responses to these questions.

fullConnections.push_back({context.m_remote_address.get_zone(), context.m_connection_id});
}
LOG_DEBUG_CC(context, "PEER SUPPORTS FLUFFY BLOCKS - RELAYING THIN/COMPACT WHATEVER BLOCK");
fluffyConnections.push_back({context.m_remote_address.get_zone(), context.m_connection_id});
@vtnerd (Contributor) commented Jan 28, 2024:

This is forcing fluffy blocks on someone that explicitly requested no fluffy blocks. But losing chain sync until they disable the flag is basically the same thing with more steps.

@0xFFFC0000 (Collaborator) left a comment

@jeffro256 Putting the benchmark results I sent in DM here, until we find which operations are actually causing the slow-down: results-2500blocks-5iter.txt

@0xFFFC0000 (Collaborator):

New performance results: the performance problem with pop_blocks specifically related to this PR has been fixed in the new push. The only remaining issue is a small performance drop for the sync operation. I am attaching the file in case anyone wants to check it.

results-2500blocks-5iter-v2.txt

@jeffro256 (Contributor Author):

I think I've found a reason why the sync time of this PR looks slower than the sync time of master in that test script: between the call to pop_blocks and flush_txpool, which is several seconds in some cases, the master node can use the popped txs already inside the mempool to skip most of the checks (especially Blockchain::check_tx_inputs) before validating a block, which gives it a significant boost. This state won't happen organically during normal sync, so the test script doesn't quite capture the normal behavior during sync when you don't already have the txs in the mempool.

To fix the script, instead of doing:

  1. pop_blocks
  2. flush_txpool
  3. Wait for sync

You could do:

  1. Start monerod offline
  2. pop_blocks
  3. flush_txpool
  4. stop_daemon
  5. Start monerod online
  6. Wait for sync

This does have the downside of including the start-up time in the sync time, and the choice of peers on the new instance may affect the speed at which it syncs, but you could minimize these effects by increasing the number of popped blocks.

@vtnerd (Contributor) left a comment

This is looking pretty good. Mainly curious about your response to one more ZMQ related thing - I think we'll have to accept it as a new "feature" of the design.

@@ -1196,7 +1198,7 @@ bool Blockchain::switch_to_alternative_blockchain(std::list<block_extended_info>
block_verification_context bvc = {};

// add block to main chain
bool r = handle_block_to_main_chain(bei.bl, bvc, false);
Contributor:

Marker for me to remember to investigate further. The false bool was to prevent notifications of a failed reorg - hopefully the new code retains the same consideration.

Contributor:

Actually I have an open question about this - see #6347 . But it looks like notify was being ignored, and this is just cleanup on that?

Contributor Author:

notify was being ignored in the newest versions of master; this PR reflects that behavior, but I don't know if that was the original intended behavior... it looks like it wasn't.

@@ -390,20 +329,29 @@ namespace cryptonote
++m_cookie;

MINFO("Transaction added to pool: txid " << id << " weight: " << tx_weight << " fee/byte: " << (fee / (double)(tx_weight ? tx_weight : 1)) << ", count: " << m_added_txs_by_id.size());
if (tvc.m_added_to_pool && meta.matches(relay_category::legacy))
m_blockchain.notify_txpool_event({txpool_event{
Contributor:

I think this now notifies during a "return to txpool" call, where it wasn't being notified in that situation previously. The documentation doesn't list anything about this particular case, so we may have to accept this change. It's a rather rare edge case anyway.

Contributor Author:

I can make this conditional on !kept_by_block which would prevent notifications on return to txpool and reorgs.

Contributor Author:

Never mind that comment; this would cause alt block handling to not notify.


const crypto::hash &tx_hash = new_block.tx_hashes[tx_idx];

blobdata tx_blob;
std::vector<blobdata> tx_blobs;
Contributor:

Small nitpick on performance, you can move tx_blobs and missed_txs before the loop, and .clear() right here.
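
That is, roughly:

  // Sketch of the suggested hoisting: declare once, clear per iteration so the
  // vectors keep their allocated capacity across iterations.
  std::vector<cryptonote::blobdata> tx_blobs;
  std::vector<crypto::hash> missed_txs;
  for (size_t tx_idx = 0; tx_idx < new_block.tx_hashes.size(); ++tx_idx)
  {
    tx_blobs.clear();
    missed_txs.clear();
    const crypto::hash &tx_hash = new_block.tx_hashes[tx_idx];
    // ... existing per-tx lookup logic ...
  }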

Contributor Author:

This is a readability thing for me, but I personally don't like making the scope of variables any wider than it needs to be, especially for such an already complex function. If there's evidence that it measurably impacts performance, however, I would definitely be okay with changing it to what you're suggesting.

relevant link

@jeffro256 (Contributor Author):

Okay @vtnerd, the last commits should hopefully handle ZMQ tx notifications better. We only notify when A) an incoming relayed transaction is new and added to the pool, B) a tx from a pool supplement was used to add a block, or C) an alt block contained a new tx that was added to the pool.

@j-berman (Collaborator) left a comment

This is looking solid -- I have mostly nits + 1 comment on the latest zmq change


// Cache the hard fork version on success
if (verified)
ps.nic_verified_hf_version = hf_version;
Collaborator:

The ps is const?

Contributor Author:

nic_verified_hf_version is marked mutable
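
So writing through the const reference is legal. Roughly (the struct shape below is an assumption for illustration, not the exact struct in the PR):

  // Why the const& works: the cache member is mutable by design.
  struct pool_supplement_t
  {
    // ... txid -> (transaction, blob) map, etc. ...
    mutable std::uint8_t nic_verified_hf_version = 0; // 0 = not yet verified
  };

  void cache_verified_hf(const pool_supplement_t& ps, const std::uint8_t hf_version)
  {
    ps.nic_verified_hf_version = hf_version; // allowed on a const object: mutable member
  }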


const std::unordered_set<crypto::hash> blk_tx_hashes(blk.tx_hashes.cbegin(), blk.tx_hashes.cend());

Collaborator:

I think it would be nice to have a check here that blk_entry.txs.size() == blk_tx_hashes.size() && blk_tx_hashes.size() == blk.tx_hashes.size()

This guarantees there aren't duplicates and that all blk_tx_hashes will map 1-to-1 with tx_entries. I can't find if this exact check is done somewhere else (probably is), but figure this would be a good early place for it anyway (either here or immediately after make_pool_supplement_from_block_entry inside try_add_next_blocks).

Contributor Author:

There's a check in make_pool_supplement_from_block_entry that all deserialized transactions belong to that block.

Collaborator:

!blk_tx_hashes.count(tx_hash) in the make_pool_supplement_from_block_entry above this one checks that for all tx_entries, there's at least 1 matching block hash. Strictly going off that check (and ignoring all other code), it appears there could still be duplicates in this section's blk_entry.txs and blk.tx_hashes, and separately blk.tx_hashes could also have more hashes than are present in blk_entry.txs (which is the expected case when the make_pool_supplement_from_block_entry above handles tx_entries from a new fluffy block, not when syncing a block). In combination with the check you mentioned above, making sure all the container sizes are equal after constructing the set in this function immediately makes sure that when syncing a block, there aren't duplicate blk_entry.txs and that blk.tx_hashes captures all blk_entry.txs 1-to-1.

I don't see anything wrong with not doing the size check here, but it's a bit of a pain to verify there aren't issues surrounding this, and it seems an easy thing to check here.

Contributor Author:

Okay, yeah, you were right; I mistakenly thought that making sure each tx is bound to the block would prevent dups. Technically, that also doesn't check for dups. See the latest commit for an update: we now check that for all pool supplements the number of tx entries is less than or equal to the number of hashes. For full blocks, we check that they are equal.

Collaborator:

One edge case: if there's a dup in blk.tx_hashes, the equivalent dup in blk_entry.txs, and an extra hash in blk.tx_hashes, then the function would still return true with the dup included in the pool_supplement

Also checking blk_tx_hashes.size() == blk.tx_hashes.size() should prevent that
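
Putting the two checks together, the guard being discussed is roughly the following (illustrative; is_full_block is an assumed flag distinguishing the sync path from a fluffy block):

  // Illustrative duplicate/mismatch guard while building a pool supplement.
  const std::unordered_set<crypto::hash> blk_tx_hashes(blk.tx_hashes.cbegin(),
                                                       blk.tx_hashes.cend());
  if (blk_tx_hashes.size() != blk.tx_hashes.size())
    return false; // the block itself lists duplicate tx hashes
  if (blk_entry.txs.size() > blk_tx_hashes.size())
    return false; // more tx blobs supplied than the block references
  if (is_full_block && blk_entry.txs.size() != blk_tx_hashes.size())
    return false; // sync path: tx blobs must map 1-to-1 onto block hashes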

fullConnections.push_back({context.m_remote_address.get_zone(), context.m_connection_id});
}
LOG_DEBUG_CC(context, "PEER SUPPORTS FLUFFY BLOCKS - RELAYING THIN/COMPACT WHATEVER BLOCK");
fluffyConnections.push_back({context.m_remote_address.get_zone(), context.m_connection_id});
Collaborator:

--no-fluffy-blocks meant a node wouldn't send fluffy blocks to its peers, not that a node wouldn't be relayed fluffy blocks from its peers (can quickly sanity check with monerod --no-fluffy-blocks --log-level 1 and see that new blocks are still received as fluffy blocks). Someone would have to manually build a v18 monerod that sets the final bit of m_support_flags to 0 in order to ask peers not to relay fluffy blocks to their node.

So to be clear, this PR as is shouldn't prevent current nodes on the network using any v18 release version of monerod from syncing/relaying/processing blocks, even nodes with the --no-fluffy-blocks flag (which still receive fluffy blocks today anyway).

Maybe the log could say "RELAYING FLUFFY BLOCK TO PEER" instead of "PEER SUPPORTS FLUFFY BLOCKS" because it's no longer checking if the peer supports fluffy blocks via the support_flags.

@j-berman (Collaborator) left a comment

I've been running the latest for weeks now, running smooth on my end.

I've also combed through these changes many times now -- thanks for your work on this.

Minor comments in this latest review round; I'm ready to approve after this.

res = daemon2.get_transactions([txid])
assert len(res.txs) == 1
tx_details = res.txs[0]
assert not tx_details.in_pool
@j-berman (Collaborator) commented Mar 26, 2024:

This test fails sporadically on this line on my local machine. I investigated; it looks like an existing bug unrelated to this PR.

If the transfer includes an output in its ring that unlocked in block N, after popping blocks to N-2, that tx is no longer a valid tx because that output isn't unlocked yet (it fails here). You'd expect that once the chain advances in daemon2.generateblocks above, then the tx becomes valid again and should therefore be included in a later block upon advancing, but it looks like this if statement is incorrect:

//if we already failed on this height and id, skip actual ring signature check
if(txd.last_failed_id == m_blockchain.get_block_id_by_height(txd.last_failed_height))
return false;

And it should instead be:

//if we already failed on this height and id, skip actual ring signature check
if(txd.last_failed_id == m_blockchain.get_block_id_by_height(txd.last_failed_height) && txd.last_failed_height >= m_blockchain.get_current_blockchain_height())
  return false;

The ring sigs can become valid again if we're at a higher height than when the tx originally failed, so it should pass that if statement and continue on to the check_tx_inputs step again if so.

EDIT: slight edit to support a reorg making a tx valid

Contributor Author:

You're right about that check being wrong. However, even your proposed changes aren't conservative enough if you want to handle popped blocks: if the chain purely goes backwards in time (which only normally happens when pop_blocks is called), a transaction output with a custom unlock_time might actually UNLOCK. This is because Blockchain::get_adjusted_time() is not monotonic, so an output that is unlocked now may become locked again in a future block.

Collaborator:

Interesting, so this should be good:

if(txd.last_failed_id == m_blockchain.get_block_id_by_height(txd.last_failed_height) && txd.last_failed_height == m_blockchain.get_current_blockchain_height()-1)

@jeffro256 (Contributor Author) commented Apr 10, 2024:

Yes that should be good. If we wanted to get incredibly pedantic, we would also have to check that the hard fork version is greater than or equal to HF_VERSION_DETERMINISTIC_UNLOCK_TIME, since your system's wall-time might also not be monotonic, and consensus validation of a tx with a ring containing an output with a UNIX-interpreted unlock_time isn't necessarily deterministic. But I don't think we should worry about that case.

Collaborator:

Your call

Contributor Author:

We should go with if(txd.last_failed_id == m_blockchain.get_block_id_by_height(txd.last_failed_height) && txd.last_failed_height == m_blockchain.get_current_blockchain_height()-1) IMO, since that wall-time thing won't affect any future or past transactions, it's only a technicality.

@j-berman (Collaborator) left a comment

Approving changes that include this commit: jeffro256@c388e12

Seems to be a GitHub bug that the PR doesn't include that commit.

@selsta (Collaborator) commented Apr 18, 2024

I applied this pull request locally and the 5th commit mentioned in the comments is missing... not sure what's going on.

@vtnerd (Contributor) left a comment

This is really close, but I had a few questions, in tx_pool.cpp in particular.

@@ -5349,6 +5433,12 @@ void Blockchain::set_user_options(uint64_t maxthreads, bool sync_on_blocks, uint
m_max_prepare_blocks_threads = maxthreads;
}

void Blockchain::set_txpool_notify(TxpoolNotifyCallback&& notify)
{
std::lock_guard<decltype(m_txpool_notifier_mutex)> lg(m_txpool_notifier_mutex);
Contributor:

We might want to use boost::lock_guard instead, as we typically use boost for thread related things. I don't think it matters in this case; the suggestion is mostly for aesthetics/consistency.

Contributor Author:

I used std::lock_guard to not further cement Boost dependencies, and since std::mutex and std::lock_guard are already used within the codebase, I think it shouldn't affect binary size. However, I'm not incredibly opinionated either way.

Contributor:

We're already using it all over the place.

@@ -5367,6 +5457,22 @@ void Blockchain::add_miner_notify(MinerNotifyCallback&& notify)
}
}

void Blockchain::notify_txpool_event(std::vector<txpool_event>&& event)
{
std::lock_guard<decltype(m_txpool_notifier_mutex)> lg(m_txpool_notifier_mutex);
Contributor:

Same here.


if(txd.last_failed_id != null_hash && m_blockchain.get_current_blockchain_height() > txd.last_failed_height && txd.last_failed_id == m_blockchain.get_block_id_by_height(txd.last_failed_height))
return false;//we already sure that this tx is broken for this height
if (txd.last_failed_id == top_block_hash)
Contributor:

I think this change is incorrect. You need something like:

  if (txd.last_failed_height && txd.last_failed_id == m_blockchain.get_block_id_by_height(txd.last_failed_height))
    return false;

The first check is needed because null_hash is technically a valid value (but exceptionally rare). I think the original code should've included this.

The m_blockchain.get_current_blockchain_height() > txd.last_failed_height check can probably be removed.

However, the last check is the most important - this appears to be tracking/caching whether a tx's inputs are invalid after a certain height. Your change here will force a check_tx_inputs check every new block, instead of only after a reorg.

@jeffro256 (Contributor Author) commented Aug 13, 2024:

Your change here will force a check_tx_inputs check every new block, instead of only after a reorg.

Yes, this was the intended goal. The act of checking tx unlock times against a non-monotonic moving value of get_adjusted_time() makes it so that a transaction can pass check_tx_inputs at block X, but fail at block Y>X, and then re-pass at block Z>Y. This is discussed further at monero-project/meta#966 (comment).

Collaborator:

Nit: I think this section in this PR is good, and am for speeding it up in the general case in a future PR.

Behavior of the current code (excluding this PR):

For txs that are ready to go, it currently re-calls check_tx_inputs every time. There is no circumstance where it will short-circuit and return true for txs that should be ready to go.

For txs that are not ready to go, which should be an edge case minority of txs passed into this function, it makes an incorrect attempt at short-circuiting false. I say this is an edge case minority of txs because it would be a tx that was valid at one time that later became invalid, which should be rare (a reorg deep enough it would invalidate the ring signature, or unlock time reverts to locked).

Your change here will force a check_tx_inputs check every new block, instead of only after a reorg.

I agree that the check done in this PR could correctly short-circuit false in more circumstances, however, considering this should be a rare edge case, it's reasonable to argue this would be unnecessary error-prone complexity for this function. As such I'm good with this PR's approach as is.

I think it's also worth noting that we shouldn't have to run check_tx_inputs for txs that at one point were ready to go prior, so long as they were deemed ready to go on the same chain and don't reference any outputs with time-based unlock times. Aka there is a circumstance where we can short-circuit true that I think would significantly benefit this function in the general case. Considering this function impacts mining (see #8381), I think it's probably worth pursuing such a change in a future PR. It would be easiest to do with FCMP++ because there would be no need to deal with unlock time complexity.

@@ -1139,9 +1069,6 @@ namespace cryptonote

time_t start_time;

std::unordered_set<crypto::hash> bad_semantics_txes[2];
Contributor:

This has been around since 2017, and its removal isn't strictly needed for this PR. I would keep it, and analyze later in a separate PR.

The locations of the inserts/finds will have to move slightly, but it's still possible.

Contributor Author:

Would it be harder to review that it is removed or that the new updates to the code are correct?

Contributor:

My initial thought would be that it is harder to review with it removed. I'd have to dig into why it was added to make sure that its removal isn't going to affect anything.

Contributor Author:

bad_semantics_txes acts as an optimization for the common failure case where a bad transaction is being floated around but not modified. I think that bad_semantics_txes maybe makes sense for handling individual mempool transactions, but not for transactions passed as part of a block, for two reasons: 1) we now do PoW verification for blocks before transactions, which makes the cache largely worthless, and 2) calling ver_non_input_consensus() on a pool_supplement_t verifies that a group of transactions are all valid, so in order to restore the functionality of bad_semantics_txes for transactions passed in a block, we'd have to rewrite/review ver_non_input_consensus() to be able to return exactly which transactions failed (which isn't always possible with batch verification).

@jeffro256 force-pushed the bc_sync_skip_mempool branch from 2e88523 to eeb5a06 on August 14, 2024 20:01
@selsta selsta added the daemon label Jan 28, 2025
@jeffro256 force-pushed the bc_sync_skip_mempool branch from be29fed to c6f2ccd on January 29, 2025 06:09
@jeffro256 (Contributor Author):

Oops sorry for that latest push, I rolled back to c6f2ccd.

@jeffro256 (Contributor Author):

This PR is ready for re-review

j-berman added a commit to j-berman/monero that referenced this pull request Feb 6, 2025
CRITICAL FIXME's:
- sum of inputs == sum of outputs (requires using an updated
rerandomized output and blinded C Blind at tx construction time)
- serialize tree root for rct_ver_cache_t (planning to use the
compressed tree root and then de-compress at verification time)

Planning to rebase onto monero-project#9135
@Gingeropolous (Collaborator) commented Feb 8, 2025

Welp, the good news is that I got a node to sync with git pull origin pull/9135/head in it.

I've deleted my other notes, because as noted by others, I need to improve my testing setup to actually compare. So as of now, all I can confidently say is that I got the node to sync with 9135 pulled in.

@nahuhh (Contributor) commented Feb 8, 2025

If you add dynamic spans, make sure to also add dynamic block sync size (and manually increase speed limits)

@iamamyth commented Feb 8, 2025

There are tons of bottlenecks in the synchronization process. If you want to see what effect this PR has on real disk activity (which still matters for many reasons, the most obvious being that writes wear out the disk), there are many options:

  1. Dump the contents of /proc/<pid>/io for the daemon pre and post sync, and compare each version. You should hopefully see fewer bytes written in this branch.
  2. Use atop (must be installed first) with suitable flags to limit the output to your monerod process.
  3. Use iotop (must be installed first) in batch + accumulated mode.
  4. Use auditd to perform a detailed audit of IO activity, including operations such as fsync (this is a bit tricky because you need to configure auditd properly to limit the scope of audit activity, otherwise it might audit itself and flush each entry, effectively an infinite loop).

One caveat: For the test to be fair, you'd want the "end" for both to be a particular block/height. A simple option would be to feed the daemon under test from an exclusive node you control which has a snapshot of the blockchain up to a fixed point; it'll speed up the test and get at the relevant info.
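
As a concrete example of option 1, a small standalone helper can sample write_bytes from /proc/<pid>/io before and after the sync window and report the difference (a sketch; Linux only, and the pid must be supplied by the tester):

  // iodiff.cpp: compare /proc/<pid>/io write_bytes before and after a sync run.
  #include <fstream>
  #include <iostream>
  #include <string>

  static unsigned long long read_write_bytes(const std::string& pid)
  {
    std::ifstream io("/proc/" + pid + "/io");
    std::string key;
    unsigned long long value = 0;
    while (io >> key >> value)          // lines look like "write_bytes: 12345"
      if (key == "write_bytes:")
        return value;
    return 0;
  }

  int main(int argc, char** argv)
  {
    if (argc < 2) { std::cerr << "usage: iodiff <monerod pid>\n"; return 1; }
    const unsigned long long before = read_write_bytes(argv[1]);
    std::cout << "Press enter once the daemon reaches the target height... " << std::flush;
    std::cin.get();
    const unsigned long long after = read_write_bytes(argv[1]);
    std::cout << "bytes written during the window: " << (after - before) << '\n';
    return 0;
  }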

@iamamyth commented Feb 8, 2025

One other note: From the config you posted, I couldn't tell if your new daemon (labeled "syncing node") has both up and down equal, or just increases the default download rate; I would suggest making them symmetric.

@Gingeropolous (Collaborator) commented Feb 13, 2025

current master with 9135 pulled in is now running on xmrchain.net

(edited to add: I didn't perform an initial sync using 9135 on the explorer node, just recompiled and started wherever the node was. More of a stability test than actually testing the effects on IBD).

It had an uptime of 21 hours. Now running 9135 + 9765 on master on xmrchain.net.

@vtnerd (Contributor) left a comment

Went through yet again. Sorry, two more questions! This is about ready either way.

reg_arg.b = b;
relay_block(reg_arg, context);
// Relay an empty block
arg.b.txs.clear();
Contributor:

Why this change? It seems to differ from what the comments above say (that we only relay unknown txes, whereas this never relays txes).

@jeffro256 (Contributor Author) commented Feb 17, 2025:

Oh I see. I think I wrote that comment before changing this part. There's a few valid schemes here with different trade-offs:

  1. Be optimistic and relay an empty block to peers, expecting that they will have the necessary transactions in their pool or already confirmed the block
  2. Be slightly cautious but mostly optimistic and put the transaction that you didn't know about alongside the block. If all nodes adopt this behavior, then in the happy case, it automatically converges on the behavior of scheme 1
  3. Be pessimistic and put all transactions in the block

Option 1 is fastest in the best-case scenario. Option 3 is fastest in the worst-case scenario. Option 2 is somewhere in the middle, and might be the best in the average case. I think I intended for scheme 2 to be the implemented one at first, but I opted out because it leaks information about the structure of the network, and I haven't done the research to know whether or not that's a reasonable privacy risk. At the end of the day, it's an opinionated decision, but they are all "correct"; all will eventually relay the block. Option 3 is decidedly the worst from an average-case performance view, and is basically what we had before fluffy blocks. The fundamental assumption behind fluffy blocks in general is that your peer probably knows about the transactions in the block you're about to relay by the time you relay the block. All in all, I think defaulting to relaying empty blocks (Option 1) will be fine in most cases. It does give a disadvantage to miners who mine blocks with transactions that break majority-held relay rules, which can be a good thing or a bad thing depending on your view.

But yeah, if we stick with Option 1, I will need to amend that comment.

Contributor:

If you didn't know about transaction(s), there's a decent chance your peer didn't either. And I'm not sure the privacy leak argument is very strong - you still have to ask a peer for missing transactions anyway.

Contributor Author:

If you didn't know about transaction(s), there's a decent chance your peer didn't either.

I agree

And I'm not sure the privacy leak argument is very strong - you still have to ask a peer for missing transactions anyway.

But that would be a privacy leak for the node that you don't control, whereas choosing what you relay leaks something about your node to someone else's. As for whether or not this is okay, what about the scenario where a node in the stem phase of transaction propagation (or stem adjacent) mines that transaction and propagates the block before the blackhole period is over? I think the reference code doesn't include such transactions in the block template, but I could be wrong. And if there is an alternative buggy implementation that does include such transactions in blocks, couldn't Scheme 2 propagate the result of this bug farther to spy nodes than if reference nodes went with Scheme 1?

At the end of the day, I'm not certain that this isn't a privacy leak, so personally I'd rather err on the side of caution, but I'm definitely open to changing my mind.

Contributor Author:

This issue is probably tied to #9334

Contributor:

I think this is fine for now. We can always tweak shortly anyway.

Contributor:

nah has tx a,b,c,d,e
vtnerd has tx a,b,c,d
jeffro has tx a,b,c

nah mines the block (with tx a,b,c,d,e), and sends to vtnerd, who sends to jeffro

How would each scenario play out? Whats the worst case situation for each?

  1. do tx d+e get relayed from nah to vtnerd? what does vtnerd send to jeffro?
  2. vtnerd leaks to jeffro that he didnt have e? But what about jeffro who is missing d+e?
  3. abcde are all sent from nah to vtnerd to jeffro, even if jeffro already has them

@vtnerd (Contributor) commented Feb 19, 2025:

  1. vtnerd doesn't send the fluffy block to jeffro until e is received (probably from nah). In current code, vtnerd sends e to jeffro as an additional tx, but in this PR nothing gets sent. The current code leaks to jeffro that vtnerd didn't know about e.
  2. Correct (until this patch, which changes that behavior). jeffro has to ask vtnerd about d+e in either scenario.
  3. This never happens.

This PR arguably has fewer leaks. There's also the case where a node pretends not to know about a tx, based on settings and get_tx parameters.


I looked into how often NOTIFY_REQUEST_FLUFFY_MISSING_TX will appear in node logs when the log level is net.p2p.msg:INFO. I re-analyzed some node log data that was collected last year and described on page 20 of version 0.3 of "March 2024 Suspected Black Marble Flooding Against Monero: Privacy, User Experience, and Countermeasures".

Using logs from nodes that accepted incoming connections, I found that for a given p2p connection, the NOTIFY_REQUEST_FLUFFY_MISSING_TX message occurs in 0.7 percent (i.e. less than one percent) of blocks. The message is correlated with time: if one connection emits the message for a particular block, then other connections are more likely to emit the message. This makes sense because the message would tend to be emitted when a transaction was confirmed in a mined block before it could be propagated throughout the network. In that event, multiple nodes would need to emit the message for the same block.

Probably, this event is rare because transactions propagate throughout the network sooner than they are added to mining pools' block template. When I last measured transaction propagation times two years ago, the median time to propagate throughout the network was two seconds during the Dandelion++ fluff phase. On the other hand, most mining pools add new transactions to their block templates once every 10-30 seconds.

Those were the statistics under current network conditions. If transaction volume is so high that the txpool becomes congested, we would probably expect that the message is emitted even less frequently. The default behavior of monerod's block template construction is to first order the transactions by fee (there are 4 standard tiers) and then order them by first-seen. Therefore, conditional on fee (which is usually set automatically to the same tier for everyone), transactions are first-in/first-out. The transactions that have been broadcasted first and have been waiting a while are the transactions that would be confirmed in the next block, so it is unlikely that a node would be missing them.

return 1;
}
}
MERROR("sent bad block entry: there are duplicate tx hashes in parsed block: "
Contributor:

I think you still need:

if (context.m_requested_objects.size() != arg.b.txs.size())
{ 
  // error
}

along with a check to ensure that every requested txid was provided by the peer?

Contributor Author:

I'm like 80% sure that m_requested_objects is only relevant for the NOTIFY_REQUEST_GET_OBJECTS/NOTIFY_RESPONSE_GET_OBJECTS commands. When a node is missing txs in a fluffy block, it sends a NOTIFY_REQUEST_FLUFFY_MISSING_TX command, which is responded to immediately with a new fluffy block containing the missing transactions. But these requests/responses don't touch m_requested_objects, which is for persistent, cross-command state of block hashes.

@vtnerd (Contributor) commented Feb 18, 2025:

I looked at our existing code + the history of FLUFFY_BLOCKS. This code you removed appears to be dead code. However, we may want to consider adding the constraints intended ... another patch? This one has enough already.

Contributor Author:

Yeah, this comment has the right idea:

// hijacking m_requested objects in connection context to patch up

But I don't think it actually did anything, since we didn't set the size of m_requested_objects anywhere in this flow. If anything, it would just break honest syncing if that connection also sent us a fluffy block. Well, maybe not, since we return early from handling fluffy blocks if we're in the synchronizing state, so yeah, it's probably just dead code that doesn't do anything.

@jeffro256 (Contributor Author):

Currently on c6f2ccd. I will amend the erroneous comment, then squash.

@jeffro256 force-pushed the bc_sync_skip_mempool branch from c6f2ccd to bbe0dd6 on February 18, 2025 21:04
@Gingeropolous (Collaborator):

So I was running 9135 (and the http max thing) on master and it "aborted", and I got this: "corrupted size vs. prev_size". This is on the same box where I had the issue above; however, it's on a different HDD on that box, so the database is now being stored on that secondary hard drive.

@iamamyth commented Feb 20, 2025

The failing functional test is a problem with the commit (it replicates old, broken patterns in new places), rather than an old merge base, see comments here: #9740.

@jeffro256 (Contributor Author):

So I was running 9135 (and the http max thing) on master and it "aborted" and i got this: "corrupted size vs. prev_size". This is on the same box that I had the issue above, however its on a different HDD on that box. So the database is now being stored on that secondary hard drive.

Would it be possible to run a memtest on that machine? That error can be caused by a programming bug in monerod or it could be caused by heap corruption due to bad physical memory. I know that you've had other corruption issues recently, and IIRC @Rucknium has said before that one of the MRL research machines has already replaced bad RAM sticks in the past. So if you would check to see if it's a hardware issue, that would be greatly appreciated.

* Speed up propagation polling
* Remove duplicated "reorg" testing which doesn't give enough time for txs to propagate
* Test two different types of block propagations: shared tx, and new tx
@Gingeropolous (Collaborator):

@jeffro256 , this is being run on my seed node, which is a remote box unrelated to the research cluster. To your point though, I don't have confidence in the seed node hardware. I want to setup a new box with a HDD in the lab to test this patch, because this current experience hasn't been clear. I just need to squeeze some more time from the ol' time fruit.

@j-berman (Collaborator) left a comment

Looking good. Some comments worth implementing imo, and some nits. Feel free to ignore the comments prefaced with "nit"

std::vector<cryptonote::blobdata> tx_blobs;
std::vector<crypto::hash> missed_txs;

bool need_tx = !m_core.pool_has_tx(tx_hash);
Collaborator:

Nit: we can theoretically not have a tx in the pool when handle_single_incoming_block executes, then once it returns, receive a tx from another connection, then get to this point and already have the tx. Thus need_tx_indices can end up empty.

It's not an issue because handle_request_fluffy_missing_tx will still return a fluffy block even for an empty request for txs.

Might be cleaner to return the missing txs in handle_single_incoming_block instead. Not a blocker for this PR.

Contributor Author:

It's not an issue because handle_request_fluffy_missing_tx will still return a fluffy block even for an empty request for txs.

This might be the "better" option in the sense that this automatically makes ourselves re-encounter the fluffy block, even if another connection doesn't pass it to us in the future.

A more "optimal" solution would be caching blocks which pass PoW verification but are missing txs, and then triggering a re-verify of these blocks when the mempool is updated.
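
A purely hypothetical sketch of that caching idea (not part of this PR; the names are invented for illustration):

  // Blocks that passed PoW but still wait on txs, keyed by block hash.
  struct pending_block
  {
    cryptonote::block blk;
    std::unordered_set<crypto::hash> missing_txids;
  };
  std::unordered_map<crypto::hash, pending_block> m_pow_ok_pending;

  // Assumed hook called whenever a tx lands in the mempool.
  void on_tx_added_to_pool(const crypto::hash& txid)
  {
    for (auto it = m_pow_ok_pending.begin(); it != m_pow_ok_pending.end(); )
    {
      it->second.missing_txids.erase(txid);
      if (it->second.missing_txids.empty())
      {
        retry_handle_block(it->second.blk); // assumed re-entry into block handling
        it = m_pow_ok_pending.erase(it);
      }
      else
        ++it;
    }
  }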

std::vector<crypto::hash> missed_txs;

bool need_tx = !m_core.pool_has_tx(tx_hash);
need_tx = need_tx && (!m_core.get_transactions({tx_hash}, tx_blobs, missed_txs, /*pruned=*/true)
Collaborator:

Nit: m_core.get_transactions could be replaced by m_core.get_blockchain_storage().have_tx(tx_hash) here.

Contributor Author:

Yes, it could, but I personally don't like that this would be the first time in the cryptonote protocol handler that the blockchain storage is exposed. We could add a core endpoint, though...

*
* @return false if any outputs do not conform, otherwise true
*/
bool check_tx_outputs(const transaction& tx, tx_verification_context &tvc) const;
static bool check_tx_outputs(const transaction& tx,
Collaborator:

Nit: for a future PR, I would move this function into tx_verification_utils. Makes sense not to do it here to keep the diff smaller.

Contributor Author:

Agreed, yeah, I was just trying to minimize the diff as it's already huge. This pure function is ripe for relocation afterwards, though.

const std::uint8_t hf_version)
{
// We already verified the pool supplement for this hard fork version! Yippee!
if (ps.nic_verified_hf_version == hf_version)
Collaborator:

Nit: I'm not seeing how to trigger this if statement on re-review. Looks like the pool supplement is only used once and then discarded in all cases. Am I missing something there?

Doesn't look like an issue to me being here, just a little confusing

Contributor Author:

Nope not missing anything AFAIK, could be worth removing


