Pull in transaction receipts only when necessary #1308
Conversation
if block_ptr_from.number != block_ptr_to.number - 1 {
    panic!("transact_block_operations must transact a single block only");
}
let block_ptr_from = self.block_ptr(subgraph_id.clone())?;
I am all for getting rid of `set_block_ptr_with_no_changes`, but I don't like that this change causes a query for data the caller already has. We should continue to pass `block_ptr_from` into `transact_block_operations` but change the rules for it: if `block_ptr_from + 1 < block_ptr_to`, `mods` has to be empty, and we must always have `block_ptr_from < block_ptr_to`.
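The proposed rule could be sketched as a precondition check, using simplified stand-in types rather than the actual store signatures (a rough illustration only, not the real `transact_block_operations`):

```rust
// Hypothetical, simplified stand-in for the real block pointer type.
#[derive(Clone, Copy)]
struct BlockPtr {
    number: u64,
}

// Sketch of the proposed precondition: the pointer must move strictly
// forward, and a jump of more than one block is only allowed when there
// are no entity modifications.
fn check_transact_preconditions(
    from: BlockPtr,
    to: BlockPtr,
    mods_is_empty: bool,
) -> Result<(), String> {
    if from.number >= to.number {
        return Err("block pointer must move forward".to_string());
    }
    if from.number + 1 < to.number && !mods_is_empty {
        return Err("can only skip blocks with no modifications".to_string());
    }
    Ok(())
}

fn main() {
    // Single-block step with mods: allowed.
    assert!(check_transact_preconditions(BlockPtr { number: 5 }, BlockPtr { number: 6 }, false).is_ok());
    // Multi-block skip with mods: rejected under the proposed rule.
    assert!(check_transact_preconditions(BlockPtr { number: 5 }, BlockPtr { number: 9 }, false).is_err());
    // Multi-block skip without mods: allowed.
    assert!(check_transact_preconditions(BlockPtr { number: 5 }, BlockPtr { number: 9 }, true).is_ok());
    // Not moving forward: rejected.
    assert!(check_transact_preconditions(BlockPtr { number: 5 }, BlockPtr { number: 5 }, true).is_err());
}
```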
It's an interesting observation that at some earlier point the block stream already made this same DB lookup, and we could plumb that through in order to save this call. But if we wanted to keep `block_ptr` cached in memory for performance, I'd approach that differently by having the store keep that cache, without complicating other code. However, this query should be cheap enough to do in both places; if it's currently slow, that's because we're pulling the entire subgraph only to read the block pointer, which is an issue regardless of this change.
I can assert that `block_ptr_from < block_ptr_to`, though I don't follow why if `block_ptr_from + 1 < block_ptr_to` then `mods` must be empty. That's not the case in this PR; is that a problem?
I measured the speed of this query for the Moloch subgraph; it never took more than 1 ms, so we're good on that.
What I was getting at is that current callers have `block_ptr_from` already, so why not continue to pass that in rather than look it up? Yes, it's a very cheap lookup, at least on an empty, uncontested DB, but I'd rather not query at all when it's not necessary.

As for empty `mods`, I was going by what the current `set_block_ptr_with_no_changes` does; what is in `mods` when we are jumping more than one block? I'd like to assert/check something about them. Part of the reason for this is #1093, where we never got to the bottom of why we trip over a metadata update, even though that should be impossible, and I'd like to make sure we are not just silently overwriting stuff.
The check we had in `transact_block_operations` caught bugs in my early block explorer data work, so I second the request for keeping the from pointer and performing the sanity checks that @lutter is recommending.
The caller doesn't have it. With these simplifications to no longer enforce a step of 1, I don't even know how to keep track of the block pointer in memory. In our current abstraction, the instance manager can't tell whether it was the last one to change the pointer by moving it forward, or whether it was the block stream with a revert; only the DB knows.

This `block_ptr_from` value is only used for asserts and logs; in fact, it's the only reason we have the weird metadata guard thing. There's more to simplify here, but that can be done as a follow-up. If we agree this PR is mostly an improvement, I'd prefer doing further improvements after merging this.

No matter how many blocks we're skipping, `mods` contains the modifications for the block being applied. I've added the `block_ptr_from.number < block_ptr_to.number` assert; I'm happy to add any asserts you think are worthwhile.
I didn't realize that the caller doesn't have an easy way to get at `block_ptr_from`; in that case, I am fine with the change. I would just really like to figure out what caused #1093.
As am I (fine with the changes, that is).
"Found {} relevant block(s)",
descendant_ptrs.len()
logger,
"Found {} trigger(s)",
I think it's useful here to log the number of blocks; don't we log triggers farther down the pipeline?
You're right; it seems I got confused here. I've moved this log into `triggers_in_block`, and it now logs the number of blocks again.
parent_ptr,
Box::new(
    self.eth_adapter
        .load_block(&ctx.logger, subgraph_ptr.hash)
Curious why we expect this to succeed when we already checked that this block has been uncled; do Ethereum nodes reliably maintain uncles?
I didn't change this code; the diff is from formatting churn. This relies on the block ingestor always pulling in the blocks within the reorg threshold.
It does. I'm not sure Ethereum nodes keep everything about uncles available, like transaction receipts, but the block headers are stored on chain. This is necessary for e.g. applying and verifying uncle rewards.
// Yield one block
Ok(Async::Ready(Some(next_block))) => {
Some(next_block) => {
I think it's odd to push blocks without triggers to the subgraph instance; I think it's going to produce metrics observations of `0` for the triggers-in-block histogram, and logs which say something like "0 triggers in block".

At a higher level, I like that the block stream would advance the subgraph pointer in this case, even though at a lower level it adds more paths in the block stream code.
I see that a block with zero triggers is still a sort of special case, but it already happens in the current code for non-final blocks, and so far it's a special case that requires less complexity; the cases you mention with logs and metrics are easy to handle. I'm not fundamentally opposed to keeping `set_block_ptr_with_no_changes`, though I'm personally not a fan and @lutter also found it annoying; I'd be interested to know what @Jannis thinks.
I like the idea of skipping irrelevant blocks more efficiently, but I think `set_block_ptr_with_no_changes` unnecessarily triggered the logic that checks if subscriptions need to be updated? I'm all for dumping it if it doesn't have a detrimental impact on indexing performance.
Do we want to give the block stream the property of only emitting relevant blocks (those with triggers) to the subgraph instance? It seems cleaner that way to me. If we don't want that, or we don't want it in this PR, we should make sure to filter which blocks get logged and added to metrics on the subgraph instance side.
I just remembered another reason we had this: to get more frequent updates on indexing progress for any subgraph while scanning historic blocks. That is an important feature to preserve.
I don't mind the 0-triggers log; during syncing, most subgraphs will log that only once every 10k blocks. When synced, it seems relevant to log how many triggers are found on each new block, even if 0. If this is not helpful for the metrics, I can filter it out.

I spent some time measuring performance vs. master. I used Moloch because it has few events, so it seems like a good one for measuring overhead per processing cycle. Sometimes master was faster, sometimes the branch was faster, so any difference seems to be in the noise. Afaict the block stream is network-bound once the blocks are cached, and this PR didn't change the RPC calls done in that scenario.
Ahh, makes sense re: logs. Let's just filter out the zero-trigger blocks from metrics, then.
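The filtering being agreed on here could look roughly like this; the `Histogram` type below is a toy stand-in for the real Prometheus metrics, used only to illustrate the guard:

```rust
// Toy stand-in for a metrics histogram; the real code uses Prometheus
// metrics. This only illustrates the zero-trigger filtering.
struct Histogram {
    observations: Vec<f64>,
}

impl Histogram {
    fn new() -> Self {
        Histogram { observations: Vec::new() }
    }

    fn observe(&mut self, value: f64) {
        self.observations.push(value);
    }
}

// Record the triggers-per-block observation only for blocks that
// actually carry triggers, so empty blocks do not skew the metric.
fn record_block(histogram: &mut Histogram, trigger_count: usize) {
    if trigger_count > 0 {
        histogram.observe(trigger_count as f64);
    }
}

fn main() {
    let mut h = Histogram::new();
    record_block(&mut h, 0); // skipped
    record_block(&mut h, 3); // recorded
    record_block(&mut h, 0); // skipped
    assert_eq!(h.observations, vec![3.0]);
}
```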
@olibearo Done, let me know if any other metrics need to check for this.
I've reviewed a bit more than 2/3 of the PR. I like most changes a lot. The things I still need to think about a bit are `ThinEthereumBlock` and `BlockFinality` and how they are used. Will do that tomorrow.
@@ -162,6 +167,7 @@ impl SubgraphInstanceManager {
pub fn new<B, S, M>(
    logger_factory: &LoggerFactory,
    stores: HashMap<String, Arc<S>>,
    eth_adapters: HashMap<String, Arc<dyn EthereumAdapter>>,
One concern I have here is that we're integrating Ethereum more tightly into subgraph indexing. That's ok for now though.
self.block_hash_by_block_number(&logger, to)
    .map(move |block_hash_opt| EthereumBlockPointer {
        hash: block_hash_opt.unwrap(),
        number: to,
    })
    .and_then(move |block_pointer| {
        stream::unfold(block_pointer, move |descendant_block_pointer| {
            if descendant_block_pointer.number < from {
                return None;
            }
            // Populate the parent block pointer
            Some(
                eth.block_parent_hash(&logger, descendant_block_pointer.hash)
                    .map(move |block_hash_opt| {
                        let parent_block_pointer = EthereumBlockPointer {
                            hash: block_hash_opt.unwrap(),
                            number: descendant_block_pointer.number - 1,
                        };
                        (descendant_block_pointer, parent_block_pointer)
                    }),
            )
        })
        .collect()
How large are the block ranges we'd pass in here? While simple and easy to follow, populating the vector one item at a time could be slow. A more performant (but more complicated) approach would be to optimistically resolve all hashes for the `(to, from)` range in parallel (or in parallel batches) and check them for consistency afterwards.
Yes, this is slow and makes the performance of unconditional block triggers suck. I should improve this by implementing your suggestion.
I've improved this to fetch the block hashes in parallel, and refactored the code that does regular block fetching so that the two are similar. I also removed the special case for the scan range of block triggers; this helps a subgraph like Betoken be less slow, though the situation is still not ideal because we don't cache the number -> hash association in the database. Consistency checking is not necessary, since this range is for final blocks.
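The parallel-resolution idea can be sketched as follows; `block_hash_by_number` is a hypothetical stand-in for the per-block RPC call, and plain threads stand in for the futures the real code uses. Since the range consists of final blocks, no parent-hash consistency check is needed afterwards:

```rust
use std::thread;

// Hypothetical stand-in for the per-block-number RPC lookup; the real
// adapter issues an eth_getBlockByNumber call per block.
fn block_hash_by_number(number: u64) -> String {
    format!("hash-of-{}", number)
}

// Resolve all hashes for a final range [from, to] in parallel instead
// of walking parent hashes one block at a time.
fn block_range_to_ptrs(from: u64, to: u64) -> Vec<(u64, String)> {
    let handles: Vec<_> = (from..=to)
        .map(|n| thread::spawn(move || (n, block_hash_by_number(n))))
        .collect();
    // Collect results in block-number order.
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    let ptrs = block_range_to_ptrs(10, 13);
    assert_eq!(ptrs.len(), 4);
    assert_eq!(ptrs[0], (10, "hash-of-10".to_string()));
    assert_eq!(ptrs[3], (13, "hash-of-13".to_string()));
}
```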
));
}

match block_filter.contract_addresses.len() {
Should we put this one right next to the `block_filter.trigger_every_block` condition above, perhaps even in an `else` branch, to make it more obvious that the block filter is only applied once, depending on which type of block filter it is?
Good idea, I've refactored this a bit.
}),
)
as Box<dyn Future<Item = _, Error = _> + Send>,
BlockFinality::NonFinal(fat_block) => Box::new(future::ok({
Could we rename `fat_block` to `full_block`, since that's more in line with e.g. `load_full_block` etc.?
Done, thanks
Some comment threads are still ongoing, but I've responded to everything. I did a sample of the size of an
There's a typo in one of the commit messages:
ethereum: Rename methdos
Force-pushed from 642f16a to caf89a5. The pushed commits:
- work in progress.
- `blocks_with_triggers` returns `EthereumBlockWithTriggers`s. Remove the check for an empty scan and the `AdvanceToDescendantBlock` state. Still WIP.
- Skipping a block happens implicitly by not processing those blocks.
- And move the log to `triggers_in_block`.
- And refactor a bit to make loading by hash and by number use similar code.
- They are not that special, we can use a normal range.
@Jannis rebased and fixed the message.
As far as I am concerned, the changes and post-review updates look good to me. I'd be happy to merge this.
When processing blocks considered final, we fetch the triggers only to drop them and re-process them from the transaction receipts. This is wasteful, particularly of disk space, because we cache hundreds of GBs worth of transaction receipts we don't really need.
This PR makes it so that we only request and store transaction receipts when they are actually necessary, which is when processing blocks that may still be reorged.
`blocks_with_triggers` now returns richer information instead of dropping it; the return type was changed from `EthereumBlockPointer` to `EthereumBlockWithTriggers`.

One thing that makes this change tricky is that we may need to re-process a block for dynamic data sources, at which point we again need to know if the block is final or not, and need extra information for non-final blocks. This is encoded in the `BlockFinality` enum, and the trigger re-processing is encapsulated in `triggers_in_block`. Some code got moved from the block stream to the ethereum adapter so that it could be shared between the block stream and the instance manager.

Since this touched some of the block stream, I took the opportunity to refactor it a bit. We no longer have special logic for advancing through an empty range; instead we insert a dummy block for `to` with no triggers, and use the normal processing logic to advance. The special casing was in some sense an optimization, but it was optimizing the fastest possible case, which is an empty range, so it was considerable complexity for no real gain. In doing this, `transact_block_operations` got a bit simpler: it was only used to advance one block at a time when it's perfectly capable of skipping blocks. Also, it no longer takes a `block_ptr_from`; it doesn't make sense for that to be anything other than the current block pointer, so we just assume that. Also, some of the types in the block stream state machine got simplified a bit.

Sorry if this is a lot in one PR! Please ask if some change needs clarification or more comments. I tested this with Moloch and Betoken; both synced fine. I avoided any non-trivial change to reversion code; any changes there are just formatting and minor refactors.