core/txpool: remove "local" notion from the txpool price heap #21478

Merged (10 commits) on Dec 11, 2020

Conversation

rjl493456442 (Member) commented Aug 24, 2020:

Fixes #21174

Master

goos: darwin
goarch: amd64
pkg: github.com/ethereum/go-ethereum/core
BenchmarkInsertRemoteWithAllLocals
BenchmarkInsertRemoteWithAllLocals-4   	       1	5516987671 ns/op
PASS
PR

goos: darwin
goarch: amd64
pkg: github.com/ethereum/go-ethereum/core
BenchmarkInsertRemoteWithAllLocals
BenchmarkInsertRemoteWithAllLocals-4   	      25	  45455161 ns/op
PASS

core/tx_list.go Outdated
l.stales--
continue
}
// Stop the discards if we've reached the threshold
if tx.GasPriceIntCmp(threshold) >= 0 {
save = append(save, tx)
heap.Push(l.remotes, tx)
Contributor:

I'm not sure it makes any difference, but it might be worth benchmarking: doing a heap.Peek instead of Pop + Push?

rjl493456442 (Member, Author):

Yes, I will run a benchmark for it.

rjl493456442 (Member, Author):

Well, heap has no Peek function...

Contributor:

Right... I saw another place that this (guerilla peek) trick is used:

		tx := []*types.Transaction(*l.remotes)[0]

I tested it; it seems to be maybe a little faster, but I'm not sure it's worth the extra complexity...

karalabe (Member) commented Nov 19, 2020:

guerilla peek

That's not a trick actually, it's a guaranteed API (edit: actually, it's a guaranteed property of the heap data structure, not a Go-specific thing), though it looks a bit odd:

The minimum element in the tree is the root, at index 0.
https://golang.org/pkg/container/heap/

Re Pop/Push vs Peek, each Pop and Push on a heap is log(n). So if we can peek, it's always much better.
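
To illustrate karalabe's point with a standalone toy (intHeap is an assumed example type, not txpool code): container/heap always keeps the minimum at index 0, so a peek is a plain slice read, while a Pop followed by a Push re-heapifies twice.

package main

import (
	"container/heap"
	"fmt"
)

// intHeap is a toy min-heap of ints implementing heap.Interface.
type intHeap []int

func (h intHeap) Len() int            { return len(h) }
func (h intHeap) Less(i, j int) bool  { return h[i] < h[j] }
func (h intHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *intHeap) Push(x interface{}) { *h = append(*h, x.(int)) }
func (h *intHeap) Pop() interface{} {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[:n-1]
	return x
}

func main() {
	h := &intHeap{5, 2, 9}
	heap.Init(h)

	// Peek: the minimum is always at the root (index 0); O(1), no re-heapify.
	fmt.Println("peek:", (*h)[0]) // 2

	// Pop + Push round trip: each operation is O(log n) and shuffles the slice.
	min := heap.Pop(h).(int)
	heap.Push(h, min)
	fmt.Println("pop/push:", min) // 2
}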

rjl493456442 (Member, Author):

done

core/tx_list.go (outdated, resolved)
core/tx_pool.go (outdated, resolved)
holiman (Contributor) left a review:

This LGTM

core/tx_pool.go (outdated, resolved)

holiman (Contributor) commented Sep 28, 2020:

Some more results

name                         old time/op  new time/op  delta
PoolBatchInsert100-6         8.42ms ± 3%  8.53ms ± 6%     ~     (p=0.841 n=5+5)
PoolBatchInsert1000-6        98.0ms ± 1%  97.8ms ± 6%     ~     (p=0.690 n=5+5)
PoolBatchInsert10000-6        1.13s ± 2%   1.12s ± 0%     ~     (p=0.190 n=5+4)
PoolBatchLocalInsert100-6    31.1ms ± 7%  10.5ms ± 6%  -66.09%  (p=0.008 n=5+5)
PoolBatchLocalInsert1000-6    399ms ±35%   104ms ± 5%  -73.94%  (p=0.008 n=5+5)
PoolBatchLocalInsert10000-6   3.09s ± 4%   1.23s ± 3%  -60.30%  (p=0.008 n=5+5)

holiman (Contributor) commented Sep 28, 2020:

The title says WIP -- is this still WIP or do you consider it fit for review/merge?

rjl493456442 (Member, Author):

@holiman It's basically done and waiting for review. I can address your comments and remove the WIP.

rjl493456442 changed the title from "[WIP]: remove "local" notion from the txpool price heap" to "core/txpool: remove "local" notion from the txpool price heap" Sep 29, 2020
core/tx_list.go (outdated, resolved)
log.Error("Pricing query for empty pool") // This cannot happen, print to catch programming errors
return false
if len(*l.remotes) == 0 {
return false // There is no remote transaction at all.
Contributor:

Previously, there was a comment saying that this couldn't happen. Is that still true?

  • If not (and this can now legitimately happen), is it correct to return 'false'?
  • If yes, then please add back the log.Error

Member:

Returning false is correct in case of an empty pool because it just states that "yes, this transaction is priced correctly compared to everything else (none)". That said, I'm unsure why we'd start calling it on an empty pool.

rjl493456442 (Member, Author):

I think it's possible for the remote set to be empty: all the transactions in the pool are locals.
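
A standalone sketch of that situation (pricedList, underpriced, and the field names here are assumptions for illustration, not the merged implementation): once locals are tracked outside the price heap, an empty remotes heap is legitimate, and answering "not underpriced" is the natural default.

package main

import "fmt"

// tx is a stand-in for a pooled transaction.
type tx struct {
	gasPrice int64
}

// pricedList only tracks remote transactions; locals live elsewhere.
type pricedList struct {
	remotes []*tx // imagine a min-heap ordered by gas price
}

// underpriced reports whether candidate is priced below the cheapest remote.
func (l *pricedList) underpriced(candidate *tx) bool {
	if len(l.remotes) == 0 {
		return false // every pooled tx is local: nothing to undercut
	}
	return candidate.gasPrice < l.remotes[0].gasPrice // heap root = cheapest
}

func main() {
	l := &pricedList{} // the pool currently holds only local transactions
	fmt.Println(l.underpriced(&tx{gasPrice: 1})) // false
}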

slots -= numSlots(tx)
}
// If we still can't make enough room for the new transaction
if slots > 0 && !force {
Contributor:

I'm a bit unsure of what force means, thus I'm not sure if !force is correct, or if it should be force. Could you please document what force means here?

This looks a bit wrong to me. Essentially, if we don't manage to free up the requested number of slots, then you add back all the ones we found that could be dropped?

rjl493456442 (Member, Author):

It's a special case: the txpool is full of local transactions, but we still want to make room for the newly arrived local transaction.

In this case, even if only part of the requested slots can be released, those slots are released forcibly (even if the number of freed slots is 0).

rjl493456442 (Member, Author):

This looks a bit wrong to me. Essentially, if we don't manage to free up the requested number of slots, then you add back all the ones we found that could be dropped?

Yes, that is the logic for the remote transactions, but I give some special care to the local transactions.

Member:

Hmmmmm, re force + readds, this is interesting.

Previously we've assumed that discard can always drop the needed number of transactions. IMHO this is a bug, made a bit worse by the slot rewrite:

  • If we have local and remote transactions, it can always happen that we cannot discard the requested number of slots simply because there aren't enough remote slots left. In that case the old code just discarded as many as it could (potentially 0) and called it a day, letting the adder insert the tx afterwards, overflowing the pool.
  • Without the slot rewrite, we could end up with 1 slot overflowing. With the rewrite, however, it's imaginable that we have a multi-slot transaction (e.g. 4) that evacuates multiple-but-not-enough slots (e.g. 3). In that case, we also introduce a weird churn on the pool.

I'm a bit unsure of what force means

Based on my understanding, you use force as a local flag. I.e. if the tx is local, then try to make as much space as possible; even if we only have 3 remote slots and need 4, drop the 3 rather than keeping them in. Local txs always fit, remote ones observe the pool limits.
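
A simplified, self-contained sketch of those force semantics (the names and signatures are assumptions, and a plain slice stands in for the real price heap): free as many remote slots as requested; if that fails and force is false (remote newcomer), put the candidates back and refuse, while with force set (local newcomer) drop whatever could be freed anyway.

package main

import "fmt"

// tx is a stand-in for a pooled transaction and the slots it occupies.
type tx struct {
	id    int
	slots int
}

// pricedList tracks remote transactions only, cheapest first.
type pricedList struct {
	remotes []*tx
}

// discard tries to free the requested number of slots from remote transactions.
// It returns the transactions to drop and whether the request was satisfied.
func (l *pricedList) discard(slots int, force bool) ([]*tx, bool) {
	var drop []*tx
	for slots > 0 && len(l.remotes) > 0 {
		victim := l.remotes[0] // cheapest remote transaction
		l.remotes = l.remotes[1:]
		drop = append(drop, victim)
		slots -= victim.slots
	}
	if slots > 0 && !force {
		// Not enough remote capacity and the newcomer is remote:
		// put the candidates back and refuse the insertion.
		l.remotes = append(drop, l.remotes...)
		return nil, false
	}
	// Enough room was made, or the newcomer is local and must fit anyway.
	return drop, true
}

func main() {
	l := &pricedList{remotes: []*tx{{id: 1, slots: 1}, {id: 2, slots: 1}}}

	// A remote tx needing 4 slots cannot be accommodated: nothing is dropped.
	if _, ok := l.discard(4, false); !ok {
		fmt.Println("remote rejected, remotes kept:", len(l.remotes)) // 2
	}
	// A local tx needing 4 slots evicts what it can and still gets in.
	dropped, ok := l.discard(4, true)
	fmt.Println("local forced in, dropped:", len(dropped), ok) // 2 true
}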

core/tx_pool.go (outdated, resolved)
core/tx_list.go (outdated, resolved)
core/tx_list.go (outdated, resolved)
core/tx_pool.go Outdated
// Otherwise if we can't make enough room for new one, abort the operation.
drop, success := pool.priced.Discard(pool.all.Slots()-int(pool.config.GlobalSlots+pool.config.GlobalQueue)+numSlots(tx), isLocal)

// Special case, We still can't make the room for the new remote one.
Member:

We -> we

rjl493456442 (Member, Author):

Yes, @holiman has thoughts about it. Should we make room for the local transactions even when the pool is "almost" or "already" full?

rjl493456442 (Member, Author):

I'm not super sure; it would be nice to get some opinions from you.

core/tx_pool.go Outdated
// If the transaction fails basic validation, discard it
- if err := pool.validateTx(tx, local); err != nil {
+ if err := pool.validateTx(tx, isLocal); err != nil {
log.Trace("Discarding invalid transaction", "hash", hash, "err", err)
invalidTxMeter.Mark(1)
return false, err
}
// If the transaction pool is full, discard underpriced transactions
if uint64(pool.all.Count()) >= pool.config.GlobalSlots+pool.config.GlobalQueue {
Member:

I think this check is wrong. The point of this was to check if the pool is full, and if so, drop a number of txs to make sure the new one fits. This worked well before the slot rewrite, because the only way for a new tx not to fit was if the pool was exactly at capacity (or, better put, pool.all.Count()+1 > pool.config.GlobalSlots+pool.config.GlobalQueue). With the slot rewrite, I think we should have updated this part too to check pool.all.Count()+NUMSLOTS > pool.config.GlobalSlots+pool.config.GlobalQueue, since if the pool has 1 slot open and we need 4, we still need to drop 3.

Member:

The situation is not overly bad, because once we've overflowed by 1 it self-heals, since below we had the forethought to discard based on current size, not on theoretical counts. Still, it would be nice to fix the inaccuracy.

rjl493456442 (Member, Author):

yup
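
To illustrate the slot-aware check suggested above, here is a tiny standalone example (the capacity numbers are made up for the illustration): counting only transactions misses the case where a large multi-slot tx arrives into an almost-full pool.

package main

import "fmt"

func main() {
	// Hypothetical capacity: executable slots plus queued slots.
	const globalSlots, globalQueue = 4096, 1024
	capacity := globalSlots + globalQueue

	poolSlots := capacity - 1 // the pool is one slot below capacity
	newTxSlots := 4           // a large transaction occupying four slots

	// Old-style check: only asks whether the pool is already at capacity.
	oldFull := poolSlots >= capacity
	// Slot-aware check: asks whether adding this tx would exceed capacity.
	newFull := poolSlots+newTxSlots > capacity

	fmt.Println("old check triggers a discard:", oldFull) // false: nothing evicted
	fmt.Println("new check triggers a discard:", newFull) // true: must free 3 slots
}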

core/tx_pool.go (outdated, resolved)
core/tx_pool.go (outdated, resolved)
core/tx_pool.go Outdated
tx, ok = t.remotes[hash]
}
if !ok {
return // transaction not found
Member:

Perhaps we should also add a log.Error here. If we call Remove, we really do expect the transaction to exist AFAIK, so it might be a good idea to print something otherwise.

Member:

Actually, previously we panicked in that case, so we should either panic, or definitely retain the error

rjl493456442 (Member, Author):

done
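
A minimal sketch of where that thread ended up (toy types, with the standard library log package standing in for go-ethereum's logger): look the hash up in the local map first, then the remote one, and report loudly, rather than panicking, if it is missing from both.

package main

import (
	"fmt"
	"log"
)

type transaction struct{ nonce uint64 }

// lookup is a toy stand-in for the pool's transaction index,
// split into local and remote sets.
type lookup struct {
	locals  map[string]*transaction
	remotes map[string]*transaction
}

// remove deletes a transaction by hash, logging (rather than panicking)
// if it is unexpectedly missing from both sets.
func (t *lookup) remove(hash string) {
	_, ok := t.locals[hash]
	if !ok {
		_, ok = t.remotes[hash]
	}
	if !ok {
		log.Printf("No transaction found to delete, hash=%s", hash)
		return
	}
	delete(t.locals, hash)
	delete(t.remotes, hash)
}

func main() {
	t := &lookup{
		locals:  map[string]*transaction{},
		remotes: map[string]*transaction{"0xabc": {nonce: 1}},
	}
	t.remove("0xabc")           // found in remotes, removed silently
	t.remove("0xmissing")       // logs a complaint and returns
	fmt.Println(len(t.remotes)) // 0
}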

fjl added this to the 1.10.0 milestone Dec 8, 2020
fjl removed the status:triage label Dec 8, 2020
holiman merged commit 88c6962 into ethereum:master Dec 11, 2020
gzliudan pushed a commit to gzliudan/XDPoSChain that referenced this pull request May 14, 2024
…um#21478)

* core: separate the local notion from the pricedHeap

* core: add benchmarks

* core: improve tests

* core: address comments

* core: degrade the panic to error message

* core: fix typo

* core: address comments

* core: address comment

* core: use PEAK instead of POP

* core: address comments
Linked issue that may be closed by this pull request:
Feature: Separate local tx tracking from the priced field in the pool
5 participants