
core/state: improve update times #27

Closed
wants to merge 33 commits

Conversation

@holiman commented Feb 5, 2020

Work in progress to improve the account_update / storage_update.

Background: the chart below shows switching from snapshot-5 to master, and how account_update and storage_update eat up the gains we get from the account/storage-read operations.

[Screenshot: Dual Geth - Grafana]

holiman requested a review from karalabe as a code owner on February 5, 2020.
@holiman (Author) commented Feb 5, 2020

Benchmarking: https://geth-bench.ethdevops.io/d/Jpk-Be5Wk/dual-geth?orgId=1&var-exp=mon08&var-master=mon09&var-percentile=50&from=1580917100649&to=now

"master" = snapshot-5
"exp" = this PR

Feb 05 16:38:01 mon08.ethdevops.io geth INFO [02-05|15:38:01.543] Starting peer-to-peer node instance=Geth/v1.9.11-unstable-075e621a-20200205/linux-amd64/go1.13.7
Feb 05 16:38:09 mon09.ethdevops.io geth INFO [02-05|15:38:09.261] Starting peer-to-peer node instance=Geth/v1.9.11-unstable-4eeafc17-20200205/linux-amd64/go1.13.7 

Currently doing a fast-sync, so we can later compare the times for account and storage updates.

@holiman (Author) commented Feb 6, 2020

restarted

Feb 06 10:54:33 mon08.ethdevops.io geth INFO [02-06|09:54:33.331] Starting peer-to-peer node instance=Geth/v1.9.11-unstable-4695d38b-20200206/linux-amd64/go1.13.7
Feb 06 10:55:06 mon09.ethdevops.io geth INFO [02-06|09:55:06.622] Starting peer-to-peer node instance=Geth/v1.9.11-unstable-9ad532f9-20200206/linux-amd64/go1.13.7 

holiman force-pushed the improve_updates branch 2 times, most recently from 636bf14 to 89416e1 on February 7, 2020.
@holiman (Author) commented Feb 7, 2020

Trying out a wild idea. Updated the machines:

Feb 07 17:34:38 mon08.ethdevops.io geth INFO [02-07|16:34:38.293] Starting peer-to-peer node instance=Geth/v1.9.11-unstable-89416e1a-20200207/linux-amd64/go1.13.7
Feb 07 17:35:31 mon09.ethdevops.io geth INFO [02-06|09:55:06.622] Starting peer-to-peer node instance=Geth/v1.9.11-unstable-9ad532f9-20200206/linux-amd64/go1.13.7 

@holiman (Author) commented Feb 7, 2020

The change in 89416e1 is a test which, if it works, should improve account_update (the red part of the charts).
Restarting the snapshot-5 (master) machine, this is the graph:
[Screenshot: Dual Geth - Grafana]

Before the interruption, account_update sits at 6.6ms; the restart bumps it up, and after a while it settles back at 6.2ms.

The PR machine is at 6.9ms before the restart; the restart "bumps" it only to 5.9ms, and later it lands at 4.5ms.

[Screenshot: Dual Geth - Grafana]

Next up: I'll try the same trick for storage.

@holiman (Author) commented Feb 7, 2020

Here's the gist of it. First of all, scrap the existing precacher.

Background:

When we execute block n, we execute transactions serially.
After each tx, we call Finalize, which puts the address of each pending object into pending.
Similarly, for each of those objects, we call obj.Finalize, which does the same thing for each modified slot.

Idea

So, during finalize, we also send each address off onto a channel (non-blocking). On the other end of that channel sits a little precacher thread which loads it from the trie and warms up the cache.

So the snapshotter allows us to parallelize the trie reads instead of doing them serially. At the end of the block, when we finally reach the update phase, the trie caches will already be prepopulated, and the update will be quick.
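
Here's a minimal sketch of that idea (illustrative only; names like triePrefetcher, trieLoader and prefetch are hypothetical, not the actual code in this PR):

```go
// Illustrative sketch only: triePrefetcher, trieLoader and prefetch
// are hypothetical names, not the actual code in this PR.
package state

import "github.com/ethereum/go-ethereum/common"

// trieLoader abstracts the trie read we want to warm up
// (think Trie.TryGet on the account trie).
type trieLoader interface {
	TryGet(key []byte) ([]byte, error)
}

type triePrefetcher struct {
	requests chan common.Address // addresses to warm up
	quit     chan struct{}
}

func newTriePrefetcher(tr trieLoader) *triePrefetcher {
	p := &triePrefetcher{
		requests: make(chan common.Address, 1024),
		quit:     make(chan struct{}),
	}
	go p.loop(tr)
	return p
}

// loop resolves each requested account in the trie. The result is
// discarded: the point is the side effect of pulling the trie nodes
// into the cache, so the update phase later finds them hot.
func (p *triePrefetcher) loop(tr trieLoader) {
	for {
		select {
		case addr := <-p.requests:
			tr.TryGet(addr.Bytes()) // warm the cache, ignore the value
		case <-p.quit:
			return
		}
	}
}

// prefetch is called from Finalize for every dirtied address. The send
// is non-blocking: if the channel is full, the address is dropped
// rather than stalling transaction execution.
func (p *triePrefetcher) prefetch(addr common.Address) {
	select {
	case p.requests <- addr:
	default: // prefetcher busy, skip this one
	}
}

func (p *triePrefetcher) close() { close(p.quit) }
```

The drop-on-full behavior keeps execution from ever blocking on the prefetcher, at the cost of missing some entries (which shows up in the prefetch rate later in this thread).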

@holiman (Author) commented Feb 9, 2020

The docker stop timeout of 60s might be insufficient; one of the nodes got corrupted and I had to resync. Now that both are in sync again and snapshot generation is complete, snapshot-5 on mon09 is slightly faster overall.

However, here's the account_update/storage_update chart for snapshot-5:
[Screenshot: Dual Geth - Grafana]

The corresponding chart for this PR is a couple of milliseconds better:
[Screenshot: Dual Geth - Grafana]

mon09 seems a bit slower than mon08 in general, so I think the actual performance of this PR is a bit better than the charts show.

The new prefetcher seems to be loading about 50 entries per second, which is ~700 items per block. That seems a bit on the low side -- we're probably missing a lot of entries because the non-blocking sends get dropped.

[Screenshot: Dual Geth - Grafana]

@holiman (Author) commented Feb 11, 2020

After sync-up, mon08 (this PR) is slightly faster than mon09.
[Screenshot: Dual Geth - Grafana]

The account_update and storage_update times:
[Screenshot: Dual Geth - Grafana]

Also, looking at the clean-cache hits, one can see that mon08 (in yellow) consistently has a higher hit rate:

[Screenshot: Home - Grafana]

@holiman (Author) commented Feb 11, 2020

Oh, this is a nice graph as well:
[Screenshot: Home - Grafana]

holiman changed the title from "core/state: remove snapdb storage lock + lazily init snapshot storage…" to "core/state: improve update times" on Feb 13, 2020.
@holiman (Author) commented Feb 13, 2020

I am now testing with the nodes swapped around, so this PR runs on mon09 and snapshot-5 on mon08.

@holiman (Author) commented Feb 13, 2020

Ok, here are some stats from switching them:
[Screenshot: Dual Geth - Grafana]

The upper one is first snapshot-5, then this PR swapped in. The storage/account update time went from 15.5ms down to 11.1ms.

The lower one was this PR, then snapshot-5 swapped in. The storage/account update time went up from 12.9ms to 16.4ms.

TL;DR: this PR makes account/storage updates 27-39% faster (16.4ms/12.9ms ≈ 1.27; 15.5ms/11.1ms ≈ 1.39).

@holiman (Author) commented Feb 13, 2020

Interestingly, the dirty cache hit rate is better with this PR. I have no idea why:
[Screenshot: Dual Geth - Grafana]

holiman and others added 5 commits February 25, 2020 17:57
This makes eth_call and eth_estimateGas use the zero address
as sender when the "from" parameter is not supplied.

Co-authored-by: Felix Lange <[email protected]>
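
A minimal sketch of the defaulting behavior that commit describes (hypothetical helper, not the actual patch):

```go
// Hypothetical illustration of the described defaulting, not the
// actual patch: a nil "from" falls back to the zero address.
package ethapi

import "github.com/ethereum/go-ethereum/common"

// callSender picks the sender for eth_call / eth_estimateGas.
func callSender(from *common.Address) common.Address {
	if from == nil {
		return common.Address{} // zero address, 0x00...00
	}
	return *from
}
```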
* les: separate peer into clientPeer and serverPeer

* les: address comments
This was missing because I forgot to wrap it when bind.CallOpts.From was added.
@karalabe (Owner) commented

The measurements of this PR vs. without it are not comparable, because you modified the metric measuring account/storage updates. That change was never upstreamed into the original PR.

meowsbits and others added 26 commits March 10, 2020 10:55
…m#20746)

Includes difficulty tests for EIP2384 aka MuirGlacier.
This is supposed to fix the occasional failures in 
TestCancel* on Travis CI.
eth: fix transaction announce/broadcast goroutine leak
go.mod: update golang.org/x/crypto to fix a Go 1.14 race rejection
This revision of go-duktype fixes the following warning:

```
duk_logging.c: In function ‘duk__logger_prototype_log_shared’:
duk_logging.c:184:64: warning: ‘Z’ directive writing 1 byte into a region of size between 0 and 9 [-Wformat-overflow=]
  184 |  sprintf((char *) date_buf, "%04d-%02d-%02dT%02d:%02d:%02d.%03dZ",
      |                                                                ^
In file included from /usr/include/stdio.h:867,
                 from duk_logging.c:5:
/usr/include/x86_64-linux-gnu/bits/stdio2.h:36:10: note: ‘__builtin___sprintf_chk’ output between 25 and 85 bytes into a destination of size 32
   36 |   return __builtin___sprintf_chk (__s, __USE_FORTIFY_LEVEL - 1,
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   37 |       __bos (__s), __fmt, __va_arg_pack ());
      |       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
Fixes: Condition is always 'false' because 'err' is always 'nil'
This PR fixes issues in TableDatabase.

TableDatabase is a wrapper around an underlying ethdb.Database with an additional prefix. The prefix is applied to all entries it maintains. However, when we retrieve entries from it, we don't always handle the key properly: in theory the prefix should be truncated and only the user key returned, but we don't do that in some cases, e.g. in the iterator and batch replayer created from it. This PR fixes those issues (a minimal sketch of the idea follows below).
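
A minimal sketch of the kind of fix described, assuming a wrapper around ethdb.Iterator (the type name is hypothetical, not the actual core/rawdb code):

```go
// Illustrative sketch of prefix truncation in a table iterator; the
// type name is hypothetical, not the actual core/rawdb code.
package rawdb

import "github.com/ethereum/go-ethereum/ethdb"

// tableIterator wraps an iterator over the prefixed keyspace and
// hands back user keys with the table prefix stripped.
type tableIterator struct {
	iter   ethdb.Iterator
	prefix string
}

func (it *tableIterator) Next() bool   { return it.iter.Next() }
func (it *tableIterator) Error() error { return it.iter.Error() }

// Key strips the table prefix, so callers see the same key they
// originally stored through the table wrapper.
func (it *tableIterator) Key() []byte {
	key := it.iter.Key()
	if key == nil {
		return nil
	}
	return key[len(it.prefix):]
}

func (it *tableIterator) Value() []byte { return it.iter.Value() }
func (it *tableIterator) Release()      { it.iter.Release() }
```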
eth: when triggering a sync, check the head header TD, not block
core/rawdb: fix freezer table test error check
internal/web3ext: fix clique console apis to work on missing arguments
These tests occasionally fail on Travis.
…0776)

This just prevents a false negative ERROR warning when, for some unknown
reason, a user attempts to turn on the module rpc even though it's already going
to be on.
holiman closed this on Mar 23, 2020.
karalabe pushed a commit that referenced this pull request Oct 8, 2021
eth/downloader: restart the downloader after completion on new head