-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for responding to a LocalStateQuery demand for a snapshot of big ledger peers #1067
Support for responding to a LocalStateQuery demand for a snapshot of big ledger peers #1067
Conversation
3f419fa
to
5ef2b15
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, only minor comments.
FTR: This builds on top of IntersectMBO/ouroboros-network#4850
...s-consensus-cardano/src/shelley/Ouroboros/Consensus/Shelley/Ledger/NetworkProtocolVersion.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus-cardano/src/shelley/Ouroboros/Consensus/Shelley/Ledger/Query.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus-cardano/src/shelley/Ouroboros/Consensus/Shelley/Ledger/Query.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus-cardano/src/shelley/Ouroboros/Consensus/Shelley/Ledger/Query.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus-cardano/src/shelley/Ouroboros/Consensus/Shelley/Ledger/Query.hs
Outdated
Show resolved
Hide resolved
fc9cfa7
to
7704377
Compare
ouroboros-consensus-cardano/src/shelley/Ouroboros/Consensus/Shelley/Ledger/Query.hs
Outdated
Show resolved
Hide resolved
@@ -434,6 +447,11 @@ instance (ShelleyCompatible proto era, ProtoCrypto proto ~ crypto) | |||
SL.queryCommitteeMembersState coldCreds hotCreds statuses st | |||
GetFilteredVoteDelegatees stakeCreds -> | |||
getFilteredVoteDelegatees st stakeCreds | |||
GetPeerSnapshot -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems relatively inexpensive, but do you know roughly whether there's a risk of a CPU and/or allocation spike when handling this query, for example?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had not analyzed this - as far as calculating the big ledger peers from the whole, it is just some folds and sorting, with strictness applied where appropriate, so it should be well behaved for reasonable input. I'm not sure about the workload of fetching raw peers from the ledger. Should I take a closer overall look?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about the workload of fetching raw peers from the ledger. Should I take a closer overall look?
Perhaps other members of the Networking Team (eg Marcin? Armando?) have already worried about that being "too expensive", since "getting ledger peers" is pretty primary to the Diffusion Layer?
We don't need to pin it down exactly. But it'd be regrettable if some SPO's system was performing poorly because they didn't know that they shouldn't be sending this query too often. So we should be able to document the "rough cost" or some such if that sort of thing is a risk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The costs described in #1067 (comment) seem completely fine and in line (or below) what other queries require.
1a5a8bd
to
be3cf42
Compare
I've looked into the performance impact on cpu and memory use of this query on a fully synced node without any peers connected, and results are as follows. Measurements are taken every second for a duration of 5 minutes on an [email protected] GHz with 32 GB of RAM. 5 minute baseline plot when idling: 5 minutes of snapshot requests, where to node is queried in random intervals every 0 to 1 seconds: There isn't any discernable impact on memory use - no spikes or significant/unusual growth could be determined, and any memory use triggered by the query is neglibigle. Some minor CPU use in excess of the idling load is visible on the second plot, but nothing noteworthy can be concluded from it as it appears to be within reasonable limits. The query itself executes in around 20ms on the node: The resulting snapshot file size is ~50kb. |
db1ddcf
to
7de012a
Compare
ouroboros-consensus-cardano/src/unstable-shelley-testlib/Test/Consensus/Shelley/Examples.hs
Outdated
Show resolved
Hide resolved
7de012a
to
dad791e
Compare
For the |
Yes, that's what I figured |
1f04896
to
8a82356
Compare
f764134
to
6693ea2
Compare
c88f5cf
to
6b1f13e
Compare
9b4bac8
to
c130c9f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving the code that is stacked on top of #1223 👍 (last two commits)
Looks great, only two trivial comments
ouroboros-consensus-cardano/src/unstable-shelley-testlib/Test/Consensus/Shelley/Examples.hs
Outdated
Show resolved
Hide resolved
ouroboros-consensus-cardano/src/unstable-shelley-testlib/Test/Consensus/Shelley/Examples.hs
Outdated
Show resolved
Hide resolved
Ah, and the new golden files have to be checked in (that's why CI is red ATM). You can generate them locally via |
This change enables retrieval of big ledger peers. This data can be saved and loaded later, and used by the network layer to facilitate syncing up a node from a fresh or stale state.
945e64a
to
9dede29
Compare
9dede29
to
28fdf38
Compare
This change equips LocalStateQuery mini protocol app to respond to a request for a snapshot, taken from the current tip, of big ledger peers from a client, eg. cardano-cli. A new query tag
GetBigLedgerPeerSnapshot
is added toBlockQuery
. This data can be loaded later by a node, and it may be leveraged by the network layer to facilitate syncing up a node from a blank or stale state. Diffusion layer is expected to make heavy use of these relays when bootstrapping from scratch in Genesis consensus mode.