This repository has been archived by the owner on Jan 22, 2025. It is now read-only.
Build InfluxDB query to help set transaction costs #19627
Comments
Creating this panel to visualize these stats; it should help with monitoring and troubleshooting.
My InfluxDB queries are lousy, but I hope they at least provide a consistent method that makes it easy to repeat the process of getting the costs.
While #19789 is being looked at, we can use the log to get
tao-stones
added a commit
to tao-stones/solana
that referenced
this issue
Sep 29, 2021
- update cost calculation to closely proposed fee schedule solana-labs#16984
tao-stones
added a commit
to tao-stones/solana
that referenced
this issue
Sep 30, 2021
- update cost calculation to closely proposed fee schedule solana-labs#16984
tao-stones
added a commit
to tao-stones/solana
that referenced
this issue
Oct 6, 2021
- update cost calculation to closely proposed fee schedule solana-labs#16984
tao-stones
added a commit
that referenced
this issue
Oct 6, 2021
* Cost Model to limit transactions which are not parallelizeable (#16694) * * Add following to banking_stage: 1. CostModel as immutable ref shared between threads, to provide estimated cost for transactions. 2. CostTracker which is shared between threads, tracks transaction costs for each block. * replace hard coded program ID with id() calls * Add Account Access Cost as part of TransactionCost. Account Access cost are weighted differently between read and write, signed and non-signed. * Establish instruction_execution_cost_table, add function to update or insert instruction cost, unit tested. It is read-only for now; it allows Replay to insert realtime instruction execution costs to the table. * add test for cost_tracker atomically try_add operation, serves as safety guard for future changes * check cost against local copy of cost_tracker, return transactions that would exceed limit as unprocessed transaction to be buffered; only apply bank processed transactions cost to tracker; * bencher to new banking_stage with max cost limit to allow cost model being hit consistently during bench iterations * replay stage feed back program cost (#17731) * replay stage feeds back realtime per-program execution cost to cost model; * program cost execution table is initialized into empty table, no longer populated with hardcoded numbers; * changed cost unit to microsecond, using value collected from mainnet; * add ExecuteCostTable with fixed capacity for security concern, when its limit is reached, programs with old age AND less occurrence will be pushed out to make room for new programs. 
* investigate system performance test degradation (#17919) * Add stats and counter around cost model ops, mainly: - calculate transaction cost - check transaction can fit in a block - update block cost tracker after transactions are added to block - replay_stage to update/insert execution cost to table * Change mutex on cost_tracker to RwLock * removed cloning cost_tracker for local use, as the metrics show clone is very expensive. * acquire and hold locks for block of TXs, instead of acquire and release per transaction; * remove redundant would_fit check from cost_tracker update execution path * refactor cost checking with less frequent lock acquiring * avoid many Transaction_cost heap allocation when calculate cost, which is in the hot path - executed per transaction. * create hashmap with new_capacity to reduce runtime heap realloc. * code review changes: categorize stats, replace explicit drop calls, concisely initiate to default * address potential deadlock by acquiring locks one at time * Persist cost table to blockstore (#18123) * Add `ProgramCosts` Column Family to blockstore, implement LedgerColumn; add `delete_cf` to Rocks * Add ProgramCosts to compaction excluding list alone side with TransactionStatusIndex in one place: `excludes_from_compaction()` * Write cost table to blockstore after `replay_stage` replayed active banks; add stats to measure persist time * Deletes program from `ProgramCosts` in blockstore when they are removed from cost_table in memory * Only try to persist to blockstore when cost_table is changed. * Restore cost table during validator startup * Offload `cost_model` related operations from replay main thread to dedicated service thread, add channel to send execute_timings between these threads; * Move `cost_update_service` to its own module; replay_stage is now decoupled from cost_model. 
* log warning when channel send fails (#18391) * Aggregate cost_model into cost_tracker (#18374) * * aggregate cost_model into cost_tracker, decouple it from banking_stage to prevent accidental deadlock. * Simplified code, removed unused functions * review fixes * update ledger tool to restore cost table from blockstore (#18489) * update ledger tool to restore cost model from blockstore when compute-slot-cost * Move initialize_cost_table into cost_model, so the function can be tested and shared between validator and ledger-tool * refactor and simplify a test * manually fix merge conflicts * Per-program id timings (#17554) * more manual fixing * solve a merge conflict * featurize cost model * more merge fix * cost model uses compute_unit to replace microsecond as cost unit (#18934) * Reject blocks for costs above the max block cost (#18994) * Update block max cost limit to fix performance regession (#19276) * replace function with const var for better readability (#19285) * Add few more metrics data points (#19624) * periodically report sigverify_stage stats (#19674) * manual merge * cost model nits (#18528) * Accumulate consumed units (#18714) * tx wide compute budget (#18631) * more manual merge * ignore zerorize drop security * - update const cost values with data collected by #19627 - update cost calculation to closely proposed fee schedule #16984 * add transaction cost histogram metrics (#20350) * rebase to 1.7.15 * add tx count and thread id to stats (#20451) each stat reports and resets when slot changes * remove cost_model feature_set * ignore vote transactions from cost model Co-authored-by: sakridge <[email protected]> Co-authored-by: Jeff Biseda <[email protected]> Co-authored-by: Jack May <[email protected]>
t-nelson
pushed a commit
that referenced
this issue
Oct 6, 2021
* Cost Model to limit transactions which are not parallelizeable (#16694) * * Add following to banking_stage: 1. CostModel as immutable ref shared between threads, to provide estimated cost for transactions. 2. CostTracker which is shared between threads, tracks transaction costs for each block. * replace hard coded program ID with id() calls * Add Account Access Cost as part of TransactionCost. Account Access cost are weighted differently between read and write, signed and non-signed. * Establish instruction_execution_cost_table, add function to update or insert instruction cost, unit tested. It is read-only for now; it allows Replay to insert realtime instruction execution costs to the table. * add test for cost_tracker atomically try_add operation, serves as safety guard for future changes * check cost against local copy of cost_tracker, return transactions that would exceed limit as unprocessed transaction to be buffered; only apply bank processed transactions cost to tracker; * bencher to new banking_stage with max cost limit to allow cost model being hit consistently during bench iterations * replay stage feed back program cost (#17731) * replay stage feeds back realtime per-program execution cost to cost model; * program cost execution table is initialized into empty table, no longer populated with hardcoded numbers; * changed cost unit to microsecond, using value collected from mainnet; * add ExecuteCostTable with fixed capacity for security concern, when its limit is reached, programs with old age AND less occurrence will be pushed out to make room for new programs. 
* investigate system performance test degradation (#17919) * Add stats and counter around cost model ops, mainly: - calculate transaction cost - check transaction can fit in a block - update block cost tracker after transactions are added to block - replay_stage to update/insert execution cost to table * Change mutex on cost_tracker to RwLock * removed cloning cost_tracker for local use, as the metrics show clone is very expensive. * acquire and hold locks for block of TXs, instead of acquire and release per transaction; * remove redundant would_fit check from cost_tracker update execution path * refactor cost checking with less frequent lock acquiring * avoid many Transaction_cost heap allocation when calculate cost, which is in the hot path - executed per transaction. * create hashmap with new_capacity to reduce runtime heap realloc. * code review changes: categorize stats, replace explicit drop calls, concisely initiate to default * address potential deadlock by acquiring locks one at time * Persist cost table to blockstore (#18123) * Add `ProgramCosts` Column Family to blockstore, implement LedgerColumn; add `delete_cf` to Rocks * Add ProgramCosts to compaction excluding list alone side with TransactionStatusIndex in one place: `excludes_from_compaction()` * Write cost table to blockstore after `replay_stage` replayed active banks; add stats to measure persist time * Deletes program from `ProgramCosts` in blockstore when they are removed from cost_table in memory * Only try to persist to blockstore when cost_table is changed. * Restore cost table during validator startup * Offload `cost_model` related operations from replay main thread to dedicated service thread, add channel to send execute_timings between these threads; * Move `cost_update_service` to its own module; replay_stage is now decoupled from cost_model. 
* log warning when channel send fails (#18391) * Aggregate cost_model into cost_tracker (#18374) * * aggregate cost_model into cost_tracker, decouple it from banking_stage to prevent accidental deadlock. * Simplified code, removed unused functions * review fixes * update ledger tool to restore cost table from blockstore (#18489) * update ledger tool to restore cost model from blockstore when compute-slot-cost * Move initialize_cost_table into cost_model, so the function can be tested and shared between validator and ledger-tool * refactor and simplify a test * manually fix merge conflicts * Per-program id timings (#17554) * more manual fixing * solve a merge conflict * featurize cost model * more merge fix * cost model uses compute_unit to replace microsecond as cost unit (#18934) * Reject blocks for costs above the max block cost (#18994) * Update block max cost limit to fix performance regession (#19276) * replace function with const var for better readability (#19285) * Add few more metrics data points (#19624) * periodically report sigverify_stage stats (#19674) * manual merge * cost model nits (#18528) * Accumulate consumed units (#18714) * tx wide compute budget (#18631) * more manual merge * ignore zerorize drop security * - update const cost values with data collected by #19627 - update cost calculation to closely proposed fee schedule #16984 * add transaction cost histogram metrics (#20350) * rebase to 1.7.15 * add tx count and thread id to stats (#20451) each stat reports and resets when slot changes * remove cost_model feature_set * ignore vote transactions from cost model Co-authored-by: sakridge <[email protected]> Co-authored-by: Jeff Biseda <[email protected]> Co-authored-by: Jack May <[email protected]>
tao-stones
added a commit
to tao-stones/solana
that referenced
this issue
Oct 7, 2021
- update cost calculation to closely proposed fee schedule solana-labs#16984
tao-stones
added a commit
that referenced
this issue
Oct 8, 2021
Here is the dashboard for testnet:
dankelleher
pushed a commit
to identity-com/solana
that referenced
this issue
Nov 24, 2021
…olana-labs#20314) - update cost calculation to closely proposed fee schedule solana-labs#16984
frits-metalogix
added a commit
to identity-com/solana
that referenced
this issue
Nov 24, 2021
…#19627 (solana-labs#20314)" This reverts commit 573e57d.
already done
Problem
`cost_model` must calculate the same `cost` for a given transaction on any validator machine, for the stability of the cluster. Therefore the cost of each component in the proposed fee structure must be assigned a static (e.g. hardcoded) number, similar to Compute Budget.

For consistency, transaction cost should be calculated closely following the proposed fee schedule, hence the need for a consistent method to assign/calibrate `compute units` to each cost element, such as `signature`, `write lock`, etc.

Method:
The raw data are metrics measuring each operation's CPU cost, mostly in `us`. When collecting metrics, we need cluster-aggregated data to get rid of individual validator performance differences. It takes two steps to define `compute units` for each transaction cost element:

1. Determine the `compute unit` to `us` conversion rate.
This rate essentially describes the relationship between the hardcoded BPF instruction operation budget (in `compute unit`) and `us`, which should be the same relationship for signature verification or built-in programs. In other words, it can convert other transaction cost elements from `us` to `compute unit`.
PR #19624 added a few more data points, reported when the `tx_wide_compute_cap` feature is turned on, that add BPF instructions' `compute unit` and `us` to InfluxDB.
Run query to get `conversion_rate`:
When I ran a validator (PR #19624 with `tx_wide_compute_cap` on) on a GCE VM for 15 mins, I got `conversion_rate = 15.03`.

2. Use `conversion_rate` to convert into `compute unit`.
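As a rough sketch of the two steps, the conversion rate can be computed by dividing aggregated BPF compute units by aggregated execution time, and then applied to any cost measured in `us`. The `BpfSample` type, the sample values, and the choice of a ratio of sums (rather than a mean of per-validator ratios) are assumptions for illustration only. The real inputs are the InfluxDB data points added in PR #19624.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class BpfSample:
    """One hypothetical metrics sample reported by a validator."""
    compute_units: int  # compute units consumed by BPF instructions
    execute_us: int     # time spent executing them, in microseconds


def conversion_rate(samples: Iterable[BpfSample]) -> float:
    """Return cluster-aggregated compute units per microsecond (cu/us)."""
    total_cu = 0
    total_us = 0
    for sample in samples:
        total_cu += sample.compute_units
        total_us += sample.execute_us
    if total_us == 0:
        raise ValueError("no execution time recorded")
    return total_cu / total_us


def us_to_cu(us: float, rate: float) -> int:
    """Convert a cost measured in microseconds to whole compute units."""
    return round(us * rate)


# Made-up samples from three validators, not real cluster data.
samples = [
    BpfSample(compute_units=300_000, execute_us=20_000),
    BpfSample(compute_units=450_000, execute_us=30_000),
    BpfSample(compute_units=150_000, execute_us=10_000),
]
rate = conversion_rate(samples)
print(f"conversion_rate = {rate:.2f} cu/us")
```

Summing across validators before dividing weights each validator by how much work it reported, which is one way to smooth out individual machine differences.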
In the same test, the query returns `933 us` for each signature verification, therefore its cost = `933 us * conversion_rate = 933 us * 15 cu/us = 13,995 cu`.

The cost of a write lock is rather hard to determine. A write lock potentially reduces block-producing parallelism, and therefore TPS.
In lieu of a better baseline, we could use the cost of loading and storing an account as the cost of a write lock. A transaction's cost of write locks is, therefore, `number of writable accounts (excluding program) * load_n_store_account_us * conversion_rate`.
Query to get `load_n_store_account_us`:

The reported execute timing stats include total_data_size in transactions and the overall execute_us; these can be used to roughly estimate the number of bytes that can be executed in a microsecond.
A transaction's cost of data is `(transaction.data_len / data_size_per_us) * conversion_rate`.
Query to get `data_size_per_us`:

Use the following queries to scan the cluster for the mean execute microseconds of each built-in program, then use the rate to convert to cu.
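Putting the formulas above together, a transaction's cost could be estimated along these lines. This is a minimal sketch, not the `cost_model` implementation: the `TxShape` type is hypothetical, the values for `LOAD_N_STORE_ACCOUNT_US` and `DATA_SIZE_PER_US` are placeholders (the issue gives only the signature timing and the conversion rate), and summing the elements into one total is an assumption.

```python
from dataclasses import dataclass

CONVERSION_RATE = 15          # cu/us, rounded from the measured 15.03
SIGNATURE_US = 933            # us per signature verification, from the test above
LOAD_N_STORE_ACCOUNT_US = 10  # placeholder, NOT a measured value
DATA_SIZE_PER_US = 100        # bytes per us; placeholder, NOT a measured value


@dataclass(frozen=True)
class TxShape:
    """Hypothetical summary of the cost-relevant parts of a transaction."""
    num_signatures: int
    num_writable_accounts: int  # excluding program accounts
    data_len: int               # instruction data size in bytes


def signature_cost(tx: TxShape) -> int:
    """signatures * us per verification * conversion rate."""
    return tx.num_signatures * SIGNATURE_US * CONVERSION_RATE


def write_lock_cost(tx: TxShape) -> int:
    """writable accounts * load_n_store_account_us * conversion rate."""
    return tx.num_writable_accounts * LOAD_N_STORE_ACCOUNT_US * CONVERSION_RATE


def data_cost(tx: TxShape) -> int:
    """(data_len / data_size_per_us) * conversion rate, rounded to whole cu."""
    return round(tx.data_len / DATA_SIZE_PER_US * CONVERSION_RATE)


def builtin_program_cost(mean_execute_us: float) -> int:
    """Mean execute time of a built-in program, converted to cu."""
    return round(mean_execute_us * CONVERSION_RATE)


def transaction_cost(tx: TxShape) -> int:
    """Sum of the per-element costs (the summation itself is an assumption)."""
    return signature_cost(tx) + write_lock_cost(tx) + data_cost(tx)


tx = TxShape(num_signatures=1, num_writable_accounts=2, data_len=200)
print(f"signature:  {signature_cost(tx)} cu")
print(f"write lock: {write_lock_cost(tx)} cu")
print(f"data:       {data_cost(tx)} cu")
print(f"total:      {transaction_cost(tx)} cu")
```

Because every constant is hardcoded, any validator running this calculation on the same transaction gets the same result, which is the property the issue asks for.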
Notes
`feature` to change the costs