Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Drop overflow cache #5891

Merged
merged 5 commits into from
Jul 11, 2024
Merged
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 0 additions & 8 deletions beacon_node/beacon_chain/src/beacon_chain.rs
Original file line number Diff line number Diff line change
Expand Up @@ -721,13 +721,6 @@ impl<T: BeaconChainTypes> BeaconChain<T> {
Ok(())
}

pub fn persist_data_availability_checker(&self) -> Result<(), Error> {
let _timer = metrics::start_timer(&metrics::PERSIST_DATA_AVAILABILITY_CHECKER);
self.data_availability_checker.persist_all()?;

Ok(())
}

/// Returns the slot _right now_ according to `self.slot_clock`. Returns `Err` if the slot is
/// unavailable.
///
Expand Down Expand Up @@ -6699,7 +6692,6 @@ impl<T: BeaconChainTypes> Drop for BeaconChain<T> {
let drop = || -> Result<(), Error> {
self.persist_head_and_fork_choice()?;
self.persist_op_pool()?;
self.persist_data_availability_checker()?;
self.persist_eth1_cache()
};

Expand Down
42 changes: 27 additions & 15 deletions beacon_node/beacon_chain/src/data_availability_checker.rs
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@ use crate::blob_verification::{verify_kzg_for_blob_list, GossipVerifiedBlob, Kzg
use crate::block_verification_types::{
AvailabilityPendingExecutedBlock, AvailableExecutedBlock, RpcBlock,
};
use crate::data_availability_checker::overflow_lru_cache::OverflowLRUCache;
use crate::data_availability_checker::overflow_lru_cache::DataAvailabilityCheckerInner;
use crate::{BeaconChain, BeaconChainTypes, BeaconStore};
use kzg::Kzg;
use slog::{debug, error, Logger};
Expand Down Expand Up @@ -33,12 +33,30 @@ pub const OVERFLOW_LRU_CAPACITY: NonZeroUsize = new_non_zero_usize(1024);
pub const STATE_LRU_CAPACITY_NON_ZERO: NonZeroUsize = new_non_zero_usize(2);
pub const STATE_LRU_CAPACITY: usize = STATE_LRU_CAPACITY_NON_ZERO.get();

/// This includes a cache for any blocks or blobs that have been received over gossip or RPC
/// and are awaiting more components before they can be imported. Additionally the
/// `DataAvailabilityChecker` is responsible for KZG verification of block components as well as
/// checking whether a "availability check" is required at all.
/// Cache to hold fully valid data that can't be imported to fork-choice yet. After dencun hard-fork
/// blocks have a sidecar of data that is received separately from the network. We call the concept
/// of a block "becoming available" when all of its import dependencies are inserted into this
/// cache.
///
/// Usually a block becomes available on its slot within a second of receiving its first component
/// over gossip. However, a block may never become available if a malicious proposer does not
/// publish its data, or there are network issues that prevent us from receiving it. If the block
/// does not become available after some time we can safely forget about it. Consider this two
/// cases:
///
/// - Global unavailability: If nobody has received the block components it's likely that the
/// proposer never made the block available. So we can safely forget about the block as it will
/// never become available.
/// - Local unavailability: Some of our peers will attest to a descendant of the dropped block and
/// lookup sync will eventually fetch its components
///
/// Even in periods of non-finality, the proposer is expected to publish the block's data
/// immediately. Because this cache only holds fully valid data, its capacity is bounded to 1 block
/// per slot and fork: before inserting into this cache we check proposer signature and correct
/// proposer. Having a capacity > 1 is an optimization to prevent sync lookup from having re-fetch
/// data during moments of unstable network conditions.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great docs!

pub struct DataAvailabilityChecker<T: BeaconChainTypes> {
availability_cache: Arc<OverflowLRUCache<T>>,
availability_cache: Arc<DataAvailabilityCheckerInner<T>>,
slot_clock: T::SlotClock,
kzg: Option<Arc<Kzg>>,
log: Logger,
Expand Down Expand Up @@ -74,7 +92,8 @@ impl<T: BeaconChainTypes> DataAvailabilityChecker<T> {
log: &Logger,
spec: ChainSpec,
) -> Result<Self, AvailabilityCheckError> {
let overflow_cache = OverflowLRUCache::new(OVERFLOW_LRU_CAPACITY, store, spec.clone())?;
let overflow_cache =
DataAvailabilityCheckerInner::new(OVERFLOW_LRU_CAPACITY, store, spec.clone())?;
Ok(Self {
availability_cache: Arc::new(overflow_cache),
slot_clock,
Expand Down Expand Up @@ -329,15 +348,9 @@ impl<T: BeaconChainTypes> DataAvailabilityChecker<T> {
})
}

/// Persist all in memory components to disk
pub fn persist_all(&self) -> Result<(), AvailabilityCheckError> {
self.availability_cache.write_all_to_disk()
}

/// Collects metrics from the data availability checker.
pub fn metrics(&self) -> DataAvailabilityCheckerMetrics {
DataAvailabilityCheckerMetrics {
num_store_entries: self.availability_cache.num_store_entries(),
state_cache_size: self.availability_cache.state_cache_size(),
block_cache_size: self.availability_cache.block_cache_size(),
}
Expand All @@ -346,7 +359,6 @@ impl<T: BeaconChainTypes> DataAvailabilityChecker<T> {

/// Helper struct to group data availability checker metrics.
pub struct DataAvailabilityCheckerMetrics {
pub num_store_entries: usize,
pub state_cache_size: usize,
pub block_cache_size: usize,
}
Expand All @@ -372,7 +384,7 @@ pub fn start_availability_cache_maintenance_service<T: BeaconChainTypes>(

async fn availability_cache_maintenance_service<T: BeaconChainTypes>(
chain: Arc<BeaconChain<T>>,
overflow_cache: Arc<OverflowLRUCache<T>>,
overflow_cache: Arc<DataAvailabilityCheckerInner<T>>,
) {
let epoch_duration = chain.slot_clock.slot_duration() * T::EthSpec::slots_per_epoch() as u32;
loop {
Expand Down
Loading
Loading