forked from bitcoin/bitcoin
kernel: merge all options into ChainstateManagerOptions; disallow invalid wiping settings #24
Open
stickies-v wants to merge 25 commits into TheCharlatan:kernelApi from stickies-v:kernel/merge-chainman-options
As a first step, implement the equivalent of what was implemented in the now deprecated libbitcoinconsensus header. Also add a test binary to exercise the header and library. Unlike the deprecated libbitcoinconsensus, the kernel library can now use the hardware-accelerated sha256 implementations thanks to its statically-initialized context. The functions kept around for backwards-compatibility in the libbitcoinconsensus header are not ported over. As a new header, it should not be burdened by previous implementations. Also add a new error code for handling invalid flag combinations, which would otherwise cause a crash. The macros used in the new C header were adapted from the libsecp256k1 header. To make use of the C header from C++ code, a C++ header is also introduced for wrapping the C header. This makes it safer and easier to use from C++ code.
Exposing logging in the kernel library allows users to follow what is going on when using it. Users of the C header can use `kernel_logging_connection_create(...)` to pass a callback function to Bitcoin Core's internal logger. Additionally, the level and severity can be globally configured. By default, the logger buffers messages until `kernel_logging_connection_create(...)` is called. If the user does not want any logging messages, it is recommended to call `kernel_disable_logging()`, which permanently disables logging and any buffering of messages.
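The buffering behaviour described above can be illustrated with a small stand-alone sketch. It does not use the kernel header: `example_logger`, `example_log`, `example_logging_connect`, and `example_disable_logging` are hypothetical names, and the fixed-size buffer that stores message pointers without copying them is a simplification made for this example only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EXAMPLE_MAX_BUFFERED 8

/* Hypothetical callback type: receives user_data and one log message. */
typedef void (*example_log_callback)(void* user_data, const char* message);

/* Hypothetical logger state. Messages are stored by pointer (not copied),
 * so in this sketch they must outlive the logger. */
typedef struct {
    const char* buffered[EXAMPLE_MAX_BUFFERED];
    size_t n_buffered;
    example_log_callback callback;
    void* user_data;
    bool disabled;
} example_logger;

/* Deliver a message directly if a callback is connected, otherwise buffer it.
 * Messages beyond the buffer capacity are dropped. */
void example_log(example_logger* logger, const char* message)
{
    assert(logger != NULL);
    if (logger->disabled) return;
    if (logger->callback) {
        logger->callback(logger->user_data, message);
    } else if (logger->n_buffered < EXAMPLE_MAX_BUFFERED) {
        logger->buffered[logger->n_buffered++] = message;
    }
}

/* Connect a callback and flush everything buffered so far, in order. */
void example_logging_connect(example_logger* logger, example_log_callback cb, void* user_data)
{
    assert(logger != NULL && cb != NULL);
    if (logger->disabled) return;
    logger->callback = cb;
    logger->user_data = user_data;
    for (size_t i = 0; i < logger->n_buffered; i++) {
        cb(user_data, logger->buffered[i]);
    }
    logger->n_buffered = 0;
}

/* Permanently disable logging and discard any buffered messages. */
void example_disable_logging(example_logger* logger)
{
    assert(logger != NULL);
    logger->disabled = true;
    logger->n_buffered = 0;
    logger->callback = NULL;
}

/* Example callback: counts messages through an int passed as user_data. */
void example_count_messages(void* user_data, const char* message)
{
    (void)message;
    (*(int*)user_data)++;
}
```

In this sketch, two messages logged before `example_logging_connect` are delivered at connection time, and nothing is delivered once `example_disable_logging` has been called.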
The context introduced here holds the objects that will be required for running validation tasks, such as the chosen chain parameters, callbacks for validation events, and an interrupt utility. These will be used in a few commits, once the chainstate manager is introduced. This commit also introduces conventions for defining option objects. A common pattern throughout the C header will be:

```
options = object_option_create();
object = object_create(options);
```

This allows for more consistent usage of a "builder pattern" for objects where options can be configured independently from instantiation.
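As a rough illustration of this options-then-object convention, here is a self-contained sketch. The `example_*` types and functions are hypothetical stand-ins, not names from the kernel header, and the assumption that the object copies the option values (so the options can be destroyed right after creation) belongs to this sketch only.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical options type: configured independently of the object. */
typedef struct {
    int worker_threads;
} example_options;

/* Hypothetical object type built from the options. */
typedef struct {
    int worker_threads;
} example_object;

/* Create options with default values. Returns NULL on allocation failure. */
example_options* example_options_create(void)
{
    example_options* opts = malloc(sizeof(*opts));
    if (opts) opts->worker_threads = 1;
    return opts;
}

/* Setters only touch the options, never an already created object.
 * Non-positive values are ignored. */
void example_options_set_worker_threads(example_options* opts, int n)
{
    assert(opts != NULL);
    if (n > 0) opts->worker_threads = n;
}

/* The object copies the values it needs from the options. */
example_object* example_object_create(const example_options* opts)
{
    if (!opts) return NULL;
    example_object* obj = malloc(sizeof(*obj));
    if (obj) obj->worker_threads = opts->worker_threads;
    return obj;
}

void example_options_destroy(example_options* opts) { free(opts); }
void example_object_destroy(example_object* obj) { free(obj); }
```

A caller would create the options, apply setters, create the object, and then destroy the options; the object remains usable afterwards because it holds its own copy.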
As a first option, add the chainparams. For now these can only be instantiated with default values. In future they may be expanded to take their own options for regtest and signet configurations. This commit also introduces a unique pattern for setting the option values when calling the `*_set(...)` function.
The notifications are used for notifying on connected blocks and on warning and fatal error conditions. The user of the C header may define callbacks that get passed to the internal notification object in the `kernel_NotificationInterfaceCallbacks` struct. Each of the callbacks takes a `user_data` argument that gets populated from the `user_data` value in the struct. It can be used to recreate the structure containing the callbacks on the user's side, or to give the callbacks additional contextual information.
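The `user_data` mechanism can be sketched as follows. This is not the layout of `kernel_NotificationInterfaceCallbacks`: the struct, its two callback members, and the `example_notify_*` functions standing in for the library side are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback struct: user_data is handed back to every callback. */
typedef struct {
    void* user_data;
    void (*block_tip)(void* user_data, int height);
    void (*fatal_error)(void* user_data, const char* message);
} example_notification_callbacks;

/* User-side state that the callbacks recover through user_data. */
typedef struct {
    int last_height;
    int fatal_errors;
} example_user_state;

static void on_block_tip(void* user_data, int height)
{
    example_user_state* state = user_data;
    state->last_height = height;
}

static void on_fatal_error(void* user_data, const char* message)
{
    example_user_state* state = user_data;
    (void)message;
    state->fatal_errors++;
}

/* Bundle the callbacks with a pointer to the user's own state. */
example_notification_callbacks example_make_callbacks(example_user_state* state)
{
    example_notification_callbacks cbs = { state, on_block_tip, on_fatal_error };
    return cbs;
}

/* Stand-ins for the library side: call a callback if it is set. */
void example_notify_block_tip(const example_notification_callbacks* cbs, int height)
{
    assert(cbs != NULL);
    if (cbs->block_tip) cbs->block_tip(cbs->user_data, height);
}

void example_notify_fatal_error(const example_notification_callbacks* cbs, const char* message)
{
    assert(cbs != NULL);
    if (cbs->fatal_error) cbs->fatal_error(cbs->user_data, message);
}
```

Because the library only ever sees an opaque `void*`, the user is free to point `user_data` at any structure, as long as that structure outlives the callbacks.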
This is the main driver class for anything validation related, so expose it here. Creating the chainstate manager and block manager options will currently also trigger the creation of their respectively configured directories. The chainstate manager and block manager options were not consolidated into a single object, since the kernel might eventually introduce a block manager object for the purposes of being a light-weight block store reader. The chainstate manager will associate with the context with which it was created for the duration of its lifetime. It is only valid if that context remains in memory too. The tests now also create dedicated temporary directories. This is similar to the behaviour in the existing unit test framework.
Re-use the same pattern used for the context options. This allows users to set the number of threads used in the validation thread pool.
The `kernel_chainstate_manager_load_chainstate(...)` function is the final step required to prepare the chainstate manager for future tasks. Its main responsibility is loading the coins and block tree indexes. Though its `context` argument is not strictly required, it was added to ensure that the context remains in memory for this operation. This pattern of a "dummy" context will be re-used for functions introduced in later commits. The chainstate load options will be populated over the next few commits.
The added function allows the user to process and validate a given block with the chainstate manager. The `*_process_block(...)` function does some preliminary checks on the block before passing it to `ProcessNewBlock(...)`. These are similar to the checks in the `submitblock()` rpc. Richer processing of the block validation result will be made available in the following commits through the validation interface. The commit also adds a utility for serializing a `CBlock` (`kernel_block_create()`) that may then be passed to the library for processing. The tests exercise the function for both mainnet and regtest. The commit also adds the data of 206 regtest blocks (some blocks also contain transactions).
Adds options for wiping the chainstate and block tree indexes to the chainstate load options. In combination, and once the `*_import_blocks(...)` function is added in a later commit, these options trigger a reindex. For now, they just wipe the existing data.
This allows a user to run the kernel without creating on-disk files for the block tree and chainstate indexes. This is potentially useful in scenarios where the user needs to do some ephemeral validation operations. One specific use case is when linearizing the blocks on disk. The block files store blocks out of order, so a program may utilize the library and its header to read the blocks with one chainstate manager, and then write them back in order, and without orphans, with another chainstate manager. To save disk resources, and if the indexes are not required once done, it may be beneficial to keep the indexes in memory for the chainstate manager that writes the blocks back again.
The `kernel_import_blocks` function is used either to trigger a reindex, if the indexes were previously wiped through the chainstate load options, or to import the block data of a single block file. The behaviour of the import can be verified through the test logs.
Calling interrupt can halt long-running functions associated with objects that were created through the passed-in context.
This adds the infrastructure required to process validation events. For now the external validation interface only has support for the `BlockChecked` callback, but support for the other internal validation interface methods can be added in the future. The validation interface follows an architecture for defining its callbacks and ownership that is similar to the notifications. The task runner is created internally with a context, which itself internally creates a unique `ValidationSignals` object. When the user creates a new chainstate manager, the validation signals are internally passed to the chainstate manager through the context. The callbacks block any further validation execution when they are called. It is up to the user to either multiplex them, or use them otherwise in a multithreaded mechanism to make processing the validation events non-blocking. A validation interface can register for validation events with a context. Internally, the passed-in validation interface is registered with the validation signals of a context. The `BlockChecked` callback introduces a separate type for a non-owned block. Since a library-internal object owns this data, the user needs to be explicitly prevented from deleting it. In a later commit a utility will be added to copy its data.
These allow for the interpretation of the data in a `BlockChecked` validation interface callback. This is useful to get richer information in case a block failed to validate.
This adds functions for copying serialized block data into a user-owned variable-sized byte array. Use it in the tests for verifying the implementation of the validation interface's `BlockChecked` method.
This adds functions for reading a block from disk with a retrieved block index entry. External services that wish to build their own index, or analyze blocks can use this to retrieve block data. The block index can now be traversed from the tip backwards. This is guaranteed to work, since the chainstate maintains an internal block tree index in memory and every block (besides the genesis) has an ancestor. The user can use this function to iterate through all blocks in the chain (starting from the tip). Once the block index entry for the genesis block is reached a nullptr is returned if the user attempts to get the previous entry.
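The backwards traversal described here amounts to following "previous" pointers until a null pointer is returned. A minimal sketch, assuming a hypothetical `example_block_index` type and `example_get_previous` accessor rather than the real kernel functions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical block index entry: every entry except genesis has a predecessor. */
typedef struct example_block_index {
    int height;
    const struct example_block_index* prev; /* NULL for the genesis entry */
} example_block_index;

/* Stand-in for a "get previous block index" call: returns NULL at genesis. */
const example_block_index* example_get_previous(const example_block_index* entry)
{
    return entry ? entry->prev : NULL;
}

/* Walk from the tip back to genesis and return the number of entries visited. */
int example_count_chain(const example_block_index* tip)
{
    int count = 0;
    for (const example_block_index* it = tip; it != NULL; it = example_get_previous(it)) {
        /* Sanity check for this sketch: heights decrease by one per step. */
        assert(it->prev == NULL || it->prev->height == it->height - 1);
        count++;
    }
    return count;
}
```

The loop terminates because the genesis entry has no predecessor, which mirrors the nullptr returned when the user asks for the entry before genesis.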
This adds functions for reading the undo data from disk with a retrieved block index entry. The undo data of a block contains all the spent script pubkeys of all the transactions in a block. In normal operations undo data is used during re-orgs. This data might also be useful for building external indexes, or to scan for silent payment transactions. Internally the block undo data contains a vector of transaction undo data, each of which contains a vector of the spent outputs. For this reason, the `kernel_get_block_undo_size(...)` function is added to the header for retrieving the size of the transaction undo data vector, as well as the `kernel_get_transaction_undo_size(...)` function for retrieving the size of each spent outputs vector contained within each transaction undo data entry. With these two sizes the user can iterate through the undo data by accessing the transaction outputs by their indices with `kernel_get_undo_output_by_index`. If an invalid index is passed in, the `kernel_ERROR_OUT_OF_BOUNDS` error is returned again. The returned `kernel_TransactionOutput` is entirely owned by the user and may be destroyed with the `kernel_transaction_output_destroy(...)` convenience function.
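The two-level iteration over undo data can be sketched with plain arrays. All `example_*` names below are hypothetical, the output carries only a value instead of a full script pubkey, and the accessor copies into a caller-provided struct rather than returning a user-owned object; these are simplifications for illustration, not the kernel header's signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { EXAMPLE_OK = 0, EXAMPLE_ERROR_OUT_OF_BOUNDS } example_error;

/* Simplified spent output: only a value, no script pubkey. */
typedef struct { int64_t value; } example_output;
/* Undo data for one transaction: the outputs it spent. */
typedef struct { size_t n_outputs; const example_output* outputs; } example_tx_undo;
/* Undo data for one block: one entry per transaction. */
typedef struct { size_t n_txs; const example_tx_undo* txs; } example_block_undo;

/* Size of the outer vector (transaction undo entries). */
size_t example_block_undo_size(const example_block_undo* undo)
{
    assert(undo != NULL);
    return undo->n_txs;
}

/* Size of the inner vector; 0 for an out-of-range transaction index
 * (a choice made for this sketch). */
size_t example_tx_undo_size(const example_block_undo* undo, size_t tx_index)
{
    assert(undo != NULL);
    return tx_index < undo->n_txs ? undo->txs[tx_index].n_outputs : 0;
}

/* Copy one spent output into *out, or report an out-of-bounds index. */
example_error example_get_undo_output(const example_block_undo* undo, size_t tx_index,
                                      size_t out_index, example_output* out)
{
    assert(undo != NULL && out != NULL);
    if (tx_index >= undo->n_txs || out_index >= undo->txs[tx_index].n_outputs) {
        return EXAMPLE_ERROR_OUT_OF_BOUNDS;
    }
    *out = undo->txs[tx_index].outputs[out_index];
    return EXAMPLE_OK;
}

/* Iterate with the two sizes and sum the values of all spent outputs. */
int64_t example_sum_spent_values(const example_block_undo* undo)
{
    int64_t total = 0;
    for (size_t i = 0; i < example_block_undo_size(undo); i++) {
        for (size_t j = 0; j < example_tx_undo_size(undo, i); j++) {
            example_output out;
            if (example_get_undo_output(undo, i, j, &out) == EXAMPLE_OK) {
                total += out.value;
            }
        }
    }
    return total;
}
```

The outer loop is bounded by the block undo size and the inner loop by the per-transaction undo size, so the out-of-bounds error only occurs when a caller passes indices obtained some other way.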
Adds further functions useful for traversing the block index and retrieving block information. This includes getting the block height and hash.
This is useful for a host block processing feature where having an identifier for the block is needed. Without this, external users need to serialize the block and calculate the hash externally, which is less efficient.
This showcases a re-implementation of bitcoin-chainstate only using the kernel C++ API header.
Future commits will merge kernel_BlockManagerOptions and kernel_ChainstateLoadOptions into kernel_ChainstateManagerOptions.
Instead, expose it via kernel_ChainstateManagerOptions
Instead, expose it via kernel_ChainstateManagerOptions
Instead of failing the chainstate_manager_create call, disallow setting the set_wipe_block_tree_db and set_wipe_chainstate_db options to invalid combinations. This improves user error feedback, removing the need to inspect the logs when creating a ChainstateManager fails.
817865d to efbd187
a604321 to 2b743ad
c72b2c2 to 2951395
This PR addresses two concerns I've been having over the past weeks using the kernel API:

- `kernel_chainstate_manager_create` fails when `set_wipe_block_tree_db` was set to `True` and `set_wipe_chainstate_db` to `False`.
- […] `BlockManager` or `ChainstateLoad` functionality, yet the user must instantiate `Options` objects for it.

Fix both issues by merging all `kernel_BlockManagerOptions` and `kernel_ChainstateLoadOptions` functionality into `kernel_ChainstateManagerOptions` (introducing a new, internal `ChainstateManagerOptionsWrapper` struct), which allows both `wipe` setters to inspect the other, and return whether or not the operation was valid. This should be a lot more ergonomic than `kernel_chainstate_manager_create` just returning a `nullptr`, which can happen for many reasons.

A couple of thoughts:

- The `chainstate_manager_options_*_wipe_*` functions can be a little awkward to use. For example, when both `wipe` options need to be set to True, you can only do it by first setting `chainstate`. While not ideal, I think a clunky but correct-by-construction API is better than an easier-to-use, but also easier-to-abuse (and harder to understand) API?
- […] `wipe` options approach is suboptimal and may be removed in future force-pushes anyway, but I think merging the `Options` objects is an improvement regardless?
- […] `kernel_BlockManager` is on the roadmap, but I expect this won't be in the very short term. Imo, it makes more sense to introduce the `Options` object when we actually need it, with the layout that makes most sense at that time? And when that happens, I suspect we'll have to make some changes anyway (e.g., I think `BlockManagerOptions` should probably be a `ChainstateManagerOptions` member)
- […] `datadir` setting, so they can't be set to different values anymore.
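To make the "correct-by-construction" idea concrete, here is a minimal sketch of two setters that inspect each other and reject the invalid combination (wiping the block tree db without wiping the chainstate db). The struct and function names are hypothetical and the real `ChainstateManagerOptionsWrapper` is not reproduced here; only the validity rule is taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical merged options holding both wipe flags. */
typedef struct {
    bool wipe_block_tree_db;
    bool wipe_chainstate_db;
} example_chainman_options;

/* Wiping the block tree db is only accepted if the chainstate db is already
 * set to be wiped. Returns false and leaves the options unchanged otherwise. */
bool example_set_wipe_block_tree_db(example_chainman_options* opts, bool wipe)
{
    assert(opts != NULL);
    if (wipe && !opts->wipe_chainstate_db) return false;
    opts->wipe_block_tree_db = wipe;
    return true;
}

/* Un-setting the chainstate wipe is rejected while the block tree wipe is
 * still set, since that would produce the same invalid combination. */
bool example_set_wipe_chainstate_db(example_chainman_options* opts, bool wipe)
{
    assert(opts != NULL);
    if (!wipe && opts->wipe_block_tree_db) return false;
    opts->wipe_chainstate_db = wipe;
    return true;
}
```

With this shape, setting both flags requires calling the chainstate setter first, which is the ordering constraint mentioned in the first bullet; the benefit is that the caller learns about an invalid request at the setter call rather than from a null return at creation time.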