Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

blockchain: Infrastructure to manage block index. #1044

Merged
merged 3 commits into from
Feb 23, 2018

Conversation

davecgh
Copy link
Member

@davecgh davecgh commented Feb 16, 2018

NOTE: It is not possible to downgrade after running this PR. If you plan to test it, please make a copy of your data directory first.

This PR requires #1067. (now merged)

This adds infrastructure necessary for blockchain to independently manage and persist a full block index in a bucket. The purpose of this infrastructure is to significantly speed up startup time which currently has to load a bunch of the most recent blocks in order to rebuild the stake state and also to pave the way towards being able to have the full block index in memory allows for a lot of optimizations and greatly simplified code.

As a part of the new block index infrastructure it adds a new field to the block node struct named status that consists of bit flags for keeping tracking of the validation state of each block. The status is only stored in memory as of this commit, but it will be stored as a part of the block index in the upcoming block index migration code.

Since the field will be updated after node creation, it also introduces some new functions for interacting with the status field in a concurrent-safe fashion.

The status code is largely based on upstream commits 2492af0 and fb0d13c; however, it does not include some of the logic in the reorganization paths those commits include as that approach doesn't match the intended direction this package is moving towards.

This also includes code to migrate the existing block index in ffldb to the new format managed by the blockchain package and updates the code to use the new infrastructure.

Finally, it includes a full suite of tests to ensure proper serialization and deserialization functionality.


Testing Notes

As of this PR, the expected behavior is that there is a single migration that takes a while to complete, but subsequent startups should be significantly faster than the current code on master. In particular, the point just after the chain state is reported where it is loading the initial portions of the block index.

As the warning above notes, if you try to run an older software version after this migration has completed, you will get an error message similar to Unable to start server on [:19108]: the current blockchain database is no longer compatible with this version of the software (3 > 2).


TODO:

  • Introduce a status to block nodes which tracks various state information such as whether or not the block has been stored to the database yet and whether or not it has already passed or failed validation
  • Persist the status information to the new block index

Closes #913.

@davecgh davecgh changed the title [WIP] blockchain: Infrastructure to mlanage block index. [WIP] blockchain: Infrastructure to manage block index. Feb 16, 2018
@davecgh davecgh force-pushed the blockchain_persist_nodes branch 3 times, most recently from ccb0d15 to cf97649 Compare February 18, 2018 18:58
@davecgh
Copy link
Member Author

davecgh commented Feb 18, 2018

I've updated this to include the database migration code and to make use of the infrastructure. It is split into more than one commit for ease of review.

@davecgh davecgh force-pushed the blockchain_persist_nodes branch 7 times, most recently from 9773430 to 355c1fe Compare February 20, 2018 03:25
Copy link
Member

@dajohi dajohi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@raedah
Copy link
Contributor

raedah commented Feb 20, 2018

Tested on testnet and mainnet. Rebuilt the dcrdata db with this from scratch, which pulls all data out of dcrd via rpc. No problems found. Dcrd startup time went from 20 second to 3 seconds.

Copy link
Member

@alexlyp alexlyp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tACK, tested for a few days on testnet with no issues

@davecgh davecgh force-pushed the blockchain_persist_nodes branch from 355c1fe to c19dacb Compare February 21, 2018 23:00
@davecgh
Copy link
Member Author

davecgh commented Feb 21, 2018

Rebased to latest master and resolved conflicts.

@davecgh davecgh force-pushed the blockchain_persist_nodes branch 2 times, most recently from 8920bcb to f11f502 Compare February 22, 2018 01:45
@davecgh
Copy link
Member Author

davecgh commented Feb 22, 2018

I've updated this to include another commit which includes the new block validation status tracking and updated the PR description accordingly.

@davecgh davecgh force-pushed the blockchain_persist_nodes branch from f11f502 to 8fa9894 Compare February 22, 2018 02:45
This adds infrastructure necessary for blockchain to independently
manage and persist a full block index in a bucket.  Note that the
new infrastructure is not used yet as it will be integrated in future
commits after migrating existing databases has been done.

The purpose of this infrastructure is to significantly speed up startup
time which currently has to load a bunch of the most recent blocks in
order to rebuild the stake state and also to pave the way towards being
able to have the full block index in memory allows for a lot of
optimizations and greatly simplified code.

This also includes a full suite of tests to ensure proper serialization
and deserialization functionality.
@davecgh davecgh force-pushed the blockchain_persist_nodes branch 3 times, most recently from 62be45a to a1ba8f7 Compare February 22, 2018 16:00
@davecgh
Copy link
Member Author

davecgh commented Feb 22, 2018

The migration commit has now been updated to include persisting the new status information as a part of the block index record. Note that if you previously tested this PR, the previously migrated database will have to be removed as it is not compatible with the latest changes.

@davecgh davecgh changed the title [WIP] blockchain: Infrastructure to manage block index. blockchain: Infrastructure to manage block index. Feb 22, 2018
@davecgh
Copy link
Member Author

davecgh commented Feb 22, 2018

This is now ready for final review and testing.

@@ -745,6 +745,12 @@ func (b *BlockChain) getReorganizeNodes(node *blockNode) (*list.List, *list.List
return detachNodes, attachNodes, nil
}

// Don't allow a reorganize to a descendant of an known invalid block.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correction: an -> a

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@davecgh davecgh force-pushed the blockchain_persist_nodes branch from a1ba8f7 to 0e3cb83 Compare February 22, 2018 16:12
@davecgh davecgh force-pushed the blockchain_persist_nodes branch 2 times, most recently from 6f0659a to 5bbe058 Compare February 22, 2018 17:29
@dajohi
Copy link
Member

dajohi commented Feb 22, 2018

tOK - mined on testnet2

@davecgh
Copy link
Member Author

davecgh commented Feb 22, 2018

Some additional timing information:

Time to migrate testnet2 on a 7200 RPM HDD:

CHAN: Done upgrading block index.  Total entries: 241152 in 221 seconds 

Startup time before migration: ~2m34s
Startup time after migration: ~9s

Time to migrate mainnet on an SSD:

CHAN: Done upgrading block index.  Total entries: 215262 in 59 seconds

Startup time before migration: ~21s
Startup time after migration: ~2.7s

Copy link
Member

@chappjc chappjc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comment corrections.

all else LGTM

@@ -126,13 +126,37 @@ func (b *BlockChain) maybeAcceptBlock(block *dcrutil.Block, flags BehaviorFlags)
return false, ruleError(ErrMissingParent, str)
}

// The is no need to validate the block if an ancestor is already known
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The --> There

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

// doBatch contains the primary logic for upgrading the block index from
// version 1 to 2 in batches. This is done because attempting to migrate in
// a single database transaction could result in massive memory usage and
// could potentially crash on many systems due to ulimits.
Copy link
Member

@chappjc chappjc Feb 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"on many" --> "on many"

(there is an extra space that github comments hide)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

return 0, fmt.Errorf("bucket %s does not exist", hashIdxBucketName)
}

// Migrate block index entries so long as the max number of entries for
Copy link
Member

@chappjc chappjc Feb 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"entries so" --> "entries so"

(there is an extra space that github comments hide)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@davecgh davecgh force-pushed the blockchain_persist_nodes branch from 5bbe058 to 6f7993d Compare February 22, 2018 19:47
This adds a new field to the block node struct named status that
consists of bit flags for keeping tracking of the validation state of
each block.  The status is only stored in memory as of this commit, but
it will be stored as a part of the block index in the upcoming block
index migration code.

Since the field will be updated after node creation, this also
introduces some new functions for interacting with the status field in a
concurrent-safe fashion.

This is largely based on upstream commits 2492af0 and fb0d13c; however,
it does not include some of the logic in the reorganization paths those
commits include as that approach doesn't match the intended direction
this package is moving towards.
This adds code to migrate the existing block index in ffldb to the new
format managed by the blockchain package and updates the code to use the
new infrastructure.
@davecgh davecgh force-pushed the blockchain_persist_nodes branch from 6f7993d to 7056c67 Compare February 22, 2018 19:57
@raedah
Copy link
Contributor

raedah commented Feb 22, 2018

Tested this again with the same procedure as before, and found no problems.

@davecgh davecgh merged commit 7056c67 into decred:master Feb 23, 2018
@davecgh davecgh deleted the blockchain_persist_nodes branch February 23, 2018 00:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants