This repository has been archived by the owner on Jan 22, 2025. It is now read-only.

Add db recovery methods #10838

Merged: 1 commit, Jul 6, 2020


sakridge
Contributor

Problem

Blockstore encounters errors opening the db because of WAL/SST corruption.

Summary of Changes

Allow skipping corrupted WAL entries when opening the db.

Fixes #10015

@sakridge sakridge force-pushed the wal-recovery-flag branch 5 times, most recently from 440bf9e to 76e0668 Compare June 29, 2020 23:42
@codecov

codecov bot commented Jun 30, 2020

Codecov Report

Merging #10838 into master will decrease coverage by 0.0%.
The diff coverage is 43.2%.

@@            Coverage Diff            @@
##           master   #10838     +/-   ##
=========================================
- Coverage    82.0%    81.9%   -0.1%     
=========================================
  Files         310      310             
  Lines       71884    71906     +22     
=========================================
+ Hits        58949    58955      +6     
- Misses      12935    12951     +16     

@sakridge sakridge force-pushed the wal-recovery-flag branch from 76e0668 to 96028d1 Compare June 30, 2020 18:30
@sakridge sakridge marked this pull request as ready for review June 30, 2020 18:32
@sakridge sakridge force-pushed the wal-recovery-flag branch from 96028d1 to da1838f Compare July 2, 2020 03:03
@sakridge sakridge requested review from t-nelson and carllin July 2, 2020 03:03
@t-nelson t-nelson self-requested a review July 2, 2020 15:03
t-nelson
t-nelson previously approved these changes Jul 2, 2020
Contributor

@t-nelson t-nelson left a comment

LGTM! Just one nit

FYI, there's a DB to test this against on ba3, #10015 (comment)

@sakridge sakridge force-pushed the wal-recovery-flag branch from da1838f to f87189e Compare July 6, 2020 18:17
@mergify mergify bot dismissed t-nelson’s stale review July 6, 2020 18:18

Pull request has been modified.

@sakridge
Contributor Author

sakridge commented Jul 6, 2020

> LGTM! Just one nit
>
> FYI, there's a DB to test this against on ba3, #10015 (comment)

Cool, I added the option to solana-ledger-tool and also implemented From<&str> to share the logic. I confirmed the ba3 database can be opened with --wal-recovery-mode skip_any_corrupted_record without errors.
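
For readers following along, here is a minimal sketch of how a string-to-mode conversion shared by two binaries might look. It is not the code from this PR. The type names `WalRecoveryMode` and `UnknownMode` are hypothetical, and every accepted string other than `skip_any_corrupted_record` is an assumption modeled on RocksDB's four WAL recovery modes. The sketch also swaps in `TryFrom<&str>` for the `From<&str>` mentioned above, so that an unrecognized value is returned as an error.

```rust
use std::convert::TryFrom;
use std::fmt;

/// Hypothetical enum mirroring RocksDB's four WAL recovery modes.
/// The name and variants are illustrative, not taken from the PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WalRecoveryMode {
    TolerateCorruptedTailRecords,
    AbsoluteConsistency,
    PointInTime,
    SkipAnyCorruptedRecord,
}

/// Hypothetical error type carrying the rejected input string.
#[derive(Debug, PartialEq, Eq)]
pub struct UnknownMode(pub String);

impl fmt::Display for UnknownMode {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "unknown WAL recovery mode: {}", self.0)
    }
}

impl std::error::Error for UnknownMode {}

impl TryFrom<&str> for WalRecoveryMode {
    type Error = UnknownMode;

    /// Parse a command-line value into a mode. Only
    /// "skip_any_corrupted_record" appears in the conversation;
    /// the other three spellings are assumed.
    fn try_from(s: &str) -> Result<Self, Self::Error> {
        match s {
            "tolerate_corrupted_tail_records" => Ok(Self::TolerateCorruptedTailRecords),
            "absolute_consistency" => Ok(Self::AbsoluteConsistency),
            "point_in_time" => Ok(Self::PointInTime),
            "skip_any_corrupted_record" => Ok(Self::SkipAnyCorruptedRecord),
            other => Err(UnknownMode(other.to_string())),
        }
    }
}
```

Keeping the conversion on the type itself means both the validator and the ledger tool can parse the same flag value without duplicating the match arms.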

@sakridge sakridge force-pushed the wal-recovery-flag branch from f87189e to b96180d Compare July 6, 2020 18:26
@sakridge sakridge force-pushed the wal-recovery-flag branch from b96180d to a22aab0 Compare July 6, 2020 18:27
@sakridge sakridge added the automerge Merge this Pull Request automatically once CI passes label Jul 6, 2020
@sakridge sakridge merged commit 58a475b into solana-labs:master Jul 6, 2020
@sakridge sakridge deleted the wal-recovery-flag branch July 6, 2020 19:43
sakridge added a commit to sakridge/solana that referenced this pull request Aug 9, 2020
sakridge added a commit to sakridge/solana that referenced this pull request Aug 9, 2020
sakridge added a commit to sakridge/solana that referenced this pull request Aug 10, 2020
sakridge added a commit to sakridge/solana that referenced this pull request Aug 10, 2020
sakridge added a commit that referenced this pull request Aug 10, 2020
@ryoqun
Contributor

ryoqun commented Jul 27, 2021

@sakridge I think we can finally revisit this workaround for the "Corruption: SST file is ahead of WALs" errors from RocksDB.

Specifically, the relevant fix has finally landed upstream and is available in our codebase starting from #18904:

rust-rocksdb/rust-rocksdb#531 (comment)

Successfully merging this pull request may close these issues.

Failed to open ledger database: RocksDb(Error { message: "Corruption: SST file is ahead of WALs"