Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wallet: ignore chainStateFlushed notifications while attaching chain #24984

Merged
merged 1 commit into from
Apr 28, 2022

Conversation

mzumsande
Copy link
Contributor

Fixes #24487

When a rescan is performed during CWallet::AttachChain() (e.g. when loading an old wallet) but this is interrupted by a shutdown signal, the wallet will currently stop the rescan, receive a chainStateFlushed signal, set the saved best block to the tip and shut down. At next startup, the rescan is not continued or repeated because of this. But some blocks have never been scanned by the wallet, which could lead to an incorrect balance.

Fix this by ignoring chainStateFlushed notifications until the chain is attached. Since CWallet::chainStateFlushed is being manually called by AttachChain() anyway after finishing with the rescan, it is not a problem if intermediate notifications are ignored.

Manual rescans started / aborted by the rescanblockchain / abortrescan RPCs are not affected by this.

I didn't choose alternative ways of fixing this issue that would delay the validationinterface registration or change anything else about the handling of blockConnected signals for the reasons mentioned in this existing comment.

@mzumsande mzumsande marked this pull request as draft April 26, 2022 01:04
@mzumsande
Copy link
Contributor Author

mzumsande commented Apr 26, 2022

This breaks feature_pruning.py, putting into draft until I have analyzed it.

Fixed.

Copy link
Contributor

@S3RK S3RK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With current implementation we don't save the rescan progress and on the next load we will have to start from scratch. Is it possible to save the last processed blocked at shutdown?

@mzumsande mzumsande force-pushed the 202204_wallet_rescan branch from a384b55 to 2052e3a Compare April 26, 2022 08:13
@fanquake fanquake requested a review from ryanofsky April 26, 2022 08:39
@mzumsande
Copy link
Contributor Author

With current implementation we don't save the rescan progress and on the next load we will have to start from scratch. Is it possible to save the last processed blocked at shutdown?

Yes, that's true. I think it should be possible to call chainStateFlushed with a locator to the block that was last processed - I'll look into that. Of course it would save some time only in the rather rare situation of a clean shutdown during rescan (the bug must have existed for years but wasn't reported until very recently), so I'm not sure how important a feature like this would actually be for users.

@mzumsande mzumsande marked this pull request as ready for review April 26, 2022 10:27
@laanwj
Copy link
Member

laanwj commented Apr 26, 2022

Concept ACK

so I'm not sure how important a feature like this would actually be for users.

Right, I'm concerned about unexpected behavior. I don't think optimality is important in this case.

@DrahtBot
Copy link
Contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #23997 (wallet: avoid rescans under assumed-valid blocks by jamesob)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@laanwj laanwj added the Bug label Apr 26, 2022
@S3RK
Copy link
Contributor

S3RK commented Apr 26, 2022

Yes, that's true. I think it should be possible to call chainStateFlushed with a locator to the block that was last processed - I'll look into that. Of course it would save some time only in the rather rare situation of a clean shutdown during rescan (the bug must have existed for years but wasn't reported until very recently), so I'm not sure how important a feature like this would actually be for users.

The wallet already tracks last processed block, why can't we just write that form chainStateFlushed handler regardless of what locator was passed in. That way we don't need any bool flags I believe.

@mzumsande
Copy link
Contributor Author

The wallet already tracks last processed block, why can't we just write that form chainStateFlushed handler regardless of what locator was passed in. That way we don't need any bool flags I believe.

Not sure I understand this right - do you mean why locators are used in general instead of single block hashes? I think that's because we want to be flexible and not dependent on a specific block: There may be a reorg so a given block is no longer part of the chain when the wallet is loaded, or maybe the the wallet called the chainStateFlushed handler manually without the actual chainstate being flushed, and then an unclean shutdown happens so that the wallet would be ahead of the chain at next startup.

In any case, any block must be converted to a locator anyway to be stored in BESTBLOCK_NOMERKLE in order not to break the db format.

Also note that AttachChain() updates m_last_block_processed optimistically to the tip (before actually processing any missing blocks) and not incrementally. So if it gets interrupted early, m_last_block_processed is not correct.

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK 2052e3a. This is a straightforward fix for the bug described in #24487 where a wallet could skip scanning blocks if is shut down in the middle of a sync and a chainStateFlushed notification was received during the sync. It would be nice to write a test for this but probably would be tricky to write.

S3RK raises a great point in #24984 (review) that the wallet doesn't save the scan position during syncs, so if it is interrupted, parts of the scan could be repeated unnecessarily next time the wallet is loaded. This issue would be good to fix, and I created #25010 to track it, but I think it's basically a separate issue with a separate fix. This PR makes that issue more likely to happen, but in the cases it does this, current behavior of skipping unscanned blocks is worse than new behavior of rescanning already-scanned blocks, so in every case this PR should be a strict improvement over the status quo.

@achow101
Copy link
Member

ACK 2052e3a

@S3RK
Copy link
Contributor

S3RK commented Apr 28, 2022

Agree that "this PR should be a strict improvement over the status quo".

The only thing that still slightly worries me is two return false statements between setting and unsetting the m_attaching_chain flag. In both cases rescan fails, so I guess the behaviour is still correct, but it looks fragile.

@mzumsande
Copy link
Contributor Author

It would be nice to write a test for this but probably would be tricky to write.

Yes, I'd like that too. the problem is how to deal with the racing condition, timing the shutdown to be during the sync and not after. I played around a bit with bpftrace / uprobes after an idea by @jnewbery (e.g. hook into a function such as SyncTransaction() and send a SIGINT when it is invoked the 20th time) and that is really cool and seems to work, but obviously still very experimental and not ready for the CI.

The only thing that still slightly worries me is two return false statements between setting and unsetting the m_attaching_chain flag. In both cases rescan fails, so I guess the behaviour is still correct, but it looks fragile.

Yes, we currently rely on this to be the case. We aren't only concerned about accidental signals during the sync but also after it, caused by the interrupting signal: In case of a shutdown by SIGINT, Shutdown() in Init will create a chainStateFlushed signal which is processed by the wallet. At this point in time, m_attaching_chain must still be true or the bug is not fixed.

Copy link
Contributor

@w0xlt w0xlt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review ACK 2052e3a

If I understand correctly, the problem is that at the end of CWallet::AttachChain(), CWallet::chainStateFlushed(...) is called passing the chain tip as parameter, even if abort or shutdown is requested during the wallet scan (CWallet::ScanForWalletTransactions).

I think this solution achieves the goal of preventing the wallet from being in an inconsistent state by forcing it to write the tip only after the scan completes.

Ideally there would be a functional test for this, but I don't see how to simulate this on regtest.

@achow101 achow101 merged commit 4cf9fa0 into bitcoin:master Apr 28, 2022
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Apr 29, 2022
…hile attaching chain

2052e3a wallet: ignore chainStateFlushed notifications while attaching chain (Martin Zumsande)

Pull request description:

  Fixes bitcoin#24487

  When a rescan is performed during `CWallet::AttachChain()` (e.g. when loading an old wallet) but this is interrupted by a shutdown signal, the wallet will currently stop the rescan, receive a `chainStateFlushed` signal, set the saved best block to the tip and shut down. At next startup, the rescan is not continued or repeated because of this. But some blocks have never been scanned by the wallet, which could lead to an incorrect balance.

  Fix this by ignoring `chainStateFlushed` notifications until the chain is attached. Since `CWallet::chainStateFlushed` is being manually called by `AttachChain()` anyway after finishing with the rescan, it is not a problem if intermediate notifications are ignored.

  Manual rescans started / aborted by the `rescanblockchain` / `abortrescan` RPCs are not affected by this.

  I didn't choose alternative ways of fixing this issue that would delay the validationinterface registration or change anything else about the handling of `blockConnected` signals for the reasons mentioned in [this existing comment](https://github.com/bitcoin/bitcoin/blob/master/src/wallet/wallet.cpp#L2937-L2944).

ACKs for top commit:
  achow101:
    ACK 2052e3a
  ryanofsky:
    Code review ACK 2052e3a. This is a straightforward fix for the bug described in bitcoin#24487 where a wallet could skip scanning blocks if is shut down in the middle of a sync and a chainStateFlushed notification was received during the sync. It would be nice to write a test for this but probably would be tricky to write.
  w0xlt:
    Code Review ACK bitcoin@2052e3a

Tree-SHA512: a6186173d72b26bd4adbf2315e11af365004a723ea5565a0f7b868584dc47c321a6572eafaeb2420bd21eed1c7ad92b47e6218c5eb72313a3c6bee58364e2247
achow101 added a commit that referenced this pull request May 16, 2022
…ing for signals

ba10b90 Wallet: Ensure m_attaching_chain is set before registering for signals (Luke Dashjr)

Pull request description:

  Avoids a race where chainStateFlushed could be called before rescanning began, yet rescan gets interrupted or fails

  Followup for #24984 avoiding a race between registering and setting the flag.

ACKs for top commit:
  mzumsande:
    Code Review ACK ba10b90
  achow101:
    ACK ba10b90

Tree-SHA512: 1d2fa2db189d3e87f2d0863cf2ab62166094436483f0da16760b1083a4743bf08e476a3277e0d36564213d65dd6f0a1fc16a4bf68d3338c991a14d1de9fc0fee
@mzumsande mzumsande deleted the 202204_wallet_rescan branch May 19, 2022 13:27
luke-jr pushed a commit to luke-jr/bitcoin that referenced this pull request May 21, 2022
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request May 28, 2022
…egistering for signals

ba10b90 Wallet: Ensure m_attaching_chain is set before registering for signals (Luke Dashjr)

Pull request description:

  Avoids a race where chainStateFlushed could be called before rescanning began, yet rescan gets interrupted or fails

  Followup for bitcoin#24984 avoiding a race between registering and setting the flag.

ACKs for top commit:
  mzumsande:
    Code Review ACK ba10b90
  achow101:
    ACK ba10b90

Tree-SHA512: 1d2fa2db189d3e87f2d0863cf2ab62166094436483f0da16760b1083a4743bf08e476a3277e0d36564213d65dd6f0a1fc16a4bf68d3338c991a14d1de9fc0fee
@bitcoin bitcoin locked and limited conversation to collaborators Jun 9, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Automatic wallet rescan skipped after abort
7 participants