Can borg repair broken repository chunks #148

Closed
tgharold opened this issue Aug 11, 2015 · 7 comments

@tgharold
Contributor

In the case of bit-rot in the repository, where a chunk no longer matches its hash value, is it possible for the borg client to upload a new chunk to replace/fix the broken chunk?

It would not help in the case where a file has changed over time and the chunk came from an older version of the file. But for files that are slow to change, repairing the chunk in the repository would keep older archives from being broken. So the usefulness would depend on how fast your data set changes, and whether the bit-rot affected a newer chunk or a really ancient chunk.

@tve

tve commented Dec 12, 2015

Duplicity can use par2 error correction files in order to repair broken and even missing chunks of old backup files. I think that something like that would be even more important with borg because of the deduplication. With duplicity one pretty much ends up with multiple copies of files due to the full-incremental scheme, which borg avoids. Despite being pretty careful, I have had bit rot happen in my backups due to HD failure, rsync network errors, and other accidents. If I could dial-in a % overhead to add some level of redundancy then I'd do that.

@RonnyPfannschmidt
Contributor

My understanding is that all that is needed is force-writing said chunk.
In the segments it is then represented as a delete/update pair and would replace the corrupt data completely on the next vacuum.

This needs an implementation.
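
To illustrate the delete/update idea, here is a minimal sketch of an append-only log in which a forced write supersedes an older entry and compaction ("vacuum") drops the stale data. The entry layout and the names `force_put`, `replay` and `compact` are assumptions made for this example; they are not Borg's segment format or API.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical log entry: (operation, chunk id, payload). Not Borg's on-disk format.
Entry = Tuple[str, bytes, Optional[bytes]]


def force_put(log: List[Entry], chunk_id: bytes, data: bytes) -> None:
    """Append a DELETE/PUT pair so the new data supersedes any older entry."""
    log.append(("DELETE", chunk_id, None))
    log.append(("PUT", chunk_id, data))


def replay(log: List[Entry]) -> Dict[bytes, bytes]:
    """Rebuild the current chunk id -> data view by replaying the log in order."""
    live: Dict[bytes, bytes] = {}
    for op, chunk_id, payload in log:
        if op == "PUT":
            assert payload is not None
            live[chunk_id] = payload
        elif op == "DELETE":
            live.pop(chunk_id, None)
    return live


def compact(log: List[Entry]) -> List[Entry]:
    """Drop superseded entries, keeping a single PUT per live chunk."""
    return [("PUT", cid, data) for cid, data in replay(log).items()]
```

After `force_put`, the corrupt payload is still physically present in the log but no longer reachable; only `compact` removes it.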

@ThomasWaldmann
Member

please continue discussion about adding redundancy there: #225
and stay here on topic, see first post.

@ThomasWaldmann
Member

From the mailing list:

On 06.07.2016 17:57, Tomasz Melcer wrote:
> On 06.07.2016 17:44, Marian Beermann wrote:
>> The data files are another story. There's no forward error correction in
>> Borg itself, so errors can be detected but only some minor errors can be
>> corrected. "borg check --repair" will replace corrupted data chunks with
>> runs of zeroes of the same length, while saying where it did that.
>> Corrupted commit tags can take some more data with them to nirvana.
>
> So, please correct me if I'm wrong:
>
> Let's say there was a bad block inside one of the data files. After
> recovery I can just run `borg check --repair`, and while some of the
> data will be lost, other chunks will still be there. Therefore, I can
> fully recover all past backups except for the chunk(s) hit by the bad
> blocks.

Well, it always depends on exactly which block is hit and how.

You may lose a lot of FS structure through one unlucky bad block,
equating to tons of data loss, or maybe just a data block somewhere.
If it's only in the data, then check --repair has an easy job and it'd
really be only that block; if it also hits structure metadata, more chunks
in the same file may be lost.
If it hits a commit, more may be lost.
If it hits the one chunk you're interested in right now, you have a
problem, and so on.
If it hits the metadata of an archive, the archive may lose some files
upon repair or may be unrepairable, but no data *per se* would be lost.

But in principle, yes, and from a pure statistics view the likelihood of
data itself being affected instead of metadata is high for a single block.

>
> One more question. If the next backup will happen to have the same
> chunk, will it be added to the backup, filling the missing part for
> older backups? In some scenarios I find it likely that a possible bad
> block could just hit a chunk of one of the files that are still
> available on the live system.
>

Yes, but the old backup archive will still have a run of zeroes in it.

To explain why, here is a rough sketch of how data storage works in Borg (also
explained in the internals docs in more detail). A file item has a chunk
ID list, which lists the chunks containing the file data in order (and
some metadata). If check finds that one of these is gone, then it
creates a *new* chunk, same size as the corrupted one, made from zeroes,
stores it, and *edits* the chunk ID list of the affected file to refer
to the zeroes chunk instead.

If you do a new backup after that, then the corrupted chunk simply won't
be in the repository any more and will be newly stored by the backup.
But since creating a new backup doesn't touch old archives the old
archive will still have a run of zeroes there.
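
The behaviour described above can be sketched roughly as follows. This is a simplified model, not Borg's code: the repository is a plain dict, chunk IDs are plain SHA-256 (an assumption for the example; the real ID function differs), and `repair_item` is a hypothetical name.

```python
import hashlib
from typing import Dict, List, Tuple

ChunkRef = Tuple[bytes, int]  # (chunk id, size in bytes); simplified


def chunk_id(data: bytes) -> bytes:
    # Assumption: plain SHA-256 stands in for the repository's real ID function.
    return hashlib.sha256(data).digest()


def repair_item(chunks: List[ChunkRef], repo: Dict[bytes, bytes]) -> List[ChunkRef]:
    """Return a chunk list in which every missing chunk is replaced by zeroes."""
    repaired: List[ChunkRef] = []
    for cid, size in chunks:
        if cid in repo:
            repaired.append((cid, size))
            continue
        zeros = bytes(size)      # same length as the lost chunk
        zid = chunk_id(zeros)
        repo[zid] = zeros        # store the replacement chunk
        repaired.append((zid, size))  # the list now refers to the zeroes chunk
    return repaired
```

Because the repaired list points at the zeroes chunk's ID, a later backup that stores the original chunk again does not change what the old archive refers to.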

Arguably this could be improved (e.g. when repairing a file, edit the
chunk ID list, but also store how it was edited - when the affected
chunks are stored back in the repository these files may be healed
"retrospectively" then.)

Cheers, Marian

@ThomasWaldmann
Member

How about adding a "chunks_healthy" key to the item metadata dict with a copy of the original (correct, but partly unavailable) chunk ids in case we choose to "repair" it (by replacing some chunks with all-zero chunks and modifying the "chunks" key with these replacement IDs)?

The copy into the chunks_healthy key would be only made if there is no such key already - so that we do not lose information in case such a repair happens multiple times.

A healing operation could then check for the chunks_healthy key. If it is present, it would compute the set difference with chunks and check whether all the missing chunks are in the repo NOW; if so, it would copy the chunks_healthy value into chunks and remove the chunks_healthy key.
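
A minimal sketch of this proposal, assuming an item is a plain dict and the repository is modelled as a set of chunk IDs. The function names `mark_repaired` and `try_heal` are hypothetical and do not reflect what was eventually implemented.

```python
from typing import Dict, List, Set

Item = Dict[str, List[bytes]]


def mark_repaired(item: Item, replacement_chunks: List[bytes]) -> None:
    """Record a repair; keep the original IDs only if none are saved yet."""
    # setdefault leaves an existing chunks_healthy untouched, so repeated
    # repairs never overwrite the first (correct) list.
    item.setdefault("chunks_healthy", list(item["chunks"]))
    item["chunks"] = list(replacement_chunks)


def try_heal(item: Item, repo_ids: Set[bytes]) -> bool:
    """Restore the original chunk list if all missing chunks are back."""
    healthy = item.get("chunks_healthy")
    if healthy is None:
        return False  # item was never repaired
    missing = set(healthy) - set(item["chunks"])
    if not missing <= repo_ids:
        return False  # some original chunks are still unavailable
    item["chunks"] = healthy
    del item["chunks_healthy"]
    return True
```

Healing here is all-or-nothing per item: the original list is restored only once every missing chunk is available again.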

@ThomasWaldmann
Member

1.0.4 will already remember the good/original chunk IDs.
Healing functionality based on that is still TODO.

@ThomasWaldmann ThomasWaldmann removed their assignment Jul 7, 2016
@ThomasWaldmann ThomasWaldmann modified the milestones: 1.0.6 fixes for 1.0-maint, 1.0.5 fixes for 1.0-maint Jul 7, 2016
@enkore enkore modified the milestones: 1.0.6 fixes, 1.0.7 fixes Jul 8, 2016
@ThomasWaldmann ThomasWaldmann self-assigned this Jul 9, 2016
@ThomasWaldmann ThomasWaldmann modified the milestones: 1.0.6 fixes, 1.0.7 fixes Jul 9, 2016
ThomasWaldmann added a commit to ThomasWaldmann/borg that referenced this issue Jul 9, 2016
also: improve logging for archive check
@enkore
Contributor

enkore commented Jul 10, 2016

Fixed in #1300

@enkore enkore closed this as completed Jul 10, 2016