Borg: Not a Valid Repository: Files Missing (e.g., Config, Hints, Index) (retry) #6865

Closed
raywood1 opened this issue Jul 16, 2022 · 14 comments

@raywood1

raywood1 commented Jul 16, 2022

As advised in the subreddit, I am reposting my issue here, as follows:


I have created a Borg backup in a location like /media/veracrypt1/2022-01-01/BorgRepository. Normally, in that kind of folder, Borg would create files with names like config, hints.1523, index.1523, and integrity.1523, along with a data subfolder containing the Borg archive files, with names like 1523.

When I created a backup of this folder, I neglected to include those top-level files (e.g., hints.1523). Now I have restored that backup. That folder now contains only the data subfolder.

I have tried to inspect the contents of that repository using sudo borg list /media/veracrypt1/2022-01-01/BorgRepository. The result is an error: "/media/veracrypt1/2022-01-01/BorgRepository is not a valid repository."

Is it possible to reconstruct those missing top-level files, so as to make this a working repository?


Advice in response to that post pointed me toward an issue here in GitHub. Following the procedure suggested there, I created a README and a config file at the specified location. Then I ran sudo borg check --repair --progress /media/veracrypt1/2022-01-01/BorgRepository. That produced this output:

Exception ignored in: <function Repository.__del__ at 0x7fa7faac28c0>
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/borg/repository.py", line 190, in __del__
    assert False, "cleanup happened in Repository.__del__"
AssertionError: cleanup happened in Repository.__del__
Local Exception
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/borg/archiver.py", line 5089, in main
    exit_code = archiver.run(args)
  File "/usr/lib/python3/dist-packages/borg/archiver.py", line 5020, in run
    return set_ec(func(args))
  File "/usr/lib/python3/dist-packages/borg/archiver.py", line 168, in wrapper
    with repository:
  File "/usr/lib/python3/dist-packages/borg/repository.py", line 200, in __enter__
    self.open(self.path, bool(self.exclusive), lock_wait=self.lock_wait, lock=self.do_lock)
  File "/usr/lib/python3/dist-packages/borg/repository.py", line 456, in open
    self.id = unhexlify(self.config.get('repository', 'id').strip())
binascii.Error: Odd-length string

Platform: Linux UbuntuVM 5.15.0-41-generic #44-Ubuntu SMP Wed Jun 22 14:20:53 UTC 2022 x86_64
Linux: Unknown Linux  
Borg: 1.2.0  Python: CPython 3.10.4 msgpack: 1.0.3 fuse: pyfuse3 3.2.0 [pyfuse3,llfuse]
PID: 14279  CWD: /home/ray
sys.argv: ['/usr/bin/borg', 'check', '--repair', '--progress', '/media/veracrypt1/2022-01-01/BorgRepository']
SSH_ORIGINAL_COMMAND: None

This is Ubuntu 22.04 LTS, booted from a USB drive. I've tried rebooting with a different USB; it yields roughly the same result. Both of these have successfully run Borg commands in the past.

@ThomasWaldmann
Member

ThomasWaldmann commented Jul 16, 2022

So, did you use borg encryption for that repo? If so, do you still have (a backup of) the key?

The unhexlify traceback means that the repo id in the config does not have the correct length (256 bit == 64 hex digits).
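To make the length requirement concrete, here is a minimal Python sketch of the same kind of check. The helper name `parse_repo_id` is hypothetical and is not part of Borg; it only mirrors the `unhexlify` call visible in the traceback and adds an explicit length test.

```python
from binascii import Error as HexError, unhexlify


def parse_repo_id(hex_id: str) -> bytes:
    """Decode a repository id, raising ValueError unless it is 64 hex digits.

    Hypothetical helper for illustration; Borg itself simply calls unhexlify().
    """
    hex_id = hex_id.strip()
    try:
        raw = unhexlify(hex_id)  # fails on odd-length or non-hex input
    except HexError as exc:
        raise ValueError(f"id is not valid hex: {exc}") from exc
    if len(raw) != 32:  # 256 bits == 32 bytes == 64 hex digits
        raise ValueError(f"id has {len(hex_id)} hex digits, expected 64")
    return raw
```

A five-character value such as `00001` fails at the `unhexlify` step with the same "Odd-length string" message seen above, because every byte needs two hex digits.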

@raywood1
Author

raywood1 commented Jul 16, 2022

Sorry, forgot to answer that: I didn't use Borg encryption.

I should clarify: the other issue tells me to put this in the config file:

[repository]
version = 1
segments_per_dir = 10000 
max_segment_size = 5242880
append_only = 0
id = <some hex copied from key file>

So that's literally, exactly what I put. The last line invites user input, so I changed it to id = 00001. Later, I realized that the first line was probably supposed to be something other than [repository], but I wasn't sure exactly what. In this case, the name of the repository is BorgRepository, so maybe I should have used either BorgRepository or [BorgRepository] - but the machine isn't available right now, so I can't test it.

@ThomasWaldmann
Member

The id is not an arbitrary string, but it must be exactly 64 hex digits.

Have a look at a valid config file you can get by just running borg init -e none my-tmp-repo.

The first line is a constant, not a repository name.
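As a rough illustration of those two points, the following Python sketch checks a hand-written config before it is handed to Borg. The function `check_repo_config` is a hypothetical helper, not a Borg API; it only assumes that the file is INI-style with a literal `[repository]` section, as in the snippet quoted earlier.

```python
import configparser
import string


def check_repo_config(text: str) -> list[str]:
    """Return a list of problems found in a Borg-style repository config.

    Hypothetical pre-flight check; it does not replace `borg check`.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    # The section header is the constant "repository", not the repo's name.
    if not parser.has_section("repository"):
        return ["missing the constant [repository] section header"]
    problems = []
    repo_id = parser.get("repository", "id", fallback="").strip()
    if len(repo_id) != 64 or any(c not in string.hexdigits for c in repo_id):
        problems.append("id must be exactly 64 hex digits")
    return problems
```

Called on a config that starts with `[BorgRepository]`, or one that contains `id = 00001`, this returns a non-empty list of problems; a config copied from a freshly initialized repository should return an empty list.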

@ThomasWaldmann
Member

And you're lucky you did not use repokey encryption - if you lose the config with the encryption key, you cannot recover anything inside this repo (that's the reason why borg init strongly recommends making a key backup).

@raywood1
Author

I looked at the config file used in a couple other Borg archives. Those have max_segment_size = 524288000 (with those two extra zeros at the end). They also have two items not appearing above: storage_quota = 0 and additional_free_space = 0.

I copied one of those config files to this archive that doesn't have one, changing only the final digit in the ID number. Then I re-ran the command shown above. That worked: I was able to proceed with the borg check operation.
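For anyone repeating this step, here is a hedged Python sketch that does the same thing programmatically: it takes the text of a known-good config as a template and swaps in a fresh 64-hex-digit id instead of editing one digit by hand. The function name `config_with_new_id` is hypothetical. The sketch also assumes, based only on the result reported here, that an unencrypted repository accepts a new id; it is not a statement about how Borg treats ids in general.

```python
import configparser
import io
import secrets


def config_with_new_id(template_text: str) -> str:
    """Copy a Borg-style config, replacing only the repository id.

    Assumption: the template comes from a valid, unencrypted repository.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(template_text)
    # 32 random bytes rendered as hex give exactly 64 hex digits.
    parser.set("repository", "id", secrets.token_hex(32))
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()
```

All other settings from the template, such as `max_segment_size`, `storage_quota`, and `additional_free_space`, are carried over unchanged.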

Unfortunately, that operation generated many errors indicating "New missing file chunk detected" and "Replacing with all-zero chunk." It generated ten new files at the end of the set, mostly full-sized (i.e., ~500 MiB, matching max_segment_size = 524288000), with names (i.e., numbers) that would conflict with files at the start of the next archive.

@ThomasWaldmann
Member

ThomasWaldmann commented Jul 17, 2022

Missing file chunks means that your repo/data/ directory copy was corrupted.

Not sure what you mean by your last sentence.

@raywood1
Author

I was afraid of that.

The last sentence was supposed to mean that Archive C originally ended with files with names like 781, 782, and 783, but the borg check process added another ten files, with names like 784, 785 ... 792, 793.

I assume it felt free to do that because these files were restored from backup. As far as Borg knew, that was the end of the repository. In reality, however, the next archive in the sequence (Archive D) started with something like 786, 787, 788 ... So now I couldn't restore Archive D's files to the same folder as the files for Archive C. Doing so would overwrite some of these newly added files for Archive C.

The actual numbering was not as regular as that, but the problem of filename duplication was there nonetheless.

The solution would perhaps be to make sure to restore all archives from backup before attempting borg check on any of them. Perhaps then the newly created additions would be created at the end of the final set, with names like 935, 936 ...

But that wouldn't change the apparent fact that the Borg archive was borked.

@ThomasWaldmann
Member

ThomasWaldmann commented Jul 17, 2022

The distribution of archive chunks over segment files is not strictly linear; it is not as simple as you think it is.

What you should do is start from a full/complete copy of a valid (or "as good as it gets") state of the data directory and then run borg check --repair on that repo. You should also run borg delete --cache-only repo to invalidate all caches that were built during your last attempt.
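The two commands from this advice can be laid out as a small Python sketch. It only builds and prints the command lines; nothing is executed. The function name `repair_commands` is hypothetical, the order (cache reset first, then repair) is an assumption rather than something stated above, and `--progress` is taken from the command in the original report.

```python
import shlex


def repair_commands(repo: str) -> list[list[str]]:
    """Return the borg invocations suggested in this thread, as argv lists."""
    return [
        # Drop local caches built during the earlier, failed attempt.
        ["borg", "delete", "--cache-only", repo],
        # Then repair, starting from a complete copy of the data directory.
        ["borg", "check", "--repair", "--progress", repo],
    ]


# Print the commands for review instead of running them.
for cmd in repair_commands("/media/veracrypt1/2022-01-01/BorgRepository"):
    print(shlex.join(cmd))
```

Printing first is a deliberate choice: `borg check --repair` modifies the repository, so it is worth reading the exact command line and confirming that the repository copy is complete before running anything.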

@raywood1
Author

I didn't mean to be commenting on the distribution of archive chunks. I was talking only about the numbered Borg archive files. Viewed in Windows, the numbered filenames conflict, due to the addition of new files at the end of Archive C. Those new files, created by borg check, have the same names as some of the first files in Archive D.

I burned the backup to Blu-ray, one archive at a time. So the BD-R discs contain a folder with the Borg archive files for Archive A (say, 1-229), and then another folder for Archive B (say, 249-523), and so forth. I restored Archives A through C from BD-R, then ran borg check on Archive C. That process created new files at what it believed to be the end of the repository (e.g., 784, 785 ... ), when in fact those numbers were already claimed by Archive D (which I hadn't yet restored from BD-R).
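The filename conflict described here can be detected before any files are merged. The Python sketch below lists the numbered files (segment files, in Borg's terminology) that exist in two directory trees at once. The helper names are hypothetical, and the sketch assumes only that the files of interest have purely numeric names, as in the examples above.

```python
from pathlib import Path


def numbered_files(directory: Path) -> set[int]:
    """Collect the numbers of all purely numeric filenames under a directory."""
    return {
        int(p.name)
        for p in directory.rglob("*")
        if p.is_file() and p.name.isascii() and p.name.isdigit()
    }


def colliding_numbers(dir_a: Path, dir_b: Path) -> list[int]:
    """Return the file numbers present in both trees, in ascending order."""
    return sorted(numbered_files(dir_a) & numbered_files(dir_b))
```

If `colliding_numbers` returns a non-empty list for the repaired folder and a folder restored from the next disc, copying one into the other would overwrite files. That is a sign to start over from a complete restore rather than to merge.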

Don't worry, I can make this more confusing if necessary. Just give me time ...

@ThomasWaldmann
Member

Can you please read our docs about the terms "archive", "segment files" and "repository"?

You seem to use these terms differently and that's quite confusing.

@ThomasWaldmann
Member

Guess this is solved?

@NEVARLeVrai

Hello, my Borg config file is missing, but I have the encryption key. How can I reconstruct it?

@ThomasWaldmann
Member

@NEVARLeVrai you can create a temporary repository of the same type (repokey or keyfile) at some other place and use that config as a template.

You can copy the repo id and the key material from your key backup.

@NEVARLeVrai

It's okay, I found the config file with Recuva. It's checking the integrity now. It's very, very slow, and I don't know if it will work: I have 700 GB of data and it's on an HDD.

Anyway, thanks for helping me.
