Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Poor error reporting #6018

Closed
pjoc opened this issue Oct 19, 2021 · 5 comments
Closed

Poor error reporting #6018

pjoc opened this issue Oct 19, 2021 · 5 comments
Assignees
Labels
Milestone

Comments

@pjoc
Copy link

pjoc commented Oct 19, 2021

Have you checked borgbackup docs, FAQ, and open Github issues?

Yes

Is this a BUG / ISSUE report or a QUESTION?

Issue

System information. For client/server mode post info for both machines.

Intel x86_64

Your borg version (borg -V).

1.1.17

Operating system (distribution) and version.

Fedora 34

Hardware / network configuration, and filesystems used.

Local (not network), BTRFS on source and destination.

How much data is handled by borg?

1TB approx.

Full borg commandline that lead to the problem (leave away excludes and passwords)

borg --verbose prune --dry-run --keep-daily 7 /raid/Backups/Borg

Describe the problem you're observing.

Borg often produces a stack trace instead of an intelligible error message. Two recent examples:

  1. Running as normal user rather than root. Should just report the error directly.
  2. Running a regular backup when there's a problem with the cache. This took several days and a Fedora Bugzilla report to understand. Fixed by deleting the cache.

Can you reproduce the problem? If so, describe how. If not, describe troubleshooting steps you took before opening the issue.

100% reproducible.

Include any warning/errors/backtraces from the system logs

Killed stale lock [email protected].
Removed stale exclusive roster lock for host Bree@88230071515343 pid 79358 thread 0.
Removed stale exclusive roster lock for host Bree@88230071515343 pid 79358 thread 0.
Local Exception
Traceback (most recent call last):
File "/usr/lib64/python3.9/site-packages/borg/archiver.py", line 4703, in main
exit_code = archiver.run(args)
File "/usr/lib64/python3.9/site-packages/borg/archiver.py", line 4635, in run
return set_ec(func(args))
File "/usr/lib64/python3.9/site-packages/borg/archiver.py", line 177, in wrapper
return method(self, args, repository=repository, **kwargs)
File "/usr/lib64/python3.9/site-packages/borg/archiver.py", line 1652, in do_prune
with Cache(repository, key, manifest, lock_wait=self.lock_wait) as cache:
File "/usr/lib64/python3.9/site-packages/borg/cache.py", line 387, in new
return local()
File "/usr/lib64/python3.9/site-packages/borg/cache.py", line 379, in local
return LocalCache(repository=repository, key=key, manifest=manifest, path=path, sync=sync,
File "/usr/lib64/python3.9/site-packages/borg/cache.py", line 464, in init
self.open()
File "/usr/lib64/python3.9/site-packages/borg/cache.py", line 512, in open
self.rollback()
File "/usr/lib64/python3.9/site-packages/borg/cache.py", line 629, in rollback
self._do_open()
File "/usr/lib64/python3.9/site-packages/borg/cache.py", line 500, in _do_open
with IntegrityCheckedFile(path=os.path.join(self.path, 'chunks'), write=False,
File "/usr/lib64/python3.9/site-packages/borg/crypto/file_integrity.py", line 129, in init
self.file_fd = override_fd or open(path, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/root/.cache/borg/fdef744d1c0e14810c75f47f54ffae8f5c4b1ae3bff6481f3fd2ed9a0c11913f/chunks'

Platform: Linux Bree 5.14.11-200.fc34.x86_64 #1 SMP Sun Oct 10 14:39:31 UTC 2021 x86_64
Linux: Unknown Linux
Borg: 1.1.17 Python: CPython 3.9.7 msgpack: 0.5.6.+borg1
PID: 86810 CWD: /root
sys.argv: ['/usr/bin/borg', '--verbose', 'prune', '--dry-run', '--keep-daily', '7', '/raid/Backups/Borg']
SSH_ORIGINAL_COMMAND: None

@ThomasWaldmann
Copy link
Member

borg can be run as root as well as as normal user, both have valid use cases.

at some places borg catches well-known exceptions with precisely known root causes and gives shorter / prettier error messages.

the above doesn't look that easy though.

in case of doubt, borg gives tracebacks to give as much information as possible - these are especially useful for developers to identify potential bugs (short error messages hiding most informations are not very helpful for this).

"no such file or directory" pointing to a cache file should be clear enough to point to a cache issue. the question here is why your borg cache is losing files and the answer might point to a more fundamental problem unrelated to borg.

automatically fixing such issues by just deleting something and recreating it might sound like a comfortable solution at first, but would likely hide the more fundamental problem.

@pjoc
Copy link
Author

pjoc commented Oct 19, 2021

No doubt borg can be run as root or a normal user. However when run as a normal user while trying to access the root it should state that there's a permissions error rather than dumping a stack trace, like pretty much every other Linux command.

The cache problem is obviously more complex and could have a number of explanations, but the actual error message complains about stale locks, followed by a stack trace. It's only by looking closely that one sees a "file not found" error. It took me a long time to understand that this wasn't an internal Borg bug, which is something I normally associate with stack traces. I don't expect Borg to automatically fix these things, but I would like to be told about them more clearly.

@jimparis
Copy link
Contributor

It may help to print the short form of the exception before the traceback, e.g.:

--- a/src/borg/archiver.py
+++ b/src/borg/archiver.py
@@ -4999,11 +4999,13 @@ def main():  # pragma: no cover
             tb = '\n'.join('Borg server: ' + l for l in e.sysinfo.splitlines())
             tb += "\n" + sysinfo()
             exit_code = EXIT_ERROR
-        except Exception:
+        except Exception as e:
             msg = 'Local Exception'
             msgid = 'Exception'
             tb_log_level = logging.ERROR
-            tb = '%s\n%s' % (traceback.format_exc(), sysinfo())
+            tb = '%s: %s\n' % (type(e).__qualname__, e)
+            tb += '\nIf reporting bugs, please include the following:\n'
+            tb += '%s\n%s' % (traceback.format_exc(), sysinfo())
             exit_code = EXIT_ERROR
         except KeyboardInterrupt:
             msg = 'Keyboard interrupt'

(maybe also integrated with other error printing paths around the same place)

@pjoc
Copy link
Author

pjoc commented Oct 20, 2021 via email

@ThomasWaldmann ThomasWaldmann added this to the 2.0.0b6 milestone Apr 7, 2023
@ThomasWaldmann ThomasWaldmann self-assigned this Apr 7, 2023
ThomasWaldmann added a commit to ThomasWaldmann/borg that referenced this issue Apr 7, 2023
as a first step, i moved all the traceback formatting
to format_tb.

also, it **first** prints the error and then the traceback
as additional information for a bug report, as suggested
by @jimparis in that ticket.
ThomasWaldmann added a commit that referenced this issue Apr 15, 2023
…ster

refactor toplevel exception handling, see #6018
@ThomasWaldmann
Copy link
Member

Maybe #7513 fixes this, but guess it will need some testing in practice.

Just reopen a new ticket if it doesn't.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants