-
-
Notifications
You must be signed in to change notification settings - Fork 758
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix deadlock #1371
fix deadlock #1371
Conversation
Current coverage is 82.20% (diff: 89.18%)
@@ 1.0-maint #1371 diff @@
===========================================
Files 14 14
Lines 4433 4440 +7
Methods 0 0
Messages 0 0
Branches 776 777 +1
===========================================
+ Hits 3638 3650 +12
+ Misses 569 566 -3
+ Partials 226 224 -2
|
2c0333e
to
75a9378
Compare
From IRC discussion:
|
75a9378
to
972ffc2
Compare
any comments about the |
972ffc2
to
e37d497
Compare
please test / review, this is the last essential issue to do for 1.0.7 milestone. |
I'll review this offline |
self.id = self.call('open', self.location.path, create, lock_wait, lock, append_only) | ||
except self.RPCError as err: | ||
if err.remote_type == 'TypeError': | ||
raise self.NoAppendOnlyOnServer() from err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why remove this? Silently ignoring Repository settings on create doesn‘t sound nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's now either both exclusive and append_only params (new api) or neither of them (old api).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what i meant was that the old api code path should do:
if append_only:
raise self.NoAppendOnlyOnServer()
Thus keeping the safe guard from the original PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed, see new commit.
with self.repository: | ||
self.assert_raises(LockFailed, lambda: len(self.repository)) | ||
upgrade.assert_called_once_with() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why remove this? I think this case is still relevant? i.e. we still need to upgrade, and we still need not to hit any asserts or other problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't make much sense of this test.
I first completely removed the lock upgrading code.
Later I re-added it for older clients.
But the current code does not do lock upgrades.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it still does.
- Do a borg list with a remote repo that had it’s last writer crash and no other access since then
- client eventually calls Repository.get
- which calls Repository.get_transaction_id()
- which checks for crash by index_transaction_id != segments_transaction_id
- and calls Repository.replay_segments
- which calls self.prepare_txn
- which needs to upgrade
This is the one legitimate case i can come up where an always read only operation (that correctly uses an non exclusive lock) can get into prepare_txn.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now accounted for in my understanding.
Apart from the above this looks good. Just noticed though that change-passphrase and migrate-to-repokey use Repository.save_key which is not covered by any lock. Not sure we care enough to make a separate ticket for that though. |
@textshell please verify if the last changeset is what you meant. |
Yes, that‘s what i meant. Can you please resurrect the test (or similar) for the lock upgrade on replay? |
b09c12b
to
9967cc4
Compare
Test resurrected and differentiated into 2 tests for old and new clients. |
I see the last bit was already addressed :) |
@enkore guess the most delicate part is the compatibility. |
about lock upgrade/downgrade: I wanted to deprecate that, due to the obvious deadlock issues, but it seems that even with the new code we need the lock upgrade for somehow unexpectedly needing a write lock, e.g. when borg list finds a crashed repo and needs to do segment replay. So I think I am removing the deprecation notes and replacing them with a deadlock warning. |
if err.remote_type != 'TypeError': | ||
raise | ||
if append_only: | ||
raise self.NoAppendOnlyOnServer() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
k
collapse / merge? |
if we don't, we try to upgrade the lock. this is to support old clients talking to a new server and also to avoid bad consequences from coding mistakes for new clients.
yes |
wrt collapse: I would at least keep the first two commits uncollapsed. Renames are alsways better in a separate commit. looks good now |
lock upgrading is troublesome / may deadlock, do not advertise it.
2e891de
to
abebedc
Compare
…eplay old clients use self.exclusive = None and do a read->write lock upgrade when needed. new clients use self.exclusive = True/False and never upgrade. replay fakes an old client by setting self.exclusive = None to get a lock upgrade if needed.
abebedc
to
33e3348
Compare
see #1220 + commit comments.
The deadlock is fixed already.
But: needed a RPC API change to add the exclusive param. :-(
Also: Thinking about whether we maybe rather want a safe default for exclusive.
Currently it is usually False and you need to give exclusive=True - guess the opposite way would be safer, so one does not forget requesting an exclusive lock.
Nothing really bad would happen if one forgets though, the code has an automatic lock upgrade to an exclusive lock.
The lock upgrade/downgrade code was kept to support old clients.
Current code shall not use it, it is deprecated.