Fix P2P Deadlocks #957
Codecov Report
@@ Coverage Diff @@
## main #957 +/- ##
==========================================
+ Coverage 90.62% 90.66% +0.03%
==========================================
Files 60 60
Lines 30409 30407 -2
==========================================
+ Hits 27558 27568 +10
+ Misses 2851 2839 -12
the first commit message says:
and
but we should actually encourage the user to use a unique u64, like the lightning-net-tokio implementation of the descriptor does. i.e. discourage the use of a potentially non-unique ID such as an fd. We might even go as far as having
Hmm, I mean if you're writing it in C you can just use
There's definitely nothing wrong with using an fd, and it makes writing to this API super-duper trivial in C-like languages. The only reason we don't use the fd in tokio is that it's kinda awkward to actually fetch the fd from the TcpStream.
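For illustration, the unique-ID approach discussed above could look roughly like the sketch below. It is not the lightning-net-tokio code: `ConnDescriptor` and `NEXT_ID` are hypothetical names, and the fd is stored as a plain `i32` on the assumption of a Unix-like platform. The point is only that equality and hashing are based on a counter value that is never re-used, so a recycled fd cannot make a new connection look like an old one.

```rust
use std::hash::{Hash, Hasher};
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical process-wide counter; each new connection takes the next value.
static NEXT_ID: AtomicU64 = AtomicU64::new(0);

/// Hypothetical descriptor: the fd is kept only for I/O, identity comes from `id`.
#[derive(Clone, Debug)]
pub struct ConnDescriptor {
    id: u64,
    fd: i32,
}

impl ConnDescriptor {
    pub fn new(fd: i32) -> Self {
        ConnDescriptor { id: NEXT_ID.fetch_add(1, Ordering::Relaxed), fd }
    }

    pub fn id(&self) -> u64 {
        self.id
    }

    pub fn fd(&self) -> i32 {
        self.fd
    }
}

// Equality and hashing deliberately ignore `fd`, so a re-used fd never
// makes a new connection compare equal to an old one.
impl PartialEq for ConnDescriptor {
    fn eq(&self, other: &Self) -> bool {
        self.id == other.id
    }
}

impl Eq for ConnDescriptor {}

impl Hash for ConnDescriptor {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.id.hash(state);
    }
}
```

Using the fd itself as the identity remains possible, as noted above; this sketch just moves the uniqueness burden from the caller to a counter.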
455a733
to
9e49062
Compare
Rebased on upstream and added a commit to not recommend file descriptor use directly while still calling out the requirements if you do.
The only practical way to meet this requirement is to block disconnect_socket until any pending events are fully processed, leading to this trivial deadlock:

* Thread 1: select() woken up due to a read event
* Thread 2: Event processing causes a disconnect_socket call to fire while the PeerManager lock is held.
* Thread 2: disconnect_socket blocks until the read event in thread 1 completes.
* Thread 1: bytes are read from the socket and PeerManager::read_event is called, waiting on the lock still held by thread 2.

There isn't a trivial way to address this deadlock without simply making the final read_event call return immediately, which we do here. This also implies that users can freely call event methods after disconnect_socket, but only so far as the socket descriptor is different from any later socket descriptor (ie until the file descriptor is re-used).
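The idea of letting a late read_event return immediately can be sketched as follows. This is a toy model under stated assumptions, not the actual PeerManager: `ToyPeerManager`, its `u64` peer IDs, and the `bool` return value are all hypothetical. The disconnect path only forgets the peer and never waits for another thread, so a read that arrives afterwards finds nothing to do and returns.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Hypothetical stand-in for the peer manager; not LDK's actual type.
pub struct ToyPeerManager {
    // IDs of descriptors that are still considered connected.
    peers: Mutex<HashSet<u64>>,
}

impl ToyPeerManager {
    pub fn new() -> Self {
        ToyPeerManager { peers: Mutex::new(HashSet::new()) }
    }

    pub fn new_connection(&self, id: u64) {
        self.peers.lock().unwrap().insert(id);
    }

    /// Forgets the peer and returns at once. It does not wait for
    /// reads that may be in flight on other threads.
    pub fn disconnect(&self, id: u64) {
        self.peers.lock().unwrap().remove(&id);
    }

    /// Returns true if the bytes were handled, or false if the peer was
    /// already disconnected, in which case the call is a no-op.
    pub fn read_event(&self, id: u64, _data: &[u8]) -> bool {
        let peers = self.peers.lock().unwrap();
        if !peers.contains(&id) {
            // Late read after a disconnect: return immediately.
            return false;
        }
        // ... real message handling would go here ...
        true
    }
}
```

Because neither method blocks on the other thread's progress, the four-step interleaving listed above can no longer wedge: thread 1 simply acquires the lock once thread 2 releases it and sees that the peer is gone.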
9e49062
to
c5322f7
Compare
Rebased now that #948 landed.
Nice clean-up on the documentation! Left a bunch of nits on that commit. As documentation grows in size, it would be advantageous to give section headings to break it up for the user, making it easier to scan.
lightning/src/ln/peer_handler.rs
Outdated
/// be careful to ensure you don't have races whereby you might register a new connection with an
/// fd which is the same as a previous one which has yet to be removed via
/// PeerManager::socket_disconnected().
/// If applicable in your language, you probably want to just extend an int and put a file
s/probably want to just/can
This doc was rewritten in a later commit.
lightning/src/ln/peer_handler.rs
Outdated
/// If applicable in your language, you probably want to just extend an int and put a file
/// descriptor in a struct and implement this trait on it. Note, of course, that if you do so and
s/and put ... and implement/by wrapping ... and implementing
This doc was rewritten in a later commit.
4cc6682
to
8e2c6be
Compare
See the previous commit for more information.
There are various typo and grammatical fixes here, as well as concrete updates to correctness.
8e2c6be
to
05157b1
Compare
Squashed with no diff:
Diff from Val's ACK (all pretty trivial grammar and wording changes, so will merge after CI passes):
Based on #948, this drops the practically impossible requirements set by the peer handler; see the first commit for more details. This should fix the immediate issue in #951 somewhat as a side effect, while also fixing much rarer races.