Further e2e fixes for reliability #198

Merged
merged 2 commits into from
Aug 19, 2024


@ikatson ikatson commented Aug 19, 2024

I found the root cause of e2e failures.

It was around this line: debug!("nothing left to do, disconnecting peer");. The server was disconnecting the peer whenever the server itself had the full torrent.

However, the only reason it worked at all is that "peer_chunk_requester" was stuck in "wait_for_bitfield" forever if the peer never sent a bitfield in the first place, so it never reached the buggy line.

Sequence:

  1. The server starts seeding.
  2. The first peers connect. They don't send a bitfield because they have nothing.
  3. The server's "peer_chunk_requester" blocks forever in "wait_for_bitfield" and never reaches the line that disconnects the peer.
  4. The (intentionally) bad test peers take too long, send garbage, etc.
  5. The good peers run out of pieces to request and hit the "sleep 10s" line.
  6. The server's rwtimeout is also set to 10 seconds, so it disconnects the good peers because they weren't doing anything for 10 seconds.
  7. The good peers reconnect. By that time they already have a bitfield to send.
  8. The moment they send the bitfield, the server hits the "nothing to do" line and disconnects the peer again.
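
To make the interaction in the sequence above easier to follow, here is a minimal sketch of the old decision logic as described in this comment. It is not the actual code from the repository: the enum, the function name requester_outcome, and both parameters are hypothetical names introduced only for illustration, and the real "peer_chunk_requester" is an async task rather than a pure function.

```rust
/// Hypothetical model of what the server-side requester ends up doing
/// for one peer connection, based only on the description above.
#[derive(Debug, PartialEq, Eq)]
enum RequesterOutcome {
    /// Blocked in "wait_for_bitfield" because the peer never sent one.
    StuckWaitingForBitfield,
    /// Reached the "nothing left to do, disconnecting peer" line.
    DisconnectedNothingToDo,
    /// The server still needs pieces, so it keeps requesting them.
    RequestingPieces,
}

/// Assumed shape of the old behaviour: the bitfield wait comes first,
/// and only afterwards is the "nothing left to do" check reached.
fn requester_outcome(server_has_full_torrent: bool, peer_sent_bitfield: bool) -> RequesterOutcome {
    if !peer_sent_bitfield {
        // Steps 2-3: a peer with no pieces sends no bitfield, so the
        // requester never gets past the wait and the bug stays hidden.
        return RequesterOutcome::StuckWaitingForBitfield;
    }
    if server_has_full_torrent {
        // Steps 7-8: after reconnecting, the peer does send a bitfield,
        // and a seeding server then drops it immediately.
        return RequesterOutcome::DisconnectedNothingToDo;
    }
    RequesterOutcome::RequestingPieces
}
```

Under these assumptions, a seeding server calling requester_outcome(true, false) models the first connection (stuck, so the peer survives), while requester_outcome(true, true) models the reconnect (the peer is disconnected as soon as its bitfield arrives).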

@ikatson ikatson marked this pull request as ready for review August 19, 2024 15:39
@ikatson ikatson merged commit e3ab7e2 into main Aug 19, 2024
5 checks passed