Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-deterministic hanging test in tendermint_p2p::secret_connection #841

Closed
thanethomson opened this issue Mar 25, 2021 · 1 comment · Fixed by #853
Closed

Non-deterministic hanging test in tendermint_p2p::secret_connection #841

thanethomson opened this issue Mar 25, 2021 · 1 comment · Fixed by #853
Assignees
Labels
bug Something isn't working p2p Peer-to-peer networking

Comments

@thanethomson
Copy link
Contributor

I was running some tests today (on master) and found one of the P2P tests hanging indefinitely:

fn test_read_write_single_message() {

Not sure if this is caused by a problem in the pipe crate or by the non-deterministic nature of who reads/writes first from/to the bidirectional buffer when spawning the write and read threads in the test.

cc @melekes @xla

Steps to reproduce

It's not guaranteed to reproduce the issue, but if I run the following around 5-10 times the test usually hangs at least once:

cargo test -p tendermint-p2p --all-features

What's the definition of "done" for this issue?

When we can reliably execute the test without it hanging.

@thanethomson thanethomson added bug Something isn't working p2p Peer-to-peer networking labels Mar 25, 2021
@melekes
Copy link
Contributor

melekes commented Mar 31, 2021

Sorry about this. I thought I could use pipe crate (a a synchronous memory pipe with buffered writer; https://docs.rs/pipe/0.4.0/pipe/) with our tests, but maybe it's not possible. Our implementation (specifically share_* functions

// TODO(ismail): on the go side this is done in parallel, here we do send and receive after
) rely on async transport.

melekes added a commit that referenced this issue Mar 31, 2021
Closes #841

I had to clone the original crate to introduce 4 additional methods:

1. async_pipe
2. async_pipe_buffered
3. async_bipipe
4. async_bipipe_buffered

The difference is they all use unbounded `crossbeam` channel rather then
`bounded(0)` which makes them async.

And we need pipe to be async because of how the handshake is written
currently.
@melekes melekes mentioned this issue Mar 31, 2021
5 tasks
@thanethomson thanethomson added this to the Seed Node milestone Apr 7, 2021
melekes added a commit that referenced this issue Apr 14, 2021
Closes #841

A copy of the `pipe` crate (https://github.com/arcnmx/pipe-rs) is added to the `tendermint-p2p` crate. I had to add a new method - `async_bipipe_buffered` to make the handshake tests always pass. `async_bipipe_buffered` method uses the `unbounded` channel (rather than the blocking one) for communication - hence the prefix. Note I also removed the methods and structures from `pipe` clone that we're not using at the moment.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working p2p Peer-to-peer networking
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants