Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unstable test tests::e2e::test_e2e_download #194

Closed
izderadicka opened this issue Aug 18, 2024 · 8 comments
Closed

Unstable test tests::e2e::test_e2e_download #194

izderadicka opened this issue Aug 18, 2024 · 8 comments

Comments

@izderadicka
Copy link
Contributor

This test seems to be unstable in some runs it stucks for minutes, then ended with this error:

thread 'tests::e2e::test_e2e_download' panicked at crates/librqbit/src/tests/e2e.rs:220:27:
called `Result::unwrap()` on an `Err` value: Elapsed(())

so it looks on timeout to finish torrent download ...

When running with logging enabled, looks like it stucks early in download:

2024-08-18T10:33:12.915377Z  INFO stats_printer: librqbit::tests::e2e: progress_percent="0.10%" peers="AggregatePeerStats { queued: 0, connecting: 0, live: 0, seen: 1, dead: 1, not_needed: 0, steals: 0 }"

Below is log from one failed test (failure rate seems to be something like 30% or so, so it was relatively easy to replicate)
test_log.txt.zip

@ikatson
Copy link
Owner

ikatson commented Aug 18, 2024

#195

@ikatson ikatson closed this as completed Aug 18, 2024
@ikatson
Copy link
Owner

ikatson commented Aug 18, 2024

The problem was in listen ports racing with each other and no "good" server being able to listen. Some servers are intentionally "bad" for chaos / recovery testing. As visible from your log there was only 1 out of 128 servers that was able to spin up

@ikatson ikatson reopened this Aug 18, 2024
@ikatson
Copy link
Owner

ikatson commented Aug 18, 2024

Hmm, no it still bugged out on continuous testing, looking again

@ikatson
Copy link
Owner

ikatson commented Aug 18, 2024

Merged another fix in #196, let's monitor

@ikatson
Copy link
Owner

ikatson commented Aug 18, 2024

Tried a few more times, it's working much better, however there's still a rare issue where it couldn't download to the end, but it wasn't exposed previously. Probably related to #149

@izderadicka
Copy link
Contributor Author

I've tested latest main, problem still appears, but less often, say around 10% of runs, behavior, while previously here were only 1 live connection, now there is plenty of connected peers, but they disconnect before torrent download is finished, see this extract from log:

2024-08-18T19:26:43.770000+0200  INFO stats_printer: librqbit::tests::e2e: progress_percent="89.30%" peers="AggregatePeerStats { queued: 0, connecting: 1, live: 27, seen: 128, dead: 100, not_needed: 0, steals: 144 }"                                    ┤
│2024-08-18T19:26:43.877000+0200  INFO stats_printer: librqbit::tests::e2e: progress_percent="89.30%" peers="AggregatePeerStats { queued: 1, connecting: 0, live: 27, seen: 128, dead: 100, not_needed: 0, steals: 144 }"                                    ┤
│2024-08-18T19:26:43.957000+0200  INFO stats_printer: librqbit::tests::e2e: progress_percent="89.30%" peers="AggregatePeerStats { queued: 0, connecting: 0, live: 18, seen: 128, dead: 110, not_needed: 0, steals: 145 }"                                    ┤
│2024-08-18T19:26:44.081000+0200  INFO stats_printer: librqbit::tests::e2e: progress_percent="89.30%" peers="AggregatePeerStats { queued: 0, connecting: 0, live: 15, seen: 128, dead: 113, not_needed: 0, steals: 146 }"                                    ┤
│2024-08-18T19:26:44.159000+0200  INFO stats_printer: librqbit::tests::e2e: progress_percent="89.30%" peers="AggregatePeerStats { queued: 0, connecting: 0, live: 11, seen: 128, dead: 117, not_needed: 0, steals: 147 }"                                    ┤
│2024-08-18T19:26:44.264000+0200  INFO stats_printer: librqbit::tests::e2e: progress_percent="89.30%" peers="AggregatePeerStats { queued: 0, connecting: 0, live: 12, seen: 128, dead: 116, not_needed: 0, steals: 147 }"                                    ┤
│2024-08-18T19:26:44.350000+0200  INFO stats_printer: librqbit::tests::e2e: progress_percent="89.30%" peers="AggregatePeerStats { queued: 0, connecting: 0, live: 4, seen: 128, dead: 124, not_needed: 0, steals: 148 }"                                     ┤
│2024-08-18T19:26:44.449000+0200  INFO stats_printer: librqbit::tests::e2e: progress_percent="89.30%" peers="AggregatePeerStats { queued: 0, connecting: 0, live: 3, seen: 128, dead: 125, not_needed: 0, steals: 148 }"                                     ┤
│2024-08-18T19:26:44.548000+0200  INFO stats_printer: librqbit::tests::e2e: progress_percent="89.30%" peers="AggregatePeerStats { queued: 0, connecting: 0, live: 0, seen: 128, dead: 128, not_needed: 0, steals: 148 }" 

Here is full log:

test_log2.txt.zip

Here is a script to replicate test failure:

#!/bin/bash

TEST_NO=$1

for i in {1..100}; do
echo -n Running test $i
START=$SECONDS
RUST_LOG=debug RUST_BACKTRACE=1 cargo test -- test_e2e_download --nocapture > /tmp/test_log${TEST_NO}.txt 2>&1
FAIL_CODE=$?
if [ $FAIL_CODE -ne 0 ]; then
    echo " - Failed in $(( SECONDS - START )) secs with code $FAIL_CODE"
    exit $FAIL_CODE
    fi
echo " - Finished in $(( SECONDS - START )) secs"
done

@ikatson
Copy link
Owner

ikatson commented Aug 18, 2024

Yeah that was the one I mentioned.
The script is almost the same one I used lol :)

I used this one to run until it fails

set -o pipefail
while RUST_LOG=debug cargo test test_e2e_download -- --nocapture 2>&1 | tee /tmp/test-log; do :; done

@ikatson
Copy link
Owner

ikatson commented Aug 19, 2024

I've merged a bunch of changes, including fixing this. It required changing some core logic, so new bugs might show up. But I think I nailed down why this particular issue was happening, described here in detail #198.

So closing this for now, feel free to reopen / open other ones if new bugs show up

@ikatson ikatson closed this as completed Aug 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants