[BUG] Scrape or some other routine won't exit and runs infinitely #791
Comments
I'm aware this happens but unsure of the cause. No time currently to figure it out. There also isn't a reliable way to recreate it in a small timeframe.
Thanks. In my case, I have a 100% repro. (Although it recently changed from "90 links" to "20 links" in the queue. But it's 20 links every time, and it's been like this for a week or so.) I can capture any sort of trace or other diagnostic info if that helps.
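For example, I could run the downloader under faulthandler so every thread's stack gets dumped periodically while it hangs. This is only a sketch of an approach: the wrapper script, log file name, and 10-minute interval are arbitrary choices of mine, not anything the tool ships with.

```python
# hang_wrapper.py - hypothetical wrapper around the downloader's entry point.
# Periodically dumps every thread's stack so the blocking call becomes visible.
import faulthandler

# Keep the file handle open for the lifetime of the process.
trace_log = open("hang_traces.log", "w")

# Dump all thread stacks to hang_traces.log every 10 minutes until exit.
faulthandler.dump_traceback_later(600, repeat=True, file=trace_log)

# ...then start the downloader as usual from here (e.g. import its main()
# and call it), or paste the two faulthandler lines near the top of your
# own launch script.
```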
Just tried it on a different thread, and it works as expected (i.e. the app eventually exits). So, it might be something wrong with this particular forum thread.
I think you've made some great progress there. With one of the recent updates, it no longer hangs, although it now exits prematurely with a 403 error from the forum. I suspect that might be where it hung previously. Which is weird, because there is a bunch of successful activity in between, and I can still access the forum in the browser. This only happens to me when downloading some very long threads. Anyhow, the original problem seems to be solved. Do you want me to open a new issue, or continue here to discuss how we could potentially address those intermittent 403's?
You are right about the ratelimiting; however, the hangs are still a problem. I have yet to find out what causes them, and I need to dedicate more time to it. I'll deal with the ratelimiting soon, no need to make an issue for it.
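For context, client-side throttling along these lines is one common way to avoid intermittent 403s from a rate-limited host. This is a minimal illustrative sketch only, not this project's code; the class name and one-second interval are placeholders.

```python
import asyncio
import time

class PerHostThrottle:
    """Keep successive requests to the same host at least
    `min_interval` seconds apart (illustrative only)."""

    def __init__(self, min_interval: float = 1.0):
        self.min_interval = min_interval
        self._next_free: dict[str, float] = {}
        self._lock = asyncio.Lock()

    async def wait(self, host: str) -> None:
        async with self._lock:
            now = time.monotonic()
            start = max(now, self._next_free.get(host, now))
            self._next_free[host] = start + self.min_interval
            delay = start - now
        if delay > 0:
            await asyncio.sleep(delay)

# Usage sketch: `await throttle.wait("simpcity.su")` right before each request.
```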
@baccccccc, are you running things on Windows?
Yes, Windows. Hold on, I think I might have figured it out. I noticed a small batch of repeated errors in the logs. What captured my attention is that those URLs never make it to the failed download log. I was under the impression that you could blocklist individual files in the config, but now I could not find that option. So, to try it out, I just blocked the entire domain. And guess what, the next download attempt finished successfully. No more hangs!
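To spell out what I mean by blocking the entire domain (a generic illustration only; I don't know the exact config option, and the host below is a placeholder, not the real one): queued URLs whose host matches the blocklist get dropped before they ever reach the downloader.

```python
from urllib.parse import urlparse

# Hypothetical blocklist; replace with the domain you want to skip.
BLOCKED_HOSTS = {"blocked-image-host.example"}

def should_skip(url: str) -> bool:
    """Return True if the URL's host (or any subdomain of it) is blocked."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in BLOCKED_HOSTS)

# e.g. queue = [u for u in queue if not should_skip(u)]
```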
I'm unsure if it's as simple as one unhandled exception, but I'll try it once I figure out why the program won't exit for me.
It must be it. While I had the domain blocked, a few other URLs still failed. Each was retried a number of times, as configured, and more importantly, three corresponding records made it to the failed download log. Subsequently, it all finished with the following stats.
The whole pass, from start to finish, took roughly 12 minutes. Now that I have unblocked the domain again and restarted the download, it hangs at this point.
It's been 20 minutes since that record, and no visible progress has been made. There are only two error records about a couple of URLs, and then one more entry with no further info (e.g. no errors or successes for that URL). But there's nothing mentioning the domain I had blocked.
So @baccccccc, you are correct that URLs failing in that manner aren't added to the failed download log file; however, they aren't hanging up the program. I'm unsure why it started working for you after blocking that domain. I'm wondering if some failure types are leaving client sessions open or something of that nature. I'm going to try some of the above threads with my hyper-logged version and see if it can give me any insights.
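To illustrate what I mean by client sessions being left open (just a sketch of the failure mode, not this project's actual code): if a request raises and the session is only closed on the happy path, the open connector can keep the process from winding down cleanly, whereas an async context manager (or try/finally) guarantees cleanup even when the download fails.

```python
import aiohttp

async def fetch_leaky(url: str) -> bytes:
    session = aiohttp.ClientSession()
    resp = await session.get(url)   # if this raises (403, timeout, ...)
    data = await resp.read()
    await session.close()           # ...this cleanup is never reached
    return data

async def fetch_safe(url: str) -> bytes:
    async with aiohttp.ClientSession() as session:  # always closed
        async with session.get(url) as resp:
            return await resp.read()
```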
To make it more fun, none of the links @nuvibes suggested above freeze for me. At least they don't actually freeze. There are some bunkr files that fail out on retries, but that's about it for waits.
@baccccccc I figured it out. Roundabout way to get there, though. I should have an update coming later today for you to test.
5.1.73 just went up; that should solve this. If it doesn't, feel free to reopen the issue.
Thanks! Running it right now. Is it expected that it's now constantly flickering around the top of the window, saying something about file locks?
Damn. No, I'll sort that. |
5.1.74 should solve that. |
OK, looks like the hangs are indeed fixed, yay! Now I just gotta rant about DDoS-Guard on bunkr.
I'm trying to download
https://simpcity.su/threads/cj-miles.7954
for quite some time. The command line I'm typically using is:
--output-folder <bla> --log-folder <bla> --ignore-history --download https://simpcity.su/threads/cj-miles.7954
Note 1: the same syntax worked flawlessly with other forum threads before.
Note 2: I've tried removing --ignore-history, and the outcome is slightly different, but the main problem seems to be the same.

Basically, the program reaches approximately 2095 files (sometimes a bit fewer due to transient errors that don't seem to impact anything in the grand scheme of things) and then hangs indefinitely. It's still responsive, i.e. it keeps flickering, it resizes (if I resize the console window), and it responds to Ctrl+C. But there are seemingly no ongoing downloads, the scrape won't finish (even if I leave it like this for a day or longer), there are no new entries in downloader.log, and the file counter won't increase anymore.

Now here's a minor difference. If I omit --ignore-history, then the scraping part of the screen becomes completely empty. If I include --ignore-history, then the scraping part always shows "... and 90 links in Scrape queue". Of course, there are other (meaningful) messages before that. But once it reaches "... and 90 links", I know it has entered this "hang" state, and it stays like this forever; nothing changes anymore.

I've retried multiple times (probably a couple dozen times in the last two weeks or so), updating the program before each run. The results are very consistent.
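If it would help, I could also run a small watchdog like the sketch below alongside the downloader (purely illustrative, not part of the tool; the interval is arbitrary) to log which asyncio tasks are still pending once progress stalls.

```python
import asyncio

async def task_watchdog(interval: float = 300.0) -> None:
    """Every `interval` seconds, list the asyncio tasks that are still pending."""
    while True:
        await asyncio.sleep(interval)
        pending = [t for t in asyncio.all_tasks() if not t.done()]
        print(f"{len(pending)} pending tasks:")
        for task in pending:
            print(f"  {task.get_name()}: {task.get_coro()}")
```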
Thanks in advance!