Several new ZIMs appearing in the zimit directory on download.kiwix.org are too small #168
Comments
NB
@RavanJAltaie @Popolechien We should:
@Jaifroid Thank you for the bug report!
I've checked the files below in the library and they were working properly: 1. https://master.download.kiwix.org/zim/zimit/coopmaths_2023-01.zim
I investigated https://farm.openzim.org/recipes/raspberrypi_docs
Thanks @RavanJAltaie for testing those properly. I had only managed to test a few, and the others were guesses based on anomalous size. Good to have it confirmed. Strange that some recipes that used to work fine are now broken.
Deleted the ZIMs
So from the discussion I understand that the only failing ZIM that used to work is
This would be a crawler bug. Reporting upstream
It would be interesting to know whether early completion is also the cause for the other ZIMs that only seemed to scrape a single page or a very small part of the crawl. I.e., is it the same issue, or a different one?
Neither footballdatabase nor blockygames had errors.
OK, so maybe we could put it down to some problematic recipes for new, untested sites, and a fluke early completion for a known site? The January ready.gov (Spanish) seems to have worked fine, plus coopmaths and raspberry pi. Feel free to close the issue and we can re-open if other known good sites fail.
Here is a list of suspiciously small ZIMs in descending order of date (ignore `www.ready.gov`, which seems OK, and `liberius`). Most small ones I've tested have ended up being just one-page scrapes. Note that the previous `bibnum_fr_all` was 553 MB, not 352 KB!
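
For what it's worth, the "anomalous size" check described in this issue could be automated along these lines. This is only a rough sketch, not anything Zimit or the Zimfarm actually does: the `ZimFile` type, the `find_suspicious` helper, and the 10% threshold are all hypothetical. The build periods in the sample data, and the `coopmaths` sizes, are made up for illustration; only the 553 MB and 352 KB figures for `bibnum_fr_all` come from the report.

```python
from typing import NamedTuple


class ZimFile(NamedTuple):
    name: str        # recipe name, e.g. "bibnum_fr_all"
    period: str      # build period as "YYYY-MM" (sorts correctly as text)
    size_bytes: int  # file size in bytes


def find_suspicious(files, min_ratio=0.1):
    """Return builds smaller than min_ratio times the previous build of the same recipe.

    min_ratio is an arbitrary, hypothetical threshold; tune it to taste.
    """
    by_name = {}
    for f in files:
        by_name.setdefault(f.name, []).append(f)

    flagged = []
    for builds in by_name.values():
        builds.sort(key=lambda f: f.period)
        # Compare each build with the one immediately before it.
        for prev, cur in zip(builds, builds[1:]):
            if cur.size_bytes < prev.size_bytes * min_ratio:
                flagged.append(cur)
    return flagged


# Illustrative data: periods and the coopmaths sizes are invented.
sample = [
    ZimFile("bibnum_fr_all", "2022-12", 553_000_000),
    ZimFile("bibnum_fr_all", "2023-01", 352_000),
    ZimFile("coopmaths", "2022-12", 100_000_000),
    ZimFile("coopmaths", "2023-01", 98_000_000),
]

for zim in find_suspicious(sample):
    print(f"{zim.name}_{zim.period}.zim looks too small: {zim.size_bytes} bytes")
```

A relative check like this only catches recipes that have a previous good build. New, untested recipes (which seem to account for most of the one-page scrapes here) would still need a manual look or an absolute minimum-size threshold.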