Why thecodelesscode.com is not retrieving /topics and /contents and /names links on homepage #355

Open
benoit74 opened this issue Jul 23, 2024 · 0 comments
Labels
bug question scraping_issue Issue occured while using the scraper
Comments

@benoit74
Collaborator

Recipe: https://farm.openzim.org/recipes/thecodelesscode.com_en_all

Bug: the crawler does not find the /topics, /contents, and /names links on the homepage (and possibly others).

Tests done:

  • passing a mobile device in landscape (Pixel 2 landscape)
  • passing a bigger mobile device in landscape (Pixel 5 landscape)
  • passing no mobile device

All without success: in every case, only 5 links to various cases are found on the homepage.

Command used in first tests:

docker run -v $PWD/output:/output --name crawlme --rm  webrecorder/browsertrix-crawler:1.2.4 crawl --failOnFailedSeed --behaviors "autoplay,autofetch,autoscroll" --url "http://thecodelesscode.com" --mobileDevice "Pixel 2" --cwd /output --combineWARC --depth 1
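
One way to narrow this down is to check whether the missing links appear as plain `<a href>` elements in the HTML the server returns, before any JavaScript runs. The sketch below is only a diagnostic aid, not part of the crawler: `LinkCollector` and `missing_paths` are hypothetical helper names, and the sample markup is invented for illustration rather than taken from the real homepage. The cause of the missed links has not been established in this issue.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect href values of <a> tags found in static HTML."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Only <a> tags are considered; other elements carrying an href
        # (for example <area>) are deliberately ignored here.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def missing_paths(html, base_url, expected_paths):
    """Return the expected paths that no <a href> in `html` resolves to."""
    collector = LinkCollector()
    collector.feed(html)
    found = {
        urlparse(urljoin(base_url, href)).path.rstrip("/")
        for href in collector.hrefs
    }
    return [p for p in expected_paths if p.rstrip("/") not in found]


# Invented sample markup (an assumption, not the real homepage).
sample_html = (
    '<a href="/topics">Topics</a>'
    '<a href="case/1">Case 1</a>'
    '<map><area href="/names"></map>'
)
result = missing_paths(
    sample_html,
    "http://thecodelesscode.com",
    ["/topics", "/contents", "/names"],
)
print(result)
```

To run this against the live site, the homepage HTML could be fetched with `urllib.request.urlopen` and passed to `missing_paths`. Any path it reports as missing is not a plain anchor in the served HTML, which would point at how the page exposes those links rather than at the device setting.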
@benoit74 benoit74 added the scraping_issue Issue occured while using the scraper label Mar 10, 2025
@benoit74 benoit74 added this to the later milestone Mar 10, 2025