The Contest BotScraper is a web crawler that crawls Brazilian contest websites and extracts structured data from their pages.
See the BotScraper homepage for more information, including a list of features.
- Python 3.7+
- Tested on Linux (Ubuntu).
The quick way:
$ git clone https://github.com/jefersonjlima/botscraper.git
$ cd botscraper
$ echo 'export TELEGRAM_TOKEN="YOUR_ACCESS_TOKEN"' >> ~/.bashrc
$ make prepare-dev
$ make run
To use BotScraper with Telegram, you first need an access token. The easiest way to generate a Telegram token is to talk to @BotFather and follow a few simple steps.
Edit config/configs.cfg to change the configuration.
For example, to add or remove a keyword, change GERAL.keywords (you can also do this with the Telegram commands). If you are looking for an internship, change url_base to https://www.pciconcursos.com.br/estagios/.
In TELEGRAM.etl_schedule, define the UTC time at which the ETL starts.
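As a rough illustration of how these settings might be read, here is a minimal sketch using Python's standard configparser module. It assumes the file uses INI syntax, that url_base lives in the GERAL section, that keywords are comma-separated, and that etl_schedule is an HH:MM string. None of these details are confirmed by this README, and the sample values are hypothetical.

```python
import configparser

# Hypothetical contents of config/configs.cfg; the real file may differ.
SAMPLE_CFG = """
[GERAL]
keywords = engenheiro, professor
url_base = https://www.pciconcursos.com.br/estagios/

[TELEGRAM]
etl_schedule = 12:00
"""


def load_settings(text):
    """Parse INI text and return the settings described above as a dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # Assumption: keywords are stored as a comma-separated list.
    raw_keywords = parser.get("GERAL", "keywords")
    keywords = [k.strip() for k in raw_keywords.split(",") if k.strip()]
    return {
        "keywords": keywords,
        # Assumption: url_base sits in the GERAL section.
        "url_base": parser.get("GERAL", "url_base"),
        # Assumption: the schedule is an HH:MM string interpreted as UTC.
        "etl_schedule": parser.get("TELEGRAM", "etl_schedule"),
    }


settings = load_settings(SAMPLE_CFG)
print(settings)
```

To read the real file instead of the sample string, parser.read("config/configs.cfg") could replace read_string, provided the file does follow INI syntax.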
- /help to see all commands.
- /set <keyword> to add a new keyword.
- /unset <keyword> to remove a keyword.
- /notify to enable notifications.
- /non_notify to disable notifications.
- /keywords to show all keywords.
- /show <keyword> to list contests by keyword.
- /show to show all filtered contests.
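The keyword commands above can be pictured with a small pure-Python sketch. This is not the bot's actual handler code and it does not use any Telegram library. The function names parse_command and apply_command are hypothetical, chosen only to show how a message such as "/set <keyword>" could update a keyword list.

```python
def parse_command(message):
    """Split a chat message such as '/set engenheiro' into (command, argument)."""
    text = message.strip()
    if not text.startswith("/"):
        # Not a command at all.
        return None, None
    name, _, arg = text[1:].partition(" ")
    return name, arg.strip() or None


def apply_command(keywords, message):
    """Return an updated copy of the keyword list for /set and /unset.

    Any other command leaves the list unchanged.
    """
    name, arg = parse_command(message)
    updated = list(keywords)
    if name == "set" and arg and arg not in updated:
        updated.append(arg)
    elif name == "unset" and arg in updated:
        updated.remove(arg)
    return updated


keywords = apply_command([], "/set professor")
keywords = apply_command(keywords, "/set engenheiro")
keywords = apply_command(keywords, "/unset professor")
print(keywords)
```

Returning a new list instead of mutating the input keeps the sketch easy to test. A real bot would also need to persist the change, for example back to GERAL.keywords.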
Documentation is available online at BotScraper and in the docs directory.