A fully automatic IP proxy pool with IP location visualization.
An automatic IP pool filtering system for parallel web crawlers.
This project provides a web API for an automatic proxy pool. It automatically crawls IP proxies from xici.com, 66ip.cn, kuaidaili.com and several open-source IP proxy projects on GitHub. Once you install the project on your Linux server, you can get a dynamic proxy from port 5555 of that server.
Randomly select an available IP from the database.
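As a rough illustration of how such a random selection could work, the sketch below uses the score settings from config.py (`MAX_SCORE`, `MIN_SCORE`, `INITIAL_SCORE`, `RANDOM_INTERVAL`). The `InMemoryPool` class and its methods are hypothetical and replace the Redis store with a plain dictionary. The scoring rules (promote to `MAX_SCORE` on success, subtract one point on failure, pick from scores inside `RANDOM_INTERVAL`) are assumptions about the usual design of this kind of pool, not a copy of the project's code.

```python
import random

# Values mirror the project's config.py.
MAX_SCORE = 100
MIN_SCORE = 0
INITIAL_SCORE = 10
RANDOM_INTERVAL = [90, 100]


class InMemoryPool:
    """Hypothetical stand-in for the Redis-backed pool (proxy -> score)."""

    def __init__(self):
        self.scores = {}

    def add(self, proxy, score=INITIAL_SCORE):
        # Assumption: a newly crawled proxy starts at INITIAL_SCORE,
        # and a proxy that is already stored keeps its current score.
        self.scores.setdefault(proxy, score)

    def mark_ok(self, proxy):
        # Assumption: a proxy that passes the test is promoted to MAX_SCORE.
        self.scores[proxy] = MAX_SCORE

    def mark_failed(self, proxy):
        # Assumption: each failed test costs one point, and the proxy is
        # removed once its score reaches MIN_SCORE.
        score = self.scores.get(proxy, MIN_SCORE) - 1
        if score <= MIN_SCORE:
            self.scores.pop(proxy, None)
        else:
            self.scores[proxy] = score

    def pick_random(self):
        # Assumption: RANDOM_INTERVAL is the inclusive score range from
        # which a proxy is drawn at random.
        low, high = RANDOM_INTERVAL
        candidates = [p for p, s in self.scores.items() if low <= s <= high]
        if not candidates:
            raise LookupError("no proxy with a score inside RANDOM_INTERVAL")
        return random.choice(candidates)
```

Under these assumptions only proxies that have recently passed a test are handed out, while freshly crawled ones have to prove themselves first.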
Find the accurate location of an IP.
- Clone this project
git clone https://github.com/KylinC/SmartProxyPool.git
- Install Redis and the other dependencies (server)
# install redis
sudo apt-get install redis-server
# install packages
cd SmartProxy
pip install -r requirements.txt
- Change the Redis password to your own (SmartProxy/SmartProxy/config.py)
# Scheduler Switch
TESTER_ENABLED = True
GETTER_ENABLED = True
API_ENABLED = True
# Check Loop
TESTER_CYCLE = 20
# CRAWLER Loop
GETTER_CYCLE = 300
# API Configuration
API_HOST = '127.0.0.1'
API_PORT = 5000 ##### Flask Configuration
# Target website
TEST_URL = 'http://www.baidu.com' ##### Target website you want to crawl
# Redis database location
REDIS_HOST = '127.0.0.1' ###### Supports a remote Redis database
# Redis port
REDIS_PORT = 6379
# Redis password, default = None
REDIS_PASSWORD = "YourPassword" ##### Change to Your Redis Password
REDIS_KEY = 'proxies'
# initial scoring rules for selection
MAX_SCORE = 100
MIN_SCORE = 0
INITIAL_SCORE = 10
# selection parameter
RANDOM_INTERVAL = [90, 100]
# maximum number of entries kept in Redis
POOL_UPPER_THRESHOLD = 50000
# accepted response codes
VALID_STATUS_CODES = [200, 302]
# IP test batch size
BATCH_TEST_SIZE = 10
# Baidu Map AK
AK = "YourAK" #### Change to your Baidu Map API AK
- Start your Redis server (refer to SmartProxy/instructions)
- Run it on your server
python run.py
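To make the tester settings in config.py more concrete, here is a minimal sketch of how `TEST_URL`, `VALID_STATUS_CODES` and `BATCH_TEST_SIZE` could drive a proxy check. The helper names `batches` and `check_proxy` are hypothetical, and the standard-library `urllib` is used here only as a stand-in; the project's own tester may be implemented differently.

```python
import http.client
import urllib.request

# Values mirror the project's config.py.
TEST_URL = "http://www.baidu.com"
VALID_STATUS_CODES = [200, 302]
BATCH_TEST_SIZE = 10


def batches(items, size=BATCH_TEST_SIZE):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def check_proxy(proxy, url=TEST_URL, timeout=10):
    """Return True if `url` answers with an accepted status through `proxy`.

    `proxy` is assumed to be a 'host:port' string.
    """
    handler = urllib.request.ProxyHandler({"http": "http://" + proxy})
    opener = urllib.request.build_opener(handler)
    try:
        # urllib follows redirects by default, so the status seen here is
        # the one of the final response.
        with opener.open(url, timeout=timeout) as response:
            return response.status in VALID_STATUS_CODES
    except (OSError, http.client.HTTPException):
        # Connection errors, timeouts and HTTP error statuses all count
        # as a failed test.
        return False
```

A tester loop could then walk over `batches(all_proxies)` every `TESTER_CYCLE` and call `check_proxy` on each entry of a batch.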
View it in a browser (Google Chrome recommended) at http://YourServerIP:YourPort/
Access it from a web spider program at http://YourServerIP:YourPort/random
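A spider could consume the `/random` endpoint roughly as follows. This is a sketch under two assumptions: that the endpoint returns a single proxy as plain text in `host:port` form, and that the proxy speaks plain HTTP. The function names `random_endpoint`, `fetch_proxy` and `as_proxy_map` are hypothetical helpers, not part of the project.

```python
import urllib.request


def random_endpoint(server, port):
    """Build the URL of the pool's /random endpoint."""
    return "http://{}:{}/random".format(server, port)


def fetch_proxy(server, port, timeout=10):
    """Ask the pool for one proxy.

    Assumes the response body is plain text of the form 'host:port'.
    """
    url = random_endpoint(server, port)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8").strip()


def as_proxy_map(proxy):
    """Turn 'host:port' into a scheme-to-proxy mapping.

    The same shape is accepted by urllib.request.ProxyHandler and by the
    `proxies` argument of the requests library.
    """
    return {"http": "http://" + proxy, "https": "http://" + proxy}
```

In a spider, `as_proxy_map(fetch_proxy("YourServerIP", YourPort))` can be passed to `urllib.request.ProxyHandler` before building an opener, so that each request goes out through a freshly selected proxy.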
Based on Python3WebSpider by Qingcai Cui.