A Python-based tool for fetching, storing, and managing Wikipedia articles locally. This project provides both CLI and web interfaces for interacting with Wikipedia content, with features for caching, tracking changes, and managing the local article database.
- Fetch and cache Wikipedia articles locally
- Track article update history and access patterns
- Convert Wikipedia markup to Markdown
- Web interface for browsing cached articles
- CLI tools for database management
- Backup and restore functionality
- Automatic article refresh for outdated content
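To make the markup conversion feature above more concrete, here is a minimal sketch that handles only a small subset of wiki markup (headings, bold, italics, and internal links). The function name `wikitext_to_markdown` and the regular expressions are illustrative assumptions; the project's own converter may work quite differently.

```python
import re


def wikitext_to_markdown(text: str) -> str:
    """Convert a small subset of wiki markup to Markdown (illustrative only)."""

    # Headings: "== Title ==" becomes "## Title"; the number of '=' sets the level.
    def heading(match: re.Match) -> str:
        level = len(match.group(1))
        return "#" * level + " " + match.group(2).strip()

    text = re.sub(r"^(={1,6})\s*(.+?)\s*\1\s*$", heading, text, flags=re.MULTILINE)

    # Bold must be handled before italics, because ''' contains ''.
    text = re.sub(r"'''(.+?)'''", r"**\1**", text)
    text = re.sub(r"''(.+?)''", r"*\1*", text)

    # Piped links [[Target|label]] keep the label; plain links [[Target]] keep the target.
    text = re.sub(r"\[\[(?:[^\]|]*\|)?([^\]]+)\]\]", r"\1", text)
    return text


if __name__ == "__main__":
    sample = "== History ==\n'''Santa Fe''' is the ''capital'' of [[New Mexico|the state]]."
    print(wikitext_to_markdown(sample))
```

Templates, tables, and references are deliberately left out of this sketch; real wiki markup needs a proper parser for those.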
- Clone the repository:
git clone <repository-url>
cd wiki-tools
- Install dependencies using uv (recommended) or pip:
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
uv pip install -r requirements.txt
- Initialize the database:
python cli.py db-upgrade
# Fetch a single article
python cli.py get-wiki-entry "Article Title"
# Fetch an article and its related articles
python cli.py get-wiki-related "Article Title"
# List all stored articles
python cli.py list-entries
# View action logs
python cli.py show-logs
python cli.py show-logs --title "Article Title" --format detailed
# Refresh outdated articles
python cli.py refresh-all
python cli.py refresh-all --force # Refresh all regardless of age
# Backup database
python cli.py db-dump --output-dir my_backups
# Restore from backup
python cli.py db-restore \
--entries-file my_backups/wiki_entries_20240220_123456.json \
--logs-file my_backups/wiki_entry_logs_20240220_123456.json
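The difference between `refresh-all` and `refresh-all --force` can be pictured as an age check that the flag bypasses. The sketch below is an assumption about how such a check might look: the `needs_refresh` helper and the seven-day `MAX_AGE` threshold are hypothetical, and the real cutoff used by the tool is not stated here.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold; the project's actual cutoff is not documented here.
MAX_AGE = timedelta(days=7)


def needs_refresh(
    last_fetched: datetime,
    now: datetime,
    force: bool = False,
    max_age: timedelta = MAX_AGE,
) -> bool:
    """Return True when a cached article should be fetched again."""
    if force:
        # Mirrors the idea of --force: refresh regardless of age.
        return True
    return now - last_fetched > max_age


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(needs_refresh(now - timedelta(days=30), now))
    print(needs_refresh(now - timedelta(hours=1), now, force=True))
```

Passing `now` explicitly keeps the check deterministic, which makes it easy to test.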
- Start the web server:
uvicorn main:app --reload
- Open http://localhost:8000 in your browser
The web interface provides:
- List of cached articles
- Article viewing with Markdown rendering
- Article fetching interface
- Last modified timestamps
The project uses SQLite with SQLAlchemy and Alembic for database management.
# Apply migrations
python cli.py db-upgrade
# Revert last migration
python cli.py db-downgrade
# Create a backup
python cli.py db-dump
# Restore from backup
python cli.py db-restore --entries-file <path> --logs-file <path>
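The backup file names used with `db-restore` (for example `wiki_entries_20240220_123456.json`) suggest a prefix followed by a `YYYYMMDD_HHMMSS` timestamp. The following sketch shows one way such a JSON dump and restore could work. The helpers `backup_filename`, `dump_rows`, and `load_rows` are hypothetical and are not taken from the project's code.

```python
import json
from datetime import datetime
from pathlib import Path


def backup_filename(prefix: str, now: datetime) -> str:
    """Build a name like wiki_entries_20240220_123456.json."""
    return f"{prefix}_{now:%Y%m%d_%H%M%S}.json"


def dump_rows(rows: list, output_dir: str, prefix: str, now: datetime = None) -> Path:
    """Write rows (a list of plain dicts) to a timestamped JSON file."""
    now = now or datetime.now()
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    path = out / backup_filename(prefix, now)
    # default=str turns values such as datetimes into strings.
    path.write_text(json.dumps(rows, indent=2, default=str), encoding="utf-8")
    return path


def load_rows(path) -> list:
    """Read rows back from a JSON backup file."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


if __name__ == "__main__":
    saved = dump_rows([{"title": "Santa Fe"}], "my_backups", "wiki_entries")
    print(saved, load_rows(saved))
```

Keeping entries and logs in separate files, as the `--entries-file` and `--logs-file` options do, lets each table be restored independently.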
wiki-tools/
├── alembic/ # Database migration scripts
├── web/ # Web interface files
│ ├── static/ # Static assets
│ └── templates/ # HTML templates
├── wiki_tools/ # Core package
│ ├── models.py # Database models
│ ├── database.py # Database configuration
│ └── lib.py # Core functionality
├── cli.py # Command line interface
├── main.py # FastAPI web application
└── config.py # Configuration settings
Configuration is managed through environment variables or a .env file:
- DATABASE_URL: SQLite database path (default: sqlite:///./newmexico.db)
- WIKIPEDIA_BASE_URL: Wikipedia API endpoint
- ITEMS_PER_USER: Pagination limit
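As a rough illustration of how these settings could be read, the sketch below loads them from environment variables with fallbacks. Only the DATABASE_URL default comes from this document. The `Settings` class, the `load_settings` helper, and the fallback values for WIKIPEDIA_BASE_URL and ITEMS_PER_USER are assumptions, and .env file parsing is not shown.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str
    wikipedia_base_url: str
    items_per_user: int


def load_settings(env=None) -> Settings:
    """Read settings from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    return Settings(
        # Default documented above.
        database_url=env.get("DATABASE_URL", "sqlite:///./newmexico.db"),
        # Assumed fallback: the English Wikipedia API endpoint.
        wikipedia_base_url=env.get(
            "WIKIPEDIA_BASE_URL", "https://en.wikipedia.org/w/api.php"
        ),
        # Assumed fallback; the project's real default is not stated.
        items_per_user=int(env.get("ITEMS_PER_USER", "20")),
    )


if __name__ == "__main__":
    print(load_settings())
```

Accepting a plain mapping instead of always reading `os.environ` makes the loader easy to exercise in tests without touching the real environment.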
- Create a virtual environment:
uv venv
source .venv/bin/activate
- Install development dependencies:
uv pip install -e ".[dev]"
- Run tests:
pytest
[Add your license information here]
[Add contribution guidelines here]