This repository has been archived by the owner on Feb 29, 2024. It is now read-only.

Web Scraper for Midwifery Today Conferences

This project contains a Python-based web scraper that extracts data about past conferences from Midwifery Today. It collects the place, title, and time of each conference and saves them to a CSV file.

Features

  • Uses Selenium to navigate to the "Past Conferences" page.
  • Extracts conference data using BeautifulSoup.
  • Filters out irrelevant data and entries.
  • Saves the extracted data to a CSV file.
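
The last two features (filtering and saving) can be sketched with the standard library alone. This is a minimal illustration, not the project's actual code: the column names in `FIELDS`, the helper names `filter_conferences` and `save_conferences`, and the rule "drop any row with an empty field" are all assumptions made for the example.

```python
import csv

# Assumed column names, based on the fields this README mentions.
FIELDS = ["place", "title", "time"]


def filter_conferences(rows):
    """Keep only rows in which every expected field is a non-empty string."""
    kept = []
    for row in rows:
        # Missing or None values become empty strings, then whitespace is trimmed.
        cleaned = {field: (row.get(field) or "").strip() for field in FIELDS}
        if all(cleaned.values()):
            kept.append(cleaned)
    return kept


def save_conferences(rows, path):
    """Write the rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

Calling `save_conferences(filter_conferences(rows), path)` on a list of dictionaries would then write only the complete entries.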

Setting up a Virtual Environment

It's recommended to set up a virtual environment within the cloned project folder. This ensures that dependencies required by this project do not interfere with packages globally installed on your system.

Creating a Virtual Environment

You can set up a virtual environment using the following steps:

  1. First, clone the repository to your local machine:
    git clone [repository-link]
  2. Navigate to the cloned project directory:
     cd webscraper-midwife
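  3. Make sure Python's venv module is available. It is part of the Python 3 standard library, so it usually needs no separate installation. On Debian and Ubuntu, where it is packaged separately, you can install it using:
    sudo apt install python3-venv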
  4. Within the project directory, create the virtual environment in a folder named venv:
    python -m venv venv
  5. Activate the virtual environment:

    On Windows:

    .\venv\Scripts\activate

    On macOS and Linux:

    source venv/bin/activate
  6. Once activated, you'll see (venv) in the terminal prompt. This indicates that the virtual environment is active. Now, you can install the project dependencies:
    pip install -r requirements.txt
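
If the prompt does not show (venv), the state of the environment can also be checked from Python itself. The snippet below is a small sketch that relies on the documented behaviour of `sys.prefix` and `sys.base_prefix` for environments created with venv; the helper name `in_virtualenv` is made up for this example.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv environment."""
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```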

Deactivating the Virtual Environment

When you're done working on the project, deactivate the virtual environment by running:

deactivate

Requirements

  • Python 3
  • Selenium
  • BeautifulSoup
  • Requests

You can install the required packages using pip, as described in step 6 above:

pip install -r requirements.txt

Usage

With the virtual environment active, run the scraper:

python main.py

After execution, the extracted data will be saved as conferences.csv in the data directory.
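
To take a quick look at the result, the file can be read back with the standard `csv` module. This is only a sketch: it assumes the file starts with a header row, and the helper name `load_conferences` is hypothetical. The path comes from the description above.

```python
import csv
from pathlib import Path


def load_conferences(path):
    """Read the CSV file and return its rows as a list of dictionaries."""
    # Assumes the first line of the file is a header row.
    with open(path, newline="", encoding="utf-8") as fh:
        return list(csv.DictReader(fh))


# Path taken from the README: conferences.csv inside the data directory.
csv_path = Path("data") / "conferences.csv"
if csv_path.exists():
    conferences = load_conferences(csv_path)
    print(f"{len(conferences)} conferences loaded")
```

The existence check keeps the snippet from failing when the scraper has not been run yet.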

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

License

MIT