This repository has been archived by the owner on Feb 29, 2024. It is now read-only.

Web scraper for extracting past conference details from Midwifery Today. Outputs scraped data to a CSV file.


SQZ0111/midwife-past-conferences-scraper

Web Scraper for Midwifery Today Conferences

This project contains a Python-based web scraper that extracts data about past conferences from Midwifery Today. It collects the place, title, and time of each conference and saves them to a CSV file.

Features

  • Uses Selenium to navigate to the "Past Conferences" page.
  • Extracts conference data using BeautifulSoup.
  • Filters out irrelevant data and entries.
  • Saves the extracted data to a CSV file.
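
The features above could fit together roughly as follows. This is a minimal sketch under stated assumptions, not the repository's actual main.py: the function names, the "Past Conferences" link lookup, the CSS selectors, and the filtering rule (drop incomplete and duplicate entries) are all hypothetical, and the start URL is left as a parameter. Selenium and BeautifulSoup are imported inside the functions that need them so the filtering and CSV steps can be tried on their own.

```python
import csv
from pathlib import Path

# Column names taken from the project description (place, title, time).
FIELDS = ["place", "title", "time"]


def fetch_page_source(start_url, link_text="Past Conferences"):
    """Open start_url in a browser, follow the named link, return the HTML."""
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()  # assumption: Chrome is installed locally
    try:
        driver.get(start_url)
        driver.find_element(By.LINK_TEXT, link_text).click()
        return driver.page_source
    finally:
        driver.quit()  # always close the browser, even on errors


def parse_conferences(html, row_selector="div.conference"):
    """Turn the page HTML into a list of dicts, one per conference."""
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for node in soup.select(row_selector):
        row = {}
        for field in FIELDS:
            # Hypothetical markup: one child element per field, by class name.
            child = node.select_one("." + field)
            row[field] = child.get_text(strip=True) if child else ""
        rows.append(row)
    return rows


def filter_rows(rows):
    """Drop entries with a missing field and exact duplicates."""
    seen, kept = set(), []
    for row in rows:
        key = tuple((row.get(field) or "").strip() for field in FIELDS)
        if all(key) and key not in seen:
            seen.add(key)
            kept.append(dict(zip(FIELDS, key)))
    return kept


def save_csv(rows, path):
    """Write the rows to path with a header line, creating parent folders."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

Chaining the four functions, for example `save_csv(filter_rows(parse_conferences(fetch_page_source(url))), "data/conferences.csv")`, would reproduce the overall flow; the selectors would need to be adjusted to the real page markup.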

Setting up a Virtual Environment

It's recommended to set up a virtual environment within the cloned project folder. This ensures that dependencies required by this project do not interfere with packages globally installed on your system.

Creating a Virtual Environment

You can set up a virtual environment using the following steps:

  1. First, clone the repository to your local machine:
    git clone [repository-link]
  2. Navigate to the cloned project directory:
    cd midwife-past-conferences-scraper
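  3. Python 3's venv module ships with the standard library, so no separate install is normally needed. On some Linux distributions (for example Debian and Ubuntu) it is packaged separately as python3-venv.
  4. Within the project directory, create a virtual environment named venv:
    python -m venv venv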
  5. Activate the virtual environment:

    On Windows:

    .\venv\Scripts\activate

    On macOS and Linux:

    source venv/bin/activate
  6. Once activated, you'll see (venv) in the terminal prompt. This indicates that the virtual environment is active. Now, you can install the project dependencies:
    pip install -r requirements.txt

Deactivating the Virtual Environment

When you're done working on the project, you can deactivate the virtual environment by simply typing:

deactivate

Requirements

  • Python 3
  • Selenium
  • BeautifulSoup
  • Requests

You can install the required packages with pip, as in step 6 of the setup above:

pip install -r requirements.txt

Usage

python main.py

After execution, the extracted data will be saved as conferences.csv in the data directory.
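
To inspect the result from Python, something like the following could be used. It is a sketch that assumes the CSV starts with a header row; the helper name `load_conferences` is hypothetical and not part of the project.

```python
import csv
from pathlib import Path


def load_conferences(path="data/conferences.csv"):
    """Read the scraper's CSV output into a list of dicts keyed by header."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

Each returned item maps the column names found in the header to that row's values, so the exact keys depend on what main.py writes.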

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

License

MIT
