This project analyzes customer reviews of British Airways from the website "Airline Quality". It uses web scraping to extract the review text, then preprocesses the data through text cleaning, tokenization, part-of-speech tagging, stopword removal, and lemmatization. Finally, it runs sentiment analysis with VADER (Valence Aware Dictionary and sEntiment Reasoner) and visualizes the results as pie charts and word clouds.
Ensure you have Python installed along with the necessary libraries: requests, BeautifulSoup, pandas, nltk, matplotlib, wordcloud, and vaderSentiment.
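A quick way to confirm that the libraries are importable before running the script is sketched below. The import names are assumptions about how the script loads each library (BeautifulSoup, for instance, is imported from the bs4 package), and find_missing is a hypothetical helper, not part of this project.

```python
import importlib.util

# Import names for the libraries listed above (assumed; BeautifulSoup
# ships in the "bs4" package).
REQUIRED_MODULES = [
    "requests", "bs4", "pandas", "nltk",
    "matplotlib", "wordcloud", "vaderSentiment",
]


def find_missing(modules):
    """Return the top-level module names that cannot be found."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All required libraries are installed.")
```

Any name reported as missing can then be installed with pip before running the script.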
- Run the script to scrape the desired number of pages of reviews from the British Airways section on "Airline Quality".
- Preprocess the data including cleaning, tokenization, POS tagging, and lemmatization.
- Conduct sentiment analysis using the VADER sentiment analysis tool.
- Visualize the sentiment analysis results using pie charts and word clouds.
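As a rough illustration of the cleaning and sentiment steps above, here is a minimal sketch. It is not the project's actual code: the function names are hypothetical, the letters-only cleaning rule is an assumption, and the ±0.05 cut-offs are the conventional thresholds from the vaderSentiment documentation, which the script may or may not use. Tokenization, POS tagging, stopword removal, and lemmatization are left out because they need NLTK data downloads.

```python
import re


def clean_text(text):
    """Keep letters only and collapse everything else into single spaces."""
    return re.sub(r"[^A-Za-z]+", " ", text).strip()


def label_from_compound(compound):
    """Map a VADER compound score in [-1, 1] to a sentiment label.

    The +/-0.05 cut-offs are the conventional thresholds (an assumption
    here); the script may use different ones.
    """
    if compound >= 0.05:
        return "Positive"
    if compound <= -0.05:
        return "Negative"
    return "Neutral"


def analyze_reviews(reviews):
    """Return (cleaned_text, compound, label) for each raw review string."""
    # Imported lazily so the helpers above work without vaderSentiment.
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()
    results = []
    for review in reviews:
        cleaned = clean_text(review)
        compound = analyzer.polarity_scores(cleaned)["compound"]
        results.append((cleaned, compound, label_from_compound(compound)))
    return results
```

Counting the labels returned by analyze_reviews gives the proportions that a pie chart would display.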
- Clone or download this repository.
- Open a terminal and navigate to the project directory.
- Run the script using python script_name.py.
- Follow the prompts to input the desired number of pages to scrape.
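The page-count prompt could be handled along these lines. This is only a sketch: parse_page_count and its fallback behaviour are hypothetical and may differ from how the script validates input.

```python
def parse_page_count(raw, default=1):
    """Convert user input to a positive page count.

    Falls back to `default` when the input is not a positive integer.
    """
    try:
        pages = int(raw.strip())
    except ValueError:
        return default
    return pages if pages > 0 else default


# Example use in a script (not executed here):
#   pages = parse_page_count(input("Number of pages to scrape: "))
```

Falling back to a default instead of raising keeps a mistyped answer from aborting the run; a stricter script might ask again instead.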
- Python 3.x
- Libraries: requests, BeautifulSoup, pandas, nltk, matplotlib, wordcloud, vaderSentiment
- Ensure a stable internet connection for successful web scraping.
- Adjust the pages variable to scrape more or fewer pages of reviews as needed.
- Customize the visualization parameters to suit your preferences.
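To show how a pages value might drive the scraping loop, here is a hedged sketch. The URL pattern, the page_size parameter, and the text_content CSS class are assumptions about the Airline Quality site rather than details taken from this project, so verify them against the live pages before use.

```python
# Assumed URL pattern for the British Airways review pages on Airline
# Quality; check it against the live site before relying on it.
BASE_URL = "https://www.airlinequality.com/airline-reviews/british-airways"


def build_page_urls(pages, page_size=100):
    """Return one listing URL per page, numbered from 1 to `pages`."""
    return [
        f"{BASE_URL}/page/{page}/?sortby=post_date%3ADesc&pagesize={page_size}"
        for page in range(1, pages + 1)
    ]


def scrape_reviews(pages, page_size=100):
    """Download each listing page and collect the review text blocks."""
    # Imported lazily so build_page_urls works without these libraries.
    import requests
    from bs4 import BeautifulSoup

    reviews = []
    for url in build_page_urls(pages, page_size):
        response = requests.get(url, timeout=30)
        response.raise_for_status()
        soup = BeautifulSoup(response.content, "html.parser")
        # Assumed selector: review bodies live in <div class="text_content">.
        for block in soup.find_all("div", {"class": "text_content"}):
            reviews.append(block.get_text())
    return reviews
```

With this layout, raising or lowering pages only changes how many URLs are generated; the rest of the loop stays the same.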
- Gabin H. VEGLO
This project is licensed under the MIT License.