IMDB Spider-Man Movies User Reviews EDA/LDA

Data Acquisition

The data was collected with a custom Selenium WebDriver scraper for IMDb reviews, which can be found here.

Data Cleaning

The initial cleaning was done in a Python Jupyter Notebook environment, and the cleaned data was then stored in a .csv file. Further cleaning and analysis were done in R, and the steps can be followed from the Table of Contents below.

The rendered notebook is on Kaggle and can be accessed here.

Data Analysis

I analyzed Spider-Man movie reviews from IMDb. In addition to text analysis, I analyzed review-related variables such as Review Rating, Review Helpfulness, and Review Date.

For the text analysis, I used basic Natural Language Processing techniques such as:

Word Counts and Frequencies in a Review
TF-IDF Analysis
Sentiment Analysis
Topic Modelling with Latent Dirichlet Allocation (LDA)
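
To make the first two techniques concrete, here is a minimal sketch of word counts and TF-IDF in plain Python. It is an illustration only, not the code used in this project (the actual analysis was done in R). The sample reviews, the `tokenize` helper, and the unsmoothed idf formula `ln(N / df)` are all assumptions chosen for the example.

```python
import math
import re
from collections import Counter

# Hypothetical sample reviews; the real data comes from the scraped IMDb reviews.
reviews = [
    "Great movie, great villain and a great ending",
    "The villain was weak but the movie was fun",
    "Fun movie for the whole family",
]


def tokenize(text):
    """Lowercase the text and split it into word tokens (hypothetical helper)."""
    return re.findall(r"[a-z']+", text.lower())


def term_frequencies(tokens):
    """Relative frequency of each word within a single review."""
    counts = Counter(tokens)
    total = len(tokens)
    return {word: n / total for word, n in counts.items()}


def inverse_document_frequencies(tokenized_docs):
    """Assumed form: idf(word) = ln(number of reviews / reviews containing the word)."""
    n_docs = len(tokenized_docs)
    doc_counts = Counter(word for doc in tokenized_docs for word in set(doc))
    return {word: math.log(n_docs / df) for word, df in doc_counts.items()}


def tf_idf(tokenized_docs):
    """One dict per review mapping each word to tf * idf."""
    idf = inverse_document_frequencies(tokenized_docs)
    return [
        {word: tf * idf[word] for word, tf in term_frequencies(doc).items()}
        for doc in tokenized_docs
    ]


tokenized = [tokenize(review) for review in reviews]

# Corpus-wide word counts across all reviews.
word_counts = Counter(word for doc in tokenized for word in doc)

# Per-review TF-IDF scores.
scores = tf_idf(tokenized)
```

In this sketch, a word such as "movie" that appears in every review gets a TF-IDF score of zero, while a word concentrated in one review, such as "great" in the first one, scores highest there. Calling `word_counts.most_common(5)` or sorting `scores[0]` by value gives the top words for the corpus or for a single review.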
