Kaggle-Movie-Review

Sentiment analysis on a movie review data set using NLTK, scikit-learn, and some of the Weka classifiers.

Goal- To predict the sentiment of reviews using basic classification algorithms, and to compare the results while varying the feature sets and classifier parameters.

Dataset- The data was taken from the original Pang and Lee movie review corpus, which is based on reviews from the Rotten Tomatoes web site and was later also used in a Kaggle competition. train.tsv contains the phrases and their associated sentiment labels. test.tsv contains just the phrases.
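A minimal sketch of reading the two files, assuming the tab-separated column names used in the Kaggle competition (`PhraseId`, `SentenceId`, `Phrase`, `Sentiment`). The helper `read_phrases` and the sample rows are hypothetical and only illustrate the layout; they are not taken from the data set or from this project's code.

```python
import csv
import io

# Made-up rows in the assumed train.tsv layout (not real data from the corpus).
SAMPLE_TSV = (
    "PhraseId\tSentenceId\tPhrase\tSentiment\n"
    "1\t1\ta quietly moving film\t3\n"
    "2\t1\tquietly moving\t3\n"
    "3\t2\tdull and lifeless\t1\n"
)


def read_phrases(handle, labeled=True):
    """Return (phrase, label) pairs; label is None for unlabeled files."""
    # QUOTE_NONE keeps quote characters inside phrases as literal text.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = []
    for row in reader:
        label = int(row["Sentiment"]) if labeled else None
        rows.append((row["Phrase"], label))
    return rows


train_rows = read_phrases(io.StringIO(SAMPLE_TSV))
print(train_rows[0])
```

For the real files, the same helper could be called on `open("train.tsv", newline="", encoding="utf-8")`, and with `labeled=False` for test.tsv.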

Feature sets used- Unigram features (bag of words), bigrams, negation, POS (part-of-speech) tags, and features based on sentiment lexicons such as LIWC, the opinion lexicon, and the subjectivity lexicon (SL).
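The unigram, bigram, and negation feature sets can be sketched as NLTK-style feature dictionaries. This is an illustration under assumptions, not the project's implementation: the function names, the `contains(...)` key format, the negation word list, and the rule that negation ends at the next punctuation mark are all hypothetical choices. POS and lexicon features are left out because they need a tagger and the lexicon files.

```python
# Assumed negation cues and clause boundaries; the project's lists may differ.
NEGATION_WORDS = {"not", "no", "never", "n't", "cannot"}
CLAUSE_END = {".", ",", ";", "!", "?"}


def unigram_features(tokens, vocabulary):
    """Bag of words: one boolean feature per vocabulary word."""
    token_set = set(tokens)
    return {f"contains({w})": (w in token_set) for w in vocabulary}


def bigram_features(tokens):
    """One feature per adjacent token pair seen in the phrase."""
    return {f"bigram({a} {b})": True for a, b in zip(tokens, tokens[1:])}


def negation_features(tokens):
    """Mark tokens that follow a negation word, up to the next punctuation."""
    features = {}
    negated = False
    for tok in tokens:
        if tok in CLAUSE_END:
            negated = False          # negation scope ends at punctuation
        elif tok in NEGATION_WORDS:
            negated = True           # following tokens get the NOT_ prefix
        else:
            key = f"NOT_{tok}" if negated else tok
            features[f"contains({key})"] = True
    return features


tokens = "this movie is not good , but fun".split()
feats = negation_features(tokens)
print(sorted(feats))
```

Feature dictionaries from several extractors can be merged with `dict.update` before training, which is one simple way to compare combinations of feature sets.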

NLTK-based classifier algorithms- Naive Bayes, Generalized Iterative Scaling, Improved Iterative Scaling
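In NLTK, Generalized Iterative Scaling and Improved Iterative Scaling are available as training algorithms of `MaxentClassifier`, so a comparison of the three could look like the sketch below. Whether the project trained them exactly this way is an assumption; the toy feature dictionaries, the `pos`/`neg` labels, and the `max_iter=10` setting are made up for illustration. The maximum entropy trainers require NumPy.

```python
import nltk

# Toy training set of (feature dict, label) pairs; values are illustrative only.
train_set = [
    ({"contains(good)": True, "contains(dull)": False}, "pos"),
    ({"contains(good)": True, "contains(dull)": False}, "pos"),
    ({"contains(good)": False, "contains(dull)": True}, "neg"),
    ({"contains(good)": False, "contains(dull)": True}, "neg"),
]

# Naive Bayes needs no iteration settings.
nb = nltk.NaiveBayesClassifier.train(train_set)

# GIS and IIS are selected through the `algorithm` argument; trace=0 silences
# the per-iteration log and max_iter bounds the training time.
gis = nltk.MaxentClassifier.train(train_set, algorithm="gis", trace=0, max_iter=10)
iis = nltk.MaxentClassifier.train(train_set, algorithm="iis", trace=0, max_iter=10)

probe = {"contains(good)": True, "contains(dull)": False}
predictions = {
    name: clf.classify(probe)
    for name, clf in [("nb", nb), ("gis", gis), ("iis", iis)]
}
print(predictions)
```

On real data, `nltk.classify.accuracy(classifier, held_out_set)` gives a single number per classifier that can be compared across feature sets.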

scikit-learn classifiers- Random Forest, MultinomialNB, BernoulliNB, Logistic Regression, SGDClassifier, SVC, LinearSVC, NuSVC, Decision Tree Classifier
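A hedged sketch of how several of these scikit-learn classifiers could be trained on the same bag-of-words input for comparison. The phrases, the `pos`/`neg` labels, the choice of four estimators, and the unigram-plus-bigram `CountVectorizer` are assumptions made for illustration; the project's actual pipeline and parameters are described in its report, not here.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import BernoulliNB, MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Made-up phrases and labels, only to keep the example self-contained.
phrases = [
    "a good and moving film",
    "good fun throughout",
    "moving and fun",
    "a dull and lifeless film",
    "dull throughout",
    "lifeless and boring",
]
labels = ["pos", "pos", "pos", "neg", "neg", "neg"]

models = {
    "MultinomialNB": MultinomialNB(),
    "BernoulliNB": BernoulliNB(),
    "LogisticRegression": LogisticRegression(),
    "LinearSVC": LinearSVC(),
}

fitted = {}
for name, estimator in models.items():
    # Each model gets its own vectorizer so the pipelines stay independent.
    pipeline = make_pipeline(CountVectorizer(ngram_range=(1, 2)), estimator)
    pipeline.fit(phrases, labels)
    fitted[name] = pipeline

results = {name: pipe.predict(["good fun"])[0] for name, pipe in fitted.items()}
print(results)
```

Swapping in the other listed estimators (for example `RandomForestClassifier` or `SGDClassifier`) only changes the entries of `models`; with real data, `sklearn.model_selection.cross_val_score` would be a more meaningful comparison than a single prediction.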

Weka classifiers- Naive Bayes, Random Forest

bjprogrammer/Kaggle-Movie-Review
