
Sentiment analysis on a movie review dataset using NLTK, scikit-learn, and some of the Weka classifiers


bjprogrammer/Kaggle-Movie-Review


Kaggle-Movie-Review


Goal: Predict the sentiment of movie reviews using basic classification algorithms, and compare the results across different parameter settings.

Dataset: The data comes from the original Pang and Lee movie review corpus, which is based on reviews from the Rotten Tomatoes website and was later used in a Kaggle competition. train.tsv contains the phrases and their associated sentiment labels; test.tsv contains only the phrases.
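A minimal sketch of reading the tab-separated data with the Python standard library. The column names `Phrase` and `Sentiment` and the integer label type are assumptions about the file layout, not something this README states, and the helper `load_phrases` is hypothetical. The sample rows are made up for illustration.

```python
import csv
import io

def load_phrases(handle, has_labels=True):
    """Read (phrase, label) pairs from a tab-separated file handle.

    Assumes a header row with 'Phrase' and (for training data)
    'Sentiment' columns; adjust the names if the file differs.
    """
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = []
    for row in reader:
        label = int(row["Sentiment"]) if has_labels else None
        rows.append((row["Phrase"], label))
    return rows

# Tiny in-memory stand-in for train.tsv (made-up rows, not real data).
sample = io.StringIO(
    "PhraseId\tSentenceId\tPhrase\tSentiment\n"
    "1\t1\ta charming and funny film\t4\n"
    "2\t1\tcharming\t3\n"
    "3\t2\tnot worth the ticket\t1\n"
)
train_rows = load_phrases(sample)
```

For the real files, the same function could be called on `open("train.tsv", newline="", encoding="utf-8")`, and with `has_labels=False` for test.tsv.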

Feature sets used: unigram (bag of words), bigram, negation, POS (part of speech), and features based on sentiment lexicons such as LIWC, the opinion lexicon, and the subjectivity lexicon (SL).
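The unigram, bigram, and negation feature sets can be sketched as dictionaries in the `{feature_name: value}` form that NLTK classifiers accept. This is an illustration, not the repository's code: the function names, the whitespace tokenization, the negation word list, and the rule of marking tokens as negated until the next punctuation mark are all assumptions (the last being one common way to build negation features).

```python
# Hypothetical word lists; a real run would likely use larger ones.
NEGATION_WORDS = {"not", "no", "never", "n't", "cannot"}
PUNCTUATION = {".", ",", ";", ":", "!", "?"}

def unigram_features(tokens):
    """Bag of words: one boolean feature per token."""
    return {f"contains({t})": True for t in tokens}

def bigram_features(tokens):
    """One boolean feature per pair of adjacent tokens."""
    return {f"bigram({a} {b})": True for a, b in zip(tokens, tokens[1:])}

def negation_features(tokens):
    """Prefix tokens with NOT_ after a negation word, until punctuation."""
    features = {}
    negated = False
    for tok in tokens:
        if tok in PUNCTUATION:
            negated = False          # punctuation ends the negation scope
        elif tok in NEGATION_WORDS:
            negated = True           # start negating the following tokens
        else:
            key = f"NOT_{tok}" if negated else tok
            features[f"contains({key})"] = True
    return features

tokens = "this film is not good , but fun".lower().split()
neg_feats = negation_features(tokens)
bi_feats = bigram_features(["not", "good"])
```

Here `good` becomes `contains(NOT_good)`, while `but` and `fun` stay unmarked because the comma closes the negation scope.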

NLTK-based classifiers: Naive Bayes, and maximum entropy trained with the Generalized Iterative Scaling (GIS) and Improved Iterative Scaling (IIS) algorithms.
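As a rough sketch of how these NLTK classifiers are trained, the example below fits `NaiveBayesClassifier` and a GIS-trained `MaxentClassifier` on a toy training set. The featuresets and the `pos`/`neg` labels are made up for illustration and are not the dataset's label scheme; `max_iter=10` is an arbitrary choice to keep the run short.

```python
from nltk.classify import MaxentClassifier, NaiveBayesClassifier

# Toy (featureset, label) pairs; labels are hypothetical.
train_set = [
    ({"contains(great)": True}, "pos"),
    ({"contains(fun)": True}, "pos"),
    ({"contains(dull)": True}, "neg"),
    ({"contains(boring)": True}, "neg"),
]

# Naive Bayes needs no extra configuration.
nb = NaiveBayesClassifier.train(train_set)

# Maximum entropy with Generalized Iterative Scaling; trace=0 silences
# the per-iteration log. Pass algorithm="iis" to use IIS instead.
gis = MaxentClassifier.train(train_set, algorithm="gis", trace=0, max_iter=10)

query = {"contains(great)": True}
nb_label = nb.classify(query)
gis_label = gis.classify(query)
```

On held-out data, `nltk.classify.accuracy(classifier, test_set)` gives a simple way to compare the trained models.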

scikit-learn classifiers: Random Forest, MultinomialNB, BernoulliNB, Logistic Regression, SGDClassifier, SVC, LinearSVC, NuSVC, Decision Tree Classifier.
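One way to compare several of these scikit-learn classifiers is to put each behind the same vectorizer in a pipeline. The sketch below is an assumption about how such a comparison could look, not the repository's code: the toy phrases, the binary labels, the unigram-plus-bigram `CountVectorizer`, and the subset of four classifiers are chosen for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import BernoulliNB, MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Made-up phrases with toy binary labels (1 = positive, 0 = negative).
texts = [
    "great fun film",
    "charming and funny",
    "dull and boring",
    "boring waste of time",
]
labels = [1, 1, 0, 0]

candidates = {
    "MultinomialNB": MultinomialNB(),
    "BernoulliNB": BernoulliNB(),
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "LinearSVC": LinearSVC(),
}

predictions = {}
for name, clf in candidates.items():
    # Same unigram + bigram bag-of-words features for every classifier.
    model = make_pipeline(CountVectorizer(ngram_range=(1, 2)), clf)
    model.fit(texts, labels)
    predictions[name] = int(model.predict(["great fun"])[0])
```

For a real comparison on train.tsv, `sklearn.model_selection.cross_val_score` applied to each pipeline would be a more meaningful measure than a single prediction.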

Weka classifiers: Naive Bayes, Random Forest.
