Ratings prediction using NLP ⭐

This project aims to predict star ratings from user reviews in the Spanish Amazon Review Corpus.

-> Project status: [ Completed ]

Project description

Customer satisfaction matters because it indicates how likely a customer is to make another purchase. Asking customers to rate their degree of satisfaction is a good way to see whether they will become regular customers or even brand advocates.

Objective: Build a machine learning model that predicts the number of stars of a review based only on its text.

Methods used

Descriptive statistics
Data visualization
Corpus preprocessing
Feature engineering
Sentiment Analysis
Machine learning
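
As an illustration of the corpus preprocessing step, here is a minimal sketch using only the standard library. The notebook's actual pipeline is not shown in this README, so the function names, the accent-stripping choice, and the tiny stop-word list are all assumptions made for the example; a real run would more likely rely on the Spanish resources in NLTK or spaCy.

```python
import re
import unicodedata

# Hypothetical, deliberately tiny stop-word list (assumption for this sketch).
# "no" is left out on purpose so that negations such as "no bueno" survive.
STOP_WORDS = {"el", "la", "los", "las", "de", "que", "y", "es", "un", "una"}


def strip_accents(text):
    """Remove combining accent marks, e.g. 'devolví' -> 'devolvi'."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def preprocess(review):
    """Lowercase, strip accents, tokenize on letters, drop stop words."""
    text = strip_accents(review.lower())
    tokens = re.findall(r"[a-z]+", text)
    return [t for t in tokens if t not in STOP_WORDS]


print(preprocess("El producto no es bueno, ¡lo devolví!"))
# ['producto', 'no', 'bueno', 'lo', 'devolvi']
```

Stripping accents also turns "ñ" into "n", which merges a few distinct Spanish words; whether that trade-off is acceptable depends on the corpus.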

Technologies

Python
NumPy, pandas, SciPy
NLTK, spaCy
Matplotlib, Seaborn
scikit-learn: Naive Bayes, Linear SVC, etc.

Results

The modeling stage reached a test accuracy of 0.55, which is not particularly high. The best-performing model was Linear SVC, which also ran in a very short time. Analysis of its coefficients showed that the most important features are those related to polarity, along with some terms specific to the corpus: perfect, good, great, bad, not good, meet / not meet, return, etc.

Next steps

To improve the project, it would be better to reframe the problem, since testing other machine learning algorithms is unlikely to change the result much.

If the goal is to separate reviews by people who are satisfied with a product from reviews by people whose expectations were not met, it would be best to simplify the problem to binary classification. That is, the model should predict whether a comment is positive or negative. This would provide more useful information about the overall quality of the product and customer satisfaction with it.
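
One possible way to derive binary labels from star ratings is sketched below. The thresholds are an assumption, not something this project specifies: 1 and 2 stars are treated as negative, 4 and 5 stars as positive, and 3-star reviews are dropped as ambiguous. The helper names are hypothetical.

```python
def to_binary_label(stars):
    """Map a 1-5 star rating to 'negative', 'positive', or None (3 stars)."""
    if stars not in (1, 2, 3, 4, 5):
        raise ValueError(f"unexpected star rating: {stars!r}")
    if stars == 3:
        return None  # assumption: neutral reviews are excluded
    return "positive" if stars >= 4 else "negative"


def binarize(reviews):
    """Convert (text, stars) pairs into (text, label) pairs, skipping 3 stars."""
    labeled = []
    for text, stars in reviews:
        label = to_binary_label(stars)
        if label is not None:
            labeled.append((text, label))
    return labeled


sample = [("great", 5), ("it is fine", 3), ("had to return it", 1)]
print(binarize(sample))
# [('great', 'positive'), ('had to return it', 'negative')]
```

Dropping 3-star reviews shrinks the dataset; assigning them to one side instead is an equally defensible choice that changes the class balance.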

Contact

You can visit my Personal Website, follow me on Twitter, connect with me on LinkedIn, or check out the rest of my projects on my GitHub.
