This project was created as part of the Udacity Data Science Nanodegree: https://www.udacity.com/course/data-scientist-nanodegree--nd025
The project involved gathering the data, handling missing values and categorical variables, and carrying out analysis to answer 3 key questions:
1. Do more expensive properties tend to have higher review scores?
2. Do properties with a flexible cancellation policy have higher review scores?
3. Are properties with higher review scores booked up more frequently?
The project looks at data in the Seattle AirBnB dataset: https://www.kaggle.com/airbnb/seattle. The files are:
AirBnB_Project.ipynb - Jupyter notebook with full project analysis, including data cleaning and results
calendar.csv - dataset including AirBnB listing availability and prices
listings.csv - dataset including information on AirBnB listings, such as cancellation policy and review score
reviews.csv - dataset including review information on AirBnB listings
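The cleaning steps mentioned above (missing values and categorical variables) could look roughly like the sketch below. This is not the notebook's actual code. The function name `clean_listings` is hypothetical, and the column names `price`, `review_scores_rating` and `cancellation_policy`, as well as the "$1,250.00" price format, are assumptions about listings.csv that should be checked against the file.

```python
import pandas as pd


def clean_listings(listings: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of the listings table (hypothetical helper)."""
    df = listings.copy()

    # Assumption: price is stored as text such as "$1,250.00".
    # Strip the currency symbol and thousands separator, then convert to float.
    df["price"] = (
        df["price"].astype(str).str.replace(r"[$,]", "", regex=True).astype(float)
    )

    # Missing values: drop listings that have no review score,
    # since they cannot contribute to any of the three questions.
    df = df.dropna(subset=["review_scores_rating"])

    # Categorical variable: one-hot encode the cancellation policy.
    df = pd.get_dummies(df, columns=["cancellation_policy"], prefix="policy")
    return df
```

Typical use would be `clean_listings(pd.read_csv("listings.csv"))`. Dropping rows is only one way to handle missing scores; imputing them is an alternative with different trade-offs.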
A summary of results can be found in the Medium article: https://medium.com/@phoebe.macdonald/how-can-you-increase-your-airbnb-score-and-why-b1e7a011d95 and within the Jupyter notebook.
1. Do more expensive properties tend to have higher review scores?
There is a weak relationship between price and review score. It is unusual for expensive properties to have low review scores, but cheaper properties can also achieve high review scores.
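One way to put a number on a relationship like this is a correlation coefficient. The sketch below uses the Pearson correlation, which may differ from the method used in the notebook. The function name `price_score_correlation` is hypothetical, and the column names `price` and `review_scores_rating` are assumptions about listings.csv.

```python
import pandas as pd


def price_score_correlation(listings: pd.DataFrame) -> float:
    """Pearson correlation between price and review score (hypothetical helper)."""
    # Assumption: price may be text such as "$85.00"; unparseable values become NaN.
    price = pd.to_numeric(
        listings["price"].astype(str).str.replace(r"[$,]", "", regex=True),
        errors="coerce",
    )
    score = listings["review_scores_rating"]
    # Series.corr ignores pairs where either value is missing.
    return price.corr(score)
```

A value close to 0 would be consistent with a weak relationship. Because a single coefficient hides the shape of the data, a scatter plot is a useful companion check.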
2. Do properties with a flexible cancellation policy have higher review scores?
There is a weak relationship between cancellation policy flexibility and review score. Properties with flexible policies have a marginally higher review score than those with moderate policies and, to a greater extent, those with strict policies.
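A comparison like this can be sketched as a group-by over the cancellation policy. This is an illustration rather than the notebook's code. The function name `mean_score_by_policy` is hypothetical, and the column names `cancellation_policy` and `review_scores_rating` are assumptions about listings.csv.

```python
import pandas as pd


def mean_score_by_policy(listings: pd.DataFrame) -> pd.DataFrame:
    """Mean review score and listing count per cancellation policy (hypothetical helper)."""
    return (
        listings.dropna(subset=["review_scores_rating"])  # ignore unreviewed listings
        .groupby("cancellation_policy")["review_scores_rating"]
        .agg(["mean", "count"])  # the count shows how much data backs each mean
        .sort_values("mean", ascending=False)
    )
```

Reporting the count next to the mean helps judge whether a small difference between policies rests on enough listings to be meaningful.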
3. Are properties with higher review scores booked up more frequently?
No, there appears to be no relationship between review score and availability.
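A check of this kind needs the calendar and listings tables joined together. The sketch below is one possible approach, not the notebook's code. The function name `availability_vs_score` is hypothetical. The column names (`listing_id` and `available` in calendar.csv, `id` and `review_scores_rating` in listings.csv) and the "t"/"f" encoding of availability are assumptions to verify against the files.

```python
import pandas as pd


def availability_vs_score(calendar: pd.DataFrame, listings: pd.DataFrame) -> float:
    """Pearson correlation between share of available days and review score
    (hypothetical helper)."""
    # Assumption: "available" is "t" when the listing is free on that date.
    share_available = (
        calendar.assign(is_available=calendar["available"].eq("t"))
        .groupby("listing_id")["is_available"]
        .mean()  # fraction of calendar days each listing is available
        .rename("share_available")
    )
    # Attach each listing's availability share to its review score.
    merged = listings.merge(
        share_available, left_on="id", right_index=True, how="inner"
    )
    return merged["share_available"].corr(merged["review_scores_rating"])
```

Note that an unavailable date may mean the host blocked it rather than that it was booked, so availability is only a proxy for how often a property is booked up.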