Find out what problems the community has and why it's hard to find a host.
Current status, membership length, age, languages, hometown, profile completion, education, lots of unstructured profile information, home/couch details, references, friends.
Private messages, request numbers and acceptances/denials.
How many members are freeloaders? What traits correspond to reviews? How often is a bad review the last event? What proportion of reviews are bad? Where are the most frequent hosts located and where are the most frequent travelers from?
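
Two of the questions above (the proportion of bad reviews, and how often a bad review is the last event) can be sketched in a few lines. This is only an illustration under assumptions: the `(member_id, date, rating)` record layout, the `"negative"` label and both function names are hypothetical, the sample rows are made up, and "last event" is approximated here as a member's most recent review.

```python
from collections import defaultdict

# Made-up sample rows in an assumed layout: (member_id, ISO date, rating).
reviews = [
    ("a", "2015-01-01", "positive"),
    ("a", "2015-03-01", "negative"),
    ("b", "2015-02-01", "negative"),
    ("b", "2015-04-01", "positive"),
    ("c", "2015-05-01", "positive"),
]


def bad_review_share(rows):
    """Fraction of all reviews that are rated negative."""
    if not rows:
        return 0.0
    bad = sum(1 for _, _, rating in rows if rating == "negative")
    return bad / len(rows)


def share_ending_on_bad_review(rows):
    """Among members with at least one negative review, the fraction
    whose most recent review is the negative one."""
    by_member = defaultdict(list)
    for member, date, rating in rows:
        by_member[member].append((date, rating))

    with_bad = 0
    ended_bad = 0
    for events in by_member.values():
        events.sort()  # ISO dates sort chronologically as strings
        if any(rating == "negative" for _, rating in events):
            with_bad += 1
            if events[-1][1] == "negative":
                ended_bad += 1
    return ended_bad / with_bad if with_bad else 0.0


print(bad_review_share(reviews))
print(share_ending_on_bad_review(reviews))
```

With real data, the "last event" check would probably need to include messages and requests as well, not only reviews.
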
- Scrape each city's host results with scrapeSearchPages.py
- Scrape total number of hosts in each city with scrapeSearchPages.py (quick, will be merged with scrapeSearchPages.py)
- Scrape individuals' details from user profiles with Scrapy (in 'sofariders' directory, run "scrapy crawl sofariders_spider")
- Load, merge, clean and analyze data with SofaRidersEDA.ipynb
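
The merge step can be pictured as joining the per-city search results with the scraped profile details on a shared user identifier. The sketch below is an assumption about how that could look, not the notebook's actual code: the `user_id` key, the field names and the function `merge_hosts_with_profiles` are hypothetical, and the rows are made up.

```python
def merge_hosts_with_profiles(search_rows, profile_rows, key="user_id"):
    """Inner-join search results with profile details on `key`.

    Rows are plain dicts; search rows without a matching profile are
    dropped, and profile fields win when both sides share a field name.
    """
    profiles = {row[key]: row for row in profile_rows}
    merged = []
    for row in search_rows:
        profile = profiles.get(row[key])
        if profile is None:
            continue  # profile was not scraped (or scraping failed)
        merged.append({**row, **profile})
    return merged


# Made-up rows standing in for the two scraped datasets.
search_rows = [
    {"user_id": "u1", "city": "Berlin"},
    {"user_id": "u2", "city": "Lisbon"},
]
profile_rows = [
    {"user_id": "u1", "languages": "German, English", "references": 12},
]

merged = merge_hosts_with_profiles(search_rows, profile_rows)
print(merged)
```

Dropping unmatched rows keeps the example short; keeping them with empty profile fields may be preferable when measuring how many hosts have incomplete data.
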