Due Date: 2/9/2015
In the data folder you'll find the file bikeshare.csv. This file contains the number of riders (casual, members, and total) for each hour, alongside stats for that hour, such as temperature and windspeed. The file bikeshare.txt explains it in more detail.
Your goals:
- Explore the data in both hourly and daily counts. You'll need to aggregate by day to generate the daily data.
- Visualise the relationships between the ridership and different features.
- Explain which features seem to be the strongest indicators for each type of ridership (casual and member). Do certain features come off as better tells for one than the other?
- Summarise your results.
- Extra: Business application. Given this information, what suggestions could be made to improve the ridership program? Consider this open-ended, since we only have aggregated stats and not individual ridership here.
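
One possible way to start on the "strongest indicators" goal is to compare how each feature correlates with each rider type. The sketch below is only an illustration: the rows are made-up numbers, and the column names (temp, windspeed, casual, members) are assumptions, so check bikeshare.txt for the real ones. Correlation only captures linear relationships, so treat it as a first pass alongside your plots rather than a final answer.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def feature_correlations(rows, features, targets):
    """Return {target: {feature: r}} for every feature/target pair."""
    table = {}
    for target in targets:
        ys = [row[target] for row in rows]
        table[target] = {
            feature: pearson([row[feature] for row in rows], ys)
            for feature in features
        }
    return table


# Made-up rows for illustration only; column names are assumptions.
rows = [
    {"temp": 0.2, "windspeed": 0.30, "casual": 5, "members": 40},
    {"temp": 0.4, "windspeed": 0.10, "casual": 12, "members": 44},
    {"temp": 0.6, "windspeed": 0.20, "casual": 30, "members": 47},
    {"temp": 0.8, "windspeed": 0.00, "casual": 55, "members": 49},
]

corr = feature_correlations(rows, ["temp", "windspeed"], ["casual", "members"])
for target, by_feature in corr.items():
    for feature, r in by_feature.items():
        print(f"{target:8s} vs {feature:10s} r = {r:+.2f}")
```

Putting the casual and member columns side by side like this makes it easier to spot a feature that matters much more for one group than the other.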
- A notebook that goes through the data exploration model:
  - acquires and imports the data
  - cleans it up (create the daily from the hourly)
  - explores the two data sets for patterns
  - summarises and explains the results
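
As a rough starting point for the import and clean-up steps, here is a minimal sketch of turning hourly rows into daily rows. It assumes a layout with date, hour, temp, casual, members, and total columns, and it reads a small made-up sample instead of bikeshare.csv; the real column names are in bikeshare.txt. Counts are summed per day, while a reading like temperature is averaged, since summing it would not mean anything.

```python
import csv
import io
from collections import defaultdict

# Made-up sample in an assumed layout. For the real data, open bikeshare.csv
# instead and use the column names documented in bikeshare.txt.
SAMPLE = """date,hour,temp,casual,members,total
2011-01-01,0,0.24,3,13,16
2011-01-01,1,0.22,8,32,40
2011-01-02,0,0.46,4,13,17
2011-01-02,1,0.44,1,16,17
"""


def daily_from_hourly(rows):
    """Sum rider counts and average temperature for each calendar day."""
    days = defaultdict(
        lambda: {"casual": 0, "members": 0, "total": 0, "temps": []}
    )
    for row in rows:
        day = days[row["date"]]
        for col in ("casual", "members", "total"):
            day[col] += int(row[col])
        day["temps"].append(float(row["temp"]))

    result = {}
    for date, day in sorted(days.items()):
        temps = day.pop("temps")
        result[date] = {**day, "temp": sum(temps) / len(temps)}
    return result


hourly = list(csv.DictReader(io.StringIO(SAMPLE)))
daily = daily_from_hourly(hourly)
for date, stats in daily.items():
    print(date, stats)
```

If you work in pandas, a groupby on the date column does the same aggregation; the important decision either way is which columns to sum and which to average.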
Please upload this notebook (the .ipynb file) to your GitHub, or copy and paste the file's contents into a gist.
- Cleanliness: how approachable is this notebook? Does it feel organized?
- Visualisations vs explanation: how accurately can you explain the visuals you generate? How well do they capture the story about the data?