02-data_manipulation_and_visualization.md

Data Manipulation and Visualization

Due Date: 2/9/2015

Description

In the data folder you'll find the file bikeshare.csv. It contains the number of riders (casual, members, and total) for each hour, alongside conditions for that hour such as temperature and windspeed. The file bikeshare.txt explains the data in more detail.

Your goals:

  • Explore the data in both hourly and daily counts. You'll need to aggregate by day to generate the daily data.
  • Visualise the relationships between the ridership and different features.
  • Explain which features seem to be the strongest indicators for each type of ridership (casual and noncasual). Do certain features come off as better tells for one over the other?
  • Summarise your results.
  • Extra: Business application. Given this information, what suggestions could be made to improve the ridership program? Treat this as open-ended, since we only have aggregated stats and not individual ridership here.
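
The daily aggregation mentioned in the first goal could be sketched roughly as below, using pandas. This is only an illustration: the column names (`datetime`, `temp`, `windspeed`, `casual`, `members`, `total`) and the toy rows are assumptions made up for the example, so check bikeshare.txt for the real column names. Summing the rider counts and averaging the weather readings is one reasonable choice, not the only one.

```python
import pandas as pd

# Toy hourly rows standing in for bikeshare.csv.
# Column names and values are assumptions for illustration only.
hourly = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2011-01-01 00:00", "2011-01-01 01:00",
        "2011-01-02 00:00", "2011-01-02 01:00",
    ]),
    "temp": [0.20, 0.30, 0.40, 0.60],
    "windspeed": [0.10, 0.30, 0.00, 0.20],
    "casual": [3, 5, 2, 4],
    "members": [10, 20, 30, 40],
})
hourly["total"] = hourly["casual"] + hourly["members"]

# Group by calendar day: rider counts are summed,
# weather readings are averaged over the day.
day = hourly["datetime"].dt.normalize().rename("date")
daily = hourly.groupby(day).agg(
    temp=("temp", "mean"),
    windspeed=("windspeed", "mean"),
    casual=("casual", "sum"),
    members=("members", "sum"),
    total=("total", "sum"),
)

print(daily)
```

With the real file, the toy frame would be replaced by something like `pd.read_csv(...)` on bikeshare.csv, with the timestamp column parsed as dates.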

What's due

  • A notebook that goes through the data exploration model:
    • acquires and imports the data
    • cleans it up (create the daily data from the hourly data)
    • explores the two data sets for patterns
    • summarises and explains the results
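
For the "explores the two data sets for patterns" step, one possible starting point is a correlation table between features and each ridership type. The sketch below uses Pearson correlation via pandas, which is just one option and only captures linear relationships. The column names and the toy values are assumptions for illustration, not taken from bikeshare.csv.

```python
import pandas as pd

# Toy daily rows; names and values are made up for illustration.
daily = pd.DataFrame({
    "temp": [0.2, 0.4, 0.6, 0.8],
    "windspeed": [0.4, 0.3, 0.2, 0.1],
    "casual": [10, 20, 30, 40],
    "members": [100, 90, 110, 100],
})

features = ["temp", "windspeed"]
targets = ["casual", "members"]

# Pearson correlation of every feature (rows)
# against each ridership type (columns).
corr = daily.corr().loc[features, targets]

print(corr.round(2))
```

Comparing the two columns of this table side by side is a quick way to see whether a feature relates more strongly to one rider type than the other. Plots should still back up any conclusion.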

Please upload this notebook (the .ipynb file) to your GitHub, or copy and paste the file's contents into a gist.

What we're measuring

  • Cleanliness: how approachable is this notebook? Does it feel organized?
  • Visualisations vs explanation: how accurately can you explain the visuals you generate? How well do they capture the story about the data?