This is the first project for the Udacity Data Science Nanodegree, an analysis of the 2017 Stack Overflow Developer Survey data.
I used the standard Python packages for data manipulation and visualization: numpy, pandas, matplotlib and seaborn.
Python version: 3.7.4
I used this project to practice data science skills, to become familiar with the data science process, and to fulfill a requirement of the Udacity Data Science Nanodegree.
I specifically ask the following three questions:
-Question 1: Is there a gender gap in terms of salary for developers?
-Question 2: Which developer types have the highest career satisfaction?
-Question 3: What kinds of activities do developers most frequently do on Stack Overflow?
I did this analysis in a Jupyter Notebook, and the details are included in the file
"DS Nanodegree project 1 - SO 2017 Survey Analysis revised.ipynb".
The data files are from the Stack Overflow Developer Survey 2017 and can be found here.
-Question 1: Is there a gender gap in terms of salary for developers?
Female developers earn slightly more than male developers (58083 vs. 56996), but this result may be biased because the majority of respondents didn't report a salary. Female respondents also reported a lower expected salary, so they might be at a disadvantage because they expect a lower salary to start with.
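A minimal sketch of how such a comparison could be computed with pandas, not necessarily the code used in the notebook. The column names `Gender` and `Salary` are assumptions about the survey file, the helper `salary_by_gender` is hypothetical, and the toy rows are made up only to show the call. The exact-match filter also assumes single-valued gender answers.

```python
import pandas as pd


def salary_by_gender(df: pd.DataFrame) -> pd.DataFrame:
    """Mean salary and response counts for Male/Female respondents."""
    # Assumed column names: "Gender" and "Salary".
    subset = df[df["Gender"].isin(["Male", "Female"])]
    grouped = subset.groupby("Gender")["Salary"]
    return pd.DataFrame({
        "mean_salary": grouped.mean(),   # missing salaries are skipped
        "n_reported": grouped.count(),   # respondents who reported a salary
        "n_total": grouped.size(),       # all respondents in the group
    })


# Toy rows, invented for illustration; these are not survey data.
toy = pd.DataFrame({
    "Gender": ["Male", "Female", "Male", "Female", "Other"],
    "Salary": [50000.0, 60000.0, None, None, 40000.0],
})
summary = salary_by_gender(toy)
print(summary)
```

Reporting `n_reported` next to `n_total` makes the non-response issue visible alongside the means.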
-Question 2: Which developer types have the highest career satisfaction?
Generally, developers report fairly high career satisfaction, with little difference between developer types.
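One way this comparison could be sketched, assuming the developer type is stored as a semicolon-separated multi-select column named `DeveloperType` and satisfaction as a numeric column named `CareerSatisfaction`. Both names and the helper `satisfaction_by_dev_type` are assumptions for illustration, and the toy rows are made up.

```python
import pandas as pd


def satisfaction_by_dev_type(df: pd.DataFrame) -> pd.Series:
    """Mean career satisfaction per developer type, highest first."""
    # Assumed column names: "DeveloperType" and "CareerSatisfaction".
    cols = df[["DeveloperType", "CareerSatisfaction"]].dropna()
    # Turn "A; B" into one row per developer type.
    cols = cols.assign(DeveloperType=cols["DeveloperType"].str.split(";"))
    exploded = cols.explode("DeveloperType").reset_index(drop=True)
    exploded["DeveloperType"] = exploded["DeveloperType"].str.strip()
    return (
        exploded.groupby("DeveloperType")["CareerSatisfaction"]
        .mean()
        .sort_values(ascending=False)
    )


# Toy rows, invented for illustration; these are not survey data.
toy = pd.DataFrame({
    "DeveloperType": [
        "Web developer; Mobile developer",
        "Web developer",
        "Data scientist",
        None,
    ],
    "CareerSatisfaction": [8.0, 6.0, 9.0, 7.0],
})
result = satisfaction_by_dev_type(toy)
print(result)
```

A respondent who selected several developer types contributes to each of them, so the groups overlap.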
-Question 3: What kinds of activities do developers most frequently do on Stack Overflow?
The most popular activities are finding an answer that solves a coding problem and copying code.
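A rough sketch of how activities could be ranked by how often respondents report doing them. The column names (`StackOverflowFoundAnswer`, `StackOverflowCopiedCode`), the answer labels, and the helper `frequent_share` are all assumptions for illustration and may not match the survey file or the notebook. The toy rows are made up.

```python
import pandas as pd


def frequent_share(df, activity_cols, frequent_answers):
    """Share of answering respondents who do each activity frequently."""
    shares = {}
    for col in activity_cols:
        answered = df[col].dropna()  # ignore respondents who skipped the question
        shares[col] = answered.isin(frequent_answers).mean()
    return pd.Series(shares).sort_values(ascending=False)


# Assumed answer labels that count as "frequent".
FREQUENT = {"At least once each week", "At least once each day"}

# Toy rows, invented for illustration; these are not survey data.
toy = pd.DataFrame({
    "StackOverflowFoundAnswer": [
        "At least once each week", "At least once each day", "Once or twice", None,
    ],
    "StackOverflowCopiedCode": [
        "At least once each week", "Haven't done at all",
        "Haven't done at all", "Once or twice",
    ],
})
shares = frequent_share(toy, list(toy.columns), FREQUENT)
print(shares)
```

Which answers count as "frequent" is a judgment call, and changing that set can change the ranking.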
I wrote a related blog post here to summarize the findings.
Thanks to Stack Overflow for the dataset.