Data analysis is a fundamental tool in quantitative research, but when faced with a big, messy data set it can be difficult to know where to start.
In this workshop, we will look at some basic techniques for getting data into a usable format and exploring it visually in order to formulate suitable questions and find the answers. We will also explore some of the principles behind good data visualisation practice when communicating results to others in reports or presentations.
The workshop will include a lot of hands-on practice using the Orange data science environment. No experience of programming is required.
The following topics will be covered:
- What is exploratory data analysis?
- Getting data into a usable format.
- Visualising distributions.
- Dealing with outliers and missing values.
- Exploring variation and covariation.
- Graphics for communication.
You will need to have the Orange 3 data science environment installed and working on your computer.
The Anaconda data science platform includes Orange 3, and may be the most convenient way to install:
Alternatively, you can install Orange 3 as a stand-alone application:
To check that everything is working properly, you can follow the steps in the first tutorial video on YouTube, "Getting Started with Orange 01: Welcome to Orange".
You will have the chance to work with your own data set during the course. If you have a suitable spreadsheet of data related to your research (in Excel or CSV format), please have it ready.
Don't worry if you don't have a suitable data set to hand, as other example data sets will be provided.
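No programming is needed for the workshop itself, but if you happen to be comfortable with Python, a quick optional check like the sketch below can confirm that a CSV export of your spreadsheet has the columns you expect and show how many cells are empty. It uses only the Python standard library; the function name `summarise_csv` and the sample data are made up for illustration and are not part of the course materials.

```python
import csv
import io


def summarise_csv(text):
    """Return (row_count, column_names, missing_counts) for CSV text.

    A cell counts as missing if it is absent or contains only whitespace.
    """
    reader = csv.DictReader(io.StringIO(text))
    columns = reader.fieldnames or []
    missing = {name: 0 for name in columns}
    rows = 0
    for row in reader:
        rows += 1
        for name in columns:
            value = row.get(name)
            if value is None or value.strip() == "":
                missing[name] += 1
    return rows, columns, missing


# Hypothetical sample data standing in for your own file.
sample = "site,depth_m,temperature_c\nA,1.5,12.1\nB,,11.8\nC,3.0,\n"
rows, columns, missing = summarise_csv(sample)
print("rows:", rows)
print("columns:", columns)
print("missing values per column:", missing)
```

To try it on your own data, replace `sample` with the contents of your file, for example `pathlib.Path("my_data.csv").read_text(encoding="utf-8")`, where `my_data.csv` is a placeholder for your own file name.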
The solutions folder contains an example of an Orange workflow that can reproduce the data analysis described in the slides.
After opening this workflow in Orange, you will need to locate the appropriate .csv files by double-clicking on each CSV File Import widget.
Your feedback is very important to the Graduate School as we are continually trying to improve the training we offer.
At the end of the course, please help us by completing the evaluation form at http://bit.ly/rcds2021
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.