Managing your Research Data: Best practices in Research Data Management for Biological Sciences

When and where?

Tuesday 2nd May 2023 10:00-16:30
Online, using the Craik-Marshall Zoom environment

Outline

It has been said that 80% of data analysis is spent on the process of cleaning and preparing the data. Not only does this represent a significant time investment for the data analyst, but it is often a hurdle for the non-specialist trying to get to grips with analysing their own data after attending an R or Python course. Despite the best intentions, a spreadsheet that is intuitive and easily understood by human eyes can lead to disaster when it is processed computationally.
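To make the point about human-friendly spreadsheets concrete, here is a minimal sketch in Python. The data, the column names and the `find_non_numeric` helper are all invented for illustration and are not taken from the course materials. It shows how a column that a person reads as "weights" contains entries that a program cannot treat as numbers.

```python
import csv
import io

# Hypothetical spreadsheet export: readable to a person, awkward for a parser.
RAW = """patient,weight
P01,72
P02,68kg
P03,N/A
P04,70.5
"""


def find_non_numeric(text, column):
    """Return (row_number, value) pairs whose `column` entry is not a plain number."""
    problems = []
    reader = csv.DictReader(io.StringIO(text))
    # Row 1 is the header, so data rows start at 2, matching spreadsheet numbering.
    for row_number, row in enumerate(reader, start=2):
        value = row[column]
        try:
            float(value)
        except ValueError:
            problems.append((row_number, value))
    return problems


problems = find_non_numeric(RAW, "weight")
for row_number, value in problems:
    print(f"row {row_number}: {value!r} is not a number")
```

Here the unit typed into one cell ("68kg") and the free-text missing value ("N/A") are both flagged. A common remedy is to keep units in the column header and leave missing cells empty.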

This workshop will go through the basic principles that we can all adopt in order to work with data more effectively and “think like a computer”. Moreover, we will discuss the best practices for data management and organisation so that our research is auditable and reproducible by ourselves, and others, in the future. Part of the workshop involves critically evaluating example Data Management Plans, which are often a condition of grant funding.

Description

Do you know what a Data Management Plan is and what it covers?
How much data would you lose if your laptop was stolen?
Have you ever emailed your colleague a file named 'final_final_versionEDITED'?
Have you ever struggled to import your spreadsheets into R?

As a researcher, you will encounter research data in many forms, ranging from measurements, numbers and images to documents and publications. Whether you create, receive or collect data, you will certainly need to organise it at some stage of your project. This workshop will provide an overview of some basic principles on how we can work with data more effectively. We will discuss the best practices for research data management and organisation so that our research is auditable and reproducible by ourselves, and others, in the future.

Aims: During this course you will learn about:

What Research Funders expect
Options for backing up your computer
Ideas for naming and organising your files
Strategies for exchanging files with collaborators
Tips and tricks to make sure that your spreadsheets are readable by programming languages such as R
How to use the OpenRefine software for data cleaning
Preparing high-throughput biological data for submission to a public repository
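
As one illustration of the file-naming idea in the list above, the sketch below builds names from an ISO 8601 date, a project label, a description and a zero-padded version number. This convention and the `make_filename` helper are assumptions made for the example, not the scheme taught in the course. The aim is simply to avoid names such as 'final_final_versionEDITED' and to make files sort chronologically.

```python
import re
from datetime import date


def make_filename(day, project, description, version, extension):
    """Build a name like 2023-05-02_rdm-course_patient-data_v02.csv."""

    def slug(text):
        # Lower-case, and collapse anything that is not a letter or digit into "-".
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return (
        f"{day.isoformat()}_{slug(project)}_{slug(description)}"
        f"_v{version:02d}.{extension}"
    )


names = [
    make_filename(date(2023, 5, 2), "RDM course", "patient data", 2, "csv"),
    make_filename(date(2022, 11, 30), "RDM course", "patient data", 1, "csv"),
]

# Plain alphabetical sorting now matches chronological order.
for name in sorted(names):
    print(name)
```

Because the date comes first and is written year-month-day, an ordinary alphabetical sort in a file browser lists the oldest file first.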

Objectives: After this course you should be able to:

Select an appropriate backup strategy for your data
Organise your files in a more structured and consistent manner
Avoid common pitfalls in spreadsheet manipulation
Know what resources are available at The University of Cambridge for Research Data Management

Timetable

Trainers.
Abigail Edwards (CRUK Cambridge Institute).
Ashley Sawle (CRUK Cambridge Institute).

Time	Session (trainer)
10:00 - 10:20	Introduction, Data Management Plans (Ash)
10:20 - 11:00	Data formatting (Ash)
11:00 - 11:10	Break
11:10 - 12:00	OpenRefine practical (Abbi)
12:00 - 12:15	Spreadsheet validation practical (Abbi)
12:15 - 13:00	File management (Ash)
13:00 - 13:45	Lunch break
13:45 - 14:15	File management in DMP practical (Ash)
14:15 - 15:00	Data Sharing & Backup (Abbi)
15:00 - 15:10	Break
15:10 - 15:45	Data Sharing & Backup in DMP practical (Abbi)
15:45 - 16:00	Wrap-up & close

Please fill in the feedback survey at the end of the course.

Data for OpenRefine practical

patient_data.txt

Data Management Plans for practicals

Drosophila BBSRC project.
Signalling pathways MRC project.
Bioinformatics software BBSRC project.
Pathways to violence & crime ESRC project.
scRNAseq analysis of neurons.

Useful checklist: a Data Management Plan checklist.

Further viewing

Introduction to OpenRefine video tutorial
Keith Baggerly's talk on how cut-and-paste errors could endanger patients
Florian's talk on why we should work reproducibly
Fun look at why organisation matters - Bad Project (Lady Gaga parody)
