Integrating and analysing multiple datasets

Dates:	12 September 2019 - 13 September 2019
Times:	All day
What is it:	Workshop
Organiser:	Cathie Marsh Institute for Social Research
How much:	£60/£120/£440
Who is it for:	University staff, External researchers, Adults, Current University students
Speaker:	Dr Ana Ivon Morales Gomez, Prof Mark Elliot

This workshop will enable participants to:

• Produce data descriptions and summaries to understand the data
• Use statistical tools to clean and manipulate data
• Integrate relational data
• Identify and handle missing data
• Visualise data and explore patterns
• Improve their interdisciplinary team working skills

Course Description

This course, jointly organised by NCRM and the UK Data Service, will introduce participants to the complexities of analysing data from multiple sources. It will cover issues of data quality, cleaning, derivation and linkage.

The increasing availability of data on all aspects of modern life - whether such data be open, archived or proprietary - has started to open up the possibility of drawing on multiple datasets to solve analytical problems.

Getting to know the available data is a fundamental step in data analysis. Not only does it tell us what the data contain, their scope and their shape, but it also provides insights into their quality, format and other potential issues that affect their usability. This is especially important when working with data from different sources, where inconsistencies are more likely to occur and can cause problems when merging or linking the datasets.

Day 1

The morning session will focus on data cleaning and manipulation as an essential part of data analysis. In this session, we will learn how to identify the type of cleaning a particular dataset needs in preparation for analysis. We will learn different techniques and practical tools to explore and manipulate the data, with an emphasis on checking the quality of the data, removing unnecessary data, creating new variables and dealing with potential errors and inconsistencies.

The afternoon session will begin with a discussion of missing data, with the goal of learning to identify missing data mechanisms and to choose a method for addressing missingness that suits the underlying mechanism. We will then discuss the challenges of linking relational data and learn different methods to integrate data from different sources.

All sessions will include a mixture of presentations and hands-on practical activities. All the practical exercises will be done using RStudio. These practical sessions will give participants the opportunity to apply the main concepts discussed in the lectures to real-world data.

Day 2

On Day 2, participants will work in teams to produce an analysis that draws on multiple datasets. At the end of the day, each team will present its solution.

On completion of this workshop, participants will have gained new skills to understand the challenges of using real-world data and to apply a range of tools to process, clean and transform data into a format suitable for analysis. Participants will also have learned how to work with multiple datasets and apply practical methods for handling missing data.

Introduction to R webinar (optional)

The course will be taught using R. For those with no prior experience of R, the UK Data Service will run an introductory webinar on Thursday 5th September from 3:00 PM to 4:00 PM. A private registration link for the webinar will be sent to all participants who wish to attend.

Reading materials (not compulsory)

Wickham, H. and Grolemund, G. (2016) “R for Data Science”. Available online: https://r4ds.had.co.nz/

Price: £60/£120/£440

Speakers

Dr Ana Ivon Morales Gomez

Organisation: University of Manchester/UKDS

Prof Mark Elliot

Organisation: University of Manchester/NCRM

Travel and Contact Information

Find event

4.2
Roscoe Building
Manchester

Contact event

Claire Spencer

0161 275 4579

claire.spencer@manchester.ac.uk