Computational Science using Big Data in R
Dates: | 12 December 2016 |
Times: | All day |
What is it: | Short course |
Organiser: | Cathie Marsh Institute for Social Research |
How much: | £195 (Full fee)
/ £140 for those from educational and charitable institutions |
Who is it for: | University staff, Adults, Current University students, General public |
This course introduces a workflow for working efficiently with large amounts of data in R, using data from the Human Mortality Database (HMD) and Human Fertility Database (HFD). Using both of these large databases in an extended case study, the course shows how the R packages plyr and purrr can be used to automate and speed up all stages of the quantitative social science workflow, from loading and tidying data from multiple sources to producing dozens of separate analyses and data visualisations with a single chunk of code.
The extended case study also introduces related packages, processes and patterns for working efficiently with large-scale and complex data, including stringr, tidyr and dplyr for data management, and ‘piped coding’ approaches that make R code more ‘literate’: easier to write, understand and reason about.
If you use the HMD and HFD, the code presented will likely be useful right away for your work. Even if you do not, the general patterns, concepts and methods introduced through the case study will help you think about how to manage large amounts of data and automate your own data workflows.
Price: £195 (Full fee)
Concessions: £140 for those from educational and charitable institutions
Travel and Contact Information
Humanities Bridgeford Street Building