A 2-day course, 5th & 6th March 2018.
This module deals with the ubiquitous and often neglected problem of missing data, which is common in many types of statistical analysis. We survey some ad hoc strategies for dealing with missing data and show how they can lead to bias and inefficiency. We advocate a principled approach that begins by formulating the underlying missing data mechanism. We then look at several principled methods. First, we present a fully Bayesian approach using WinBUGS. Second, we create multiply imputed datasets using chained equations and apply Rubin’s rules to combine the analyses of the imputed datasets. Third, we repeat this using multivariate techniques, rather than chained equations, as the method of multiple imputation. Finally, we look at examples where no imputation is needed at all. All of the methods will be illustrated through worked examples, using appropriate tools for exploration and diagnostics. We will also touch on imputation for hierarchical data, where a mixed effects model is used.
Important, please note:
If the course is fully booked, please still complete the registration, as you will then be placed on a waiting list. You will be allocated a place if more places become available or if people cancel.
We will make every attempt to accommodate Lancaster University staff and postgraduate research students on our courses. However, if a course becomes fully booked we reserve the right to give priority to students on the MSc in Statistics, MSc in Data Science, and external participants.
Details of course fees.
Registrations are transferable to another course or individual at any time. Full refunds will be given for cancellation 10 or more working days before the course start date. Otherwise the full course fee will be charged.
Monday, March 5, 2018
Postgraduate Statistics Centre
Dr Gareth Ridall
Taught Session - For Externals, Staff & Students
Topics covered will include:
- Missing data mechanisms: illustration using directed graphical models, and exploration of missingness models using appropriate software.
- A survey of ad hoc methods, illustrating their drawbacks.
- Missing data in the covariates or explanatory variables.
- Full Bayesian imputation using WinBUGS, demonstrating the roles of the three models (the model for missingness, the imputation model and the substantive model).
- Multiple imputation using chained equations and multivariate methods.
- Rubin’s rules for combining the analyses of multiply imputed datasets.
- Diagnostics of the imputation process.
- A survey of methods of dealing with missingness in hierarchical datasets.
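As a small illustration of the drawbacks of ad hoc methods, the sketch below simulates an outcome `y` that depends on a fully observed covariate `x`, then removes values of `y` either completely at random (MCAR) or with a probability that depends on `x` (MAR). It is not course material: the function names, sample size, seed and missingness probabilities are all assumptions chosen for the example, and Python is used only to keep the sketch self-contained (the course itself uses R and WinBUGS).

```python
import random
import statistics


def simulate(n=5000, seed=1):
    """Simulate y = x + noise (true mean of y is 0) with two missingness patterns."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [xi + rng.gauss(0, 1) for xi in x]
    # MCAR: every y has the same 50% chance of being observed.
    seen_mcar = [rng.random() < 0.5 for _ in range(n)]
    # MAR: the chance of observing y depends only on the observed x.
    seen_mar = [rng.random() < (0.2 if xi > 0 else 0.8) for xi in x]
    return y, seen_mcar, seen_mar


def complete_case_mean(y, seen):
    """Listwise deletion: average only the observed values."""
    return statistics.fmean(v for v, s in zip(y, seen) if s)


def mean_impute(y, seen):
    """Single imputation: replace each missing value with the observed mean."""
    fill = complete_case_mean(y, seen)
    return [v if s else fill for v, s in zip(y, seen)]


y, seen_mcar, seen_mar = simulate()
full_mean = statistics.fmean(y)
mcar_mean = complete_case_mean(y, seen_mcar)
mar_mean = complete_case_mean(y, seen_mar)
full_var = statistics.variance(y)
imputed_var = statistics.variance(mean_impute(y, seen_mcar))

print(f"mean of y, full data:            {full_mean:.3f}")
print(f"complete-case mean under MCAR:   {mcar_mean:.3f}")
print(f"complete-case mean under MAR:    {mar_mean:.3f}")
print(f"variance of y, full data:        {full_var:.3f}")
print(f"variance after mean imputation:  {imputed_var:.3f}")
```

Under these assumptions the complete-case mean stays close to the truth when data are MCAR but is pulled well below zero under MAR, because large values of `x` (and hence `y`) are observed less often. Mean imputation leaves the mean unchanged but shrinks the variance, since every filled-in value sits exactly at the centre of the observed data.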
On successful completion students will be able to:
- Demonstrate mastery of tools for exploring missingness patterns using the VIM and mice software libraries for R
- Formulate a possible missing data mechanism for a given scenario, and identify cases where the missing data mechanism is ignorable
- Formulate and differentiate the model for missingness, the imputation model and the substantive model (model of interest)
- Differentiate between sampling and parameter uncertainty, and recognise that the predictive distribution of the missing data incorporates both types of uncertainty
- Implement some naive methods for dealing with missingness (such as single imputation or listwise deletion), recognise the limitations of each method, and identify situations where their use may be appropriate
- Explain the differences between a multivariate imputation model and one using chained equations
- Estimate the between-imputation variability and the within-imputation variability, and combine them in a sensible way to estimate the total variability and the fraction of information lost through missingness
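The last outcome refers to Rubin’s rules for a scalar estimate. A minimal sketch of the pooling step is given below, assuming each of the m imputed datasets has already been analysed to give a point estimate and its squared standard error. The function name `pool_rubin` and the input numbers are hypothetical, chosen only for illustration.

```python
import statistics


def pool_rubin(estimates, variances):
    """Pool m point estimates and their variances using Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates, each with a variance")
    pooled = statistics.fmean(estimates)       # pooled point estimate
    within = statistics.fmean(variances)       # within-imputation variance
    between = statistics.variance(estimates)   # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between     # total variance
    # Proportion of the total variance attributable to missingness; a simple
    # large-sample stand-in for the fraction of missing information.
    lam = (1 + 1 / m) * between / total
    return {
        "estimate": pooled,
        "within": within,
        "between": between,
        "total": total,
        "lambda": lam,
    }


# Hypothetical results from m = 5 imputed datasets.
estimates = [1.0, 1.2, 0.8, 1.1, 0.9]
variances = [0.04, 0.04, 0.04, 0.04, 0.04]
result = pool_rubin(estimates, variances)
for name, value in result.items():
    print(f"{name:>9}: {value:.4f}")
```

The factor (1 + 1/m) inflates the between-imputation variance to account for using a finite number of imputations. The quantity `lambda` ignores the degrees-of-freedom adjustment used in more refined definitions of the fraction of missing information, so it should be read as an approximation.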
- Stef van Buuren, 2012. Flexible Imputation of Missing Data (Chapman & Hall/CRC Interdisciplinary Statistics Series).
- James R. Carpenter and Michael G. Kenward, 2013. Multiple Imputation and Its Application (Statistics in Practice). Wiley.