Maths in Life Sciences Seminar - Semi-supervised multi-view Bayesian clustering for integrative genomics
Starts: 14:00 8 Oct 2018
Ends: 15:00 8 Oct 2018
What is it: Seminar
Organiser: Faculty of Biology, Medicine and Health
Details of the next instalment in the Maths in the Life Sciences seminar series are given below.
Semi-supervised multi-view Bayesian clustering for integrative genomics - delivered by Paul Kirk (University of Cambridge)
Although the challenges presented by high-dimensional data in the context of regression are well known and the subject of much current research, comparatively little work has been done on these challenges in the context of clustering. In this setting, the key challenge is that often only a small subset of the covariates provides a relevant stratification of the population. Identifying relevant strata can be particularly challenging when dealing with high-dimensional datasets, in which there may be many covariates that provide no information whatsoever about population structure, or, perhaps worse, in which there may be (potentially large) covariate subsets that define irrelevant stratifications. For example, when dealing with genetic data, there may be some genetic variants that allow us to group patients in terms of disease risk, but others that would provide completely irrelevant stratifications (e.g. those that would group patients together on the basis of eye or hair colour). Bayesian profile regression is a semi-supervised model-based clustering approach that makes use of a response in order to guide the clustering toward relevant stratifications. Here we consider how this approach can be extended to the "multiview" setting, in which different groups of covariates ("views") define different stratifications. We present some results in the context of breast cancer subtyping to illustrate how the approach can be used to perform integrative clustering of multiple 'omics datasets.
Organisation: University of Cambridge