We solve computational challenges in biomedical research that require statistical or algorithmic innovations. Much of our work aims to unlock the full power of the biobank-scale genetic data sets now becoming available (e.g., UK Biobank, N=500,000). To this end, we develop high-performance open-source software that is freely available to the scientific community.
Genome-wide association analysis
Over the past decade, ever-larger genome-wide association studies (GWAS) have yielded rich insights into the genetic architectures of human complex traits. Linear mixed models (LMMs) have become the method of choice for association analyses and have also proven versatile for partitioning heritability across the genome and for phenotype prediction. We develop scalable LMM algorithms that detect and harness subtle statistical signals now becoming visible in very large data sets.
Haplotype phasing and imputation
The paradigm of phasing and imputation has emerged as the optimal means of maximizing statistical GWAS power within a given budget. In this paradigm, a large number of samples (the GWAS cohort) are genotyped at a small subset of genomic variants, and their genotypes at the remaining variants are then statistically imputed using a whole-genome sequenced reference panel. This pipeline now enjoys near-universal use in GWAS, but as sample sizes have grown, computation has become both a critical challenge and an opportunity for methodological innovation.
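To make the imputation step concrete, here is a deliberately simplified Python sketch. It assumes haplotypes are already phased and encoded as lists of 0/1 alleles, and it fills in untyped sites by copying from the single best-matching reference haplotype. The function name `impute_from_panel` and the data layout are hypothetical, chosen only for illustration. Production imputation tools rely on probabilistic haplotype-copying models rather than this nearest-neighbor rule.

```python
def impute_from_panel(target, reference_panel):
    """Fill untyped sites (None) in `target` from the closest reference haplotype.

    target: list of 0/1 alleles, with None at sites that were not genotyped.
    reference_panel: list of fully observed haplotypes (lists of 0/1),
        each the same length as `target`.
    """
    # Sites at which the target was actually genotyped.
    typed = [i for i, allele in enumerate(target) if allele is not None]

    def mismatches(hap):
        # Count disagreements with the target at typed sites only.
        return sum(hap[i] != target[i] for i in typed)

    # Toy assumption: copy from the one reference haplotype with fewest mismatches.
    best = min(reference_panel, key=mismatches)

    # Keep observed alleles; borrow the rest from the best match.
    return [best[i] if allele is None else allele
            for i, allele in enumerate(target)]


if __name__ == "__main__":
    panel = [
        [0, 0, 1, 1, 0],
        [1, 1, 0, 0, 1],
        [0, 1, 1, 0, 0],
    ]
    sparse = [1, None, 0, None, 1]  # genotyped at sites 0, 2 and 4 only
    print(impute_from_panel(sparse, panel))
```

Even in this toy form, the cost of scanning the whole panel for every target haplotype hints at why computation becomes the bottleneck as cohorts and reference panels grow.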
Somatic chromosomal aberrations
A particularly exciting application of phasing methodology is the study of somatic clonal expansions. Somatic mutation is the process by which cells in the body undergo DNA alterations. In some cases – most notably, cancers – mutant cells undergo clonal expansion, proliferating to high frequency. We have found that accurate statistical phasing enables highly sensitive detection of somatic structural aberrations, revealing insights into the biology of clonal expansion and providing the potential for early detection of pre-cancerous mutations.
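One way to see why phase information can raise sensitivity is the following Python sketch. It is a toy illustration, not our actual method: it assumes that a clonal aberration shifts the B-allele fraction (BAF) at heterozygous sites slightly away from 0.5, in a direction set by which haplotype carries the B allele, and that measurement noise is Gaussian with a known standard deviation `sigma`. All names (`phased_z`, `unphased_z`, `simulate`) and parameter values are hypothetical.

```python
import math
import random


def phased_z(baf, phase, sigma):
    """Z-score after aligning each site's BAF deviation by its phase (+1 or -1)."""
    total = sum(p * (b - 0.5) for b, p in zip(baf, phase))
    return total / (sigma * math.sqrt(len(baf)))


def unphased_z(baf, sigma):
    """Z-score from raw BAF deviations, ignoring phase."""
    total = sum(b - 0.5 for b in baf)
    return total / (sigma * math.sqrt(len(baf)))


def simulate(n_sites=2000, shift=0.005, sigma=0.03, seed=0):
    """Simulate heterozygous sites with a small phase-dependent BAF shift (toy model)."""
    rng = random.Random(seed)
    phase = [rng.choice((-1, 1)) for _ in range(n_sites)]
    baf = [0.5 + p * shift + rng.gauss(0.0, sigma) for p in phase]
    return baf, phase


if __name__ == "__main__":
    baf, phase = simulate()
    print("phased z-score:  ", round(phased_z(baf, phase, 0.03), 2))
    print("unphased z-score:", round(unphased_z(baf, 0.03), 2))
```

Under these assumptions, the phase-aligned deviations add coherently across sites, so the expected phased z-score grows with the square root of the number of sites, whereas the unphased deviations have random signs and largely cancel.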