Haplotype phasing

We develop computational tools to solve statistical and algorithmic challenges in quantitative genetics.

We are based in the Division of Genetics and Center for Data Sciences at Brigham and Women's Hospital / Harvard Medical School. We are affiliated with the Program in Medical and Population Genetics at the Broad Institute.

Our work is generously supported by a Burroughs Wellcome Fund Career Award at the Scientific Interfaces, a Glenn Foundation for Medical Research and AFAR Grant for Junior Faculty, a Broad Institute Next Generation Fund award, and startup funding from the Brigham and Women's Hospital Divisions of Genetics and Cardiovascular Medicine.

Latest News

UK Biobank clonal hematopoiesis paper published in Nature

July 11, 2018
Our work on mosaic chromosomal alterations in the UK Biobank N=150K interim release has been published in Nature! This study used long-range haplotype phasing information to detect mosaicism in blood at very low clonal fractions (down to ~1%), producing an atlas of 8,342 mosaic events. The statistical power of this data set revealed several rare inherited variants that strongly influence clonal expansions involving nearby chromosomal alterations and also refined the link between mosaicism and future blood cancers.

Upcoming presentations on clonal hematopoiesis

January 26, 2018
Po-Ru Loh will be presenting work on mosaic CNV detection using long-range phasing at the Broad Institute MIA Seminar on Wed 2/21 (with Giulio Genovese), the MIT Bioinformatics Seminar on Wed 2/21, and the UCLA Computational Genomics Winter Institute on Mon 2/26.

ASHG 2017 Epstein Award

October 26, 2017
Po-Ru Loh's plenary talk at ASHG 2017 won an Epstein Trainee Award for Excellence in Human Genetics!

New preprint (Loh et al.): Mixed model association for biobank-scale data sets

September 27, 2017
Biobank-based genome-wide association studies are enabling exciting insights in complex trait genetics, but much uncertainty remains over best practices for optimizing statistical power and computational efficiency in GWAS while controlling confounders. Here, we introduce a much faster version of our BOLT-LMM Bayesian mixed model association method --- capable of running analyses of the full UK Biobank cohort in a few days on a single compute node --- and show that it produces highly powered, robust test statistics when run on all 459K European samples (retaining related individuals).

Recent Publications

A genome-wide cross-trait analysis from UK Biobank highlights the shared genetic architecture of asthma and allergic diseases

Zhu Z, Lee PH, Chaffin MD, Chung W, Loh P-R, Lu Q, Christiani DC, Liang L. A genome-wide cross-trait analysis from UK Biobank highlights the shared genetic architecture of asthma and allergic diseases. Nat Genet 2018;50(6):857-864.
Clinical and epidemiological data suggest that asthma and allergic diseases are associated and may share a common genetic etiology. We analyzed genome-wide SNP data for asthma and allergic diseases in 33,593 cases and 76,768 controls of European ancestry from UK Biobank. Two publicly available independent genome-wide association studies were used for replication. We have found a strong genome-wide genetic correlation between asthma and allergic diseases (r_g = 0.75, P = 6.84 × 10^-62). Cross-trait analysis identified 38 genome-wide significant loci, including 7 novel shared loci. Computational analysis showed that shared genetic loci are enriched in immune/inflammatory systems and tissues with epithelium cells. Our work identifies common genetic architectures shared between asthma and allergy and will help to advance understanding of the molecular mechanisms underlying co-morbid asthma and allergic diseases.

Insights into clonal haematopoiesis from 8,342 mosaic chromosomal alterations

Loh P-R, Genovese G, Handsaker RE, Finucane HK, Reshef YA, Palamara PF, Birmann BM, Talkowski ME, Bakhoum SF, McCarroll SA, Price AL. Insights into clonal haematopoiesis from 8,342 mosaic chromosomal alterations. Nature 2018;559(7714):350-355.
The selective pressures that shape clonal evolution in healthy individuals are largely unknown. Here we investigate 8,342 mosaic chromosomal alterations, from 50 kb to 249 Mb long, that we uncovered in blood-derived DNA from 151,202 UK Biobank participants using phase-based computational techniques (estimated false discovery rate, 6-9%). We found six loci at which inherited variants associated strongly with the acquisition of deletions or loss of heterozygosity in cis. At three such loci (MPL, TM2D3-TARSL2, and FRA10B), we identified a likely causal variant that acted with high penetrance (5-50%). Inherited alleles at one locus appeared to affect the probability of somatic mutation, and at three other loci to be objects of positive or negative clonal selection. Several specific mosaic chromosomal alterations were strongly associated with future haematological malignancies. Our results reveal a multitude of paths towards clonal expansions with a wide range of effects on human health.

Leveraging molecular quantitative trait loci to understand the genetic architecture of diseases and complex traits

Hormozdiari F, Gazal S, van de Geijn B, Finucane HK, Ju CJ-T, Loh P-R, Schoech A, Reshef Y, Liu X, O'Connor L, Gusev A, Eskin E, Price AL. Leveraging molecular quantitative trait loci to understand the genetic architecture of diseases and complex traits. Nat Genet 2018;50(7):1041-1047.
There is increasing evidence that many risk loci found using genome-wide association studies are molecular quantitative trait loci (QTLs). Here we introduce a new set of functional annotations based on causal posterior probabilities of fine-mapped molecular cis-QTLs, using data from the Genotype-Tissue Expression (GTEx) and BLUEPRINT consortia. We show that these annotations are more strongly enriched for heritability (5.84× for eQTLs; P = 1.19 × 10^-31) across 41 diseases and complex traits than annotations containing all significant molecular QTLs (1.80× for expression (e)QTLs). eQTL annotations obtained by meta-analyzing all GTEx tissues generally performed best, whereas tissue-specific eQTL annotations produced stronger enrichments for blood- and brain-related diseases and traits. eQTL annotations restricted to loss-of-function intolerant genes were even more enriched for heritability (17.06×; P = 1.20 × 10^-35). All molecular QTLs except splicing QTLs remained significantly enriched in joint analysis, indicating that each of these annotations is uniquely informative for disease and complex trait architectures.

Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types

Finucane HK, Reshef YA, Anttila V, Slowikowski K, Gusev A, Byrnes A, Gazal S, Loh P-R, Lareau C, Shoresh N, Genovese G, Saunders A, Macosko E, Pollack S, Perry JRB, Buenrostro JD, Bernstein BE, Raychaudhuri S, McCarroll S, Neale BM, Price AL. Heritability enrichment of specifically expressed genes identifies disease-relevant tissues and cell types. Nat Genet 2018;50(4):621-629.
We introduce an approach to identify disease-relevant tissues and cell types by analyzing gene expression data together with genome-wide association study (GWAS) summary statistics. Our approach uses stratified linkage disequilibrium (LD) score regression to test whether disease heritability is enriched in regions surrounding genes with the highest specific expression in a given tissue. We applied our approach to gene expression data from several sources together with GWAS summary statistics for 48 diseases and traits (average N = 169,331) and found significant tissue-specific enrichments (false discovery rate (FDR) < 5%) for 34 traits. In our analysis of multiple tissues, we detected a broad range of enrichments that recapitulated known biology. In our brain-specific analysis, significant enrichments included an enrichment of inhibitory over excitatory neurons for bipolar disorder, and excitatory over inhibitory neurons for schizophrenia and body mass index. Our results demonstrate that our polygenic approach is a powerful way to leverage gene expression data for interpreting GWAS signals.

Linkage disequilibrium-dependent architecture of human complex traits shows action of negative selection

Gazal S, Finucane HK, Furlotte NA, Loh P-R, Palamara PF, Liu X, Schoech A, Bulik-Sullivan B, Neale BM, Gusev A, Price AL. Linkage disequilibrium-dependent architecture of human complex traits shows action of negative selection. Nat Genet 2017;49(10):1421-1427.
Recent work has hinted at the linkage disequilibrium (LD)-dependent architecture of human complex traits, where SNPs with low levels of LD (LLD) have larger per-SNP heritability. Here we analyzed summary statistics from 56 complex traits (average N = 101,401) by extending stratified LD score regression to continuous annotations. We determined that SNPs with low LLD have significantly larger per-SNP heritability and that roughly half of this effect can be explained by functional annotations negatively correlated with LLD, such as DNase I hypersensitivity sites (DHSs). The remaining signal is largely driven by our finding that more recent common variants tend to have lower LLD and to explain more heritability (P = 2.38 × 10^-104); the youngest 20% of common SNPs explain 3.9 times more heritability than the oldest 20%, consistent with the action of negative selection. We also inferred jointly significant effects of other LD-related annotations and confirmed via forward simulations that they jointly predict deleterious effects.