Haplotype phasing

We develop computational tools to solve statistical and algorithmic challenges in quantitative genetics.

We are based in the Division of Genetics and Center for Data Sciences at Brigham and Women's Hospital / Harvard Medical School. We are affiliated with the Program in Medical and Population Genetics at the Broad Institute.

Our work is generously supported by an NIH Director's New Innovator Award, a Burroughs Wellcome Fund Career Award at the Scientific Interface, and a Broad Institute Next Generation Fund award, and we are grateful for past support from a Glenn Foundation for Medical Research and AFAR Grant for Junior Faculty and a Sloan Research Fellowship.

Latest News

Talk on haplotype-informed CNV analysis at ProbGen 2022

March 16, 2022
At the 2022 Probabilistic Modeling in Genomics (ProbGen) conference, Margaux Hujoel will be presenting her work on haplotype-informed CNV detection and subsequent association and fine-mapping analysis in UK Biobank: "Influences of rare copy number variation on human complex traits" (Mon Mar 28).

Po-Ru Loh receives 2022 ISCB Overton Prize

February 19, 2022
Po-Ru Loh has been awarded the International Society for Computational Biology's Overton Prize for outstanding accomplishment by an early to mid-career scientist in the field of computational biology. A big thank-you to all of the mentors, collaborators, and trainees who contributed to the work recognized by this award! Po-Ru will be accepting the award and presenting a keynote talk at the ISMB 2022 conference in July.

New preprint on phenotypes observed in carriers of recessive disease variants

December 14, 2021
We are excited to share a new preprint, "A spectrum of recessiveness among Mendelian disease variants in UK Biobank" (Barton et al.), which leverages whole-exome sequencing together with imputation in UK Biobank to identify carrier effects of rare variants known to cause recessive Mendelian diseases in homozygotes. These analyses identified 103 significant associations between quantitative traits and carrier status for 35 unique Mendelian recessive diseases, including a...

New preprint on phenotypic impacts of rare copy number variants

October 22, 2021
We are excited to share a new preprint, "Influences of rare copy number variation on human complex traits" (Hujoel et al.), which explores the phenotypic impact of rare copy number variation in the human genome, discovering many new ways in which genetic variation shapes human traits. These analyses were enabled by a new computational approach we developed that substantially increases CNV detection power in large cohorts by pooling information across individuals who...

Protein-coding variable number tandem repeat (VNTR) paper published in Science

September 23, 2021
Ronen Mukamel and Bob Handsaker's paper on phenotypic effects of protein-coding variable number tandem repeat (VNTR) polymorphisms (Mukamel*, Handsaker* et al. 2021 Science) is now published -- congratulations, Ronen and Bob! This exciting collaboration with Steve McCarroll's lab found that some of the largest effects of common genetic variants on human phenotypes (including height, biomarkers of health, and hair morphology) arise...

Three talks and a poster talk at ASHG 2021

August 23, 2021

We're very excited to share our ongoing work at ASHG this October! Alison Barton and Margaux Hujoel will present platform talks on penetrance of disease variants and CNV associations in UK Biobank, Maxwell Sherman will present a plenary talk on somatic mutations in cancer, and Ronen Mukamel will present a poster talk on dissecting Lp(a) genetics. Alison, Margaux, and Max all received semifinalist Charles J. Epstein Trainee Awards -- congratulations!

Alison Barton: "Incomplete penetrance of disease variants in the UK Biobank" (platform talk, Wed 10/20 at 11:15am)...


Recent Publications

Incorporating functional priors improves polygenic prediction accuracy in UK Biobank and 23andMe data sets

Márquez-Luna C, Gazal S, Loh P-R, Kim SS, Furlotte N, Auton A, Price AL. Incorporating functional priors improves polygenic prediction accuracy in UK Biobank and 23andMe data sets. Nat Commun 2021;12(1):6052.
Polygenic risk prediction is a widely investigated topic because of its promising clinical applications. Genetic variants in functional regions of the genome are enriched for complex trait heritability. Here, we introduce a method for polygenic prediction, LDpred-funct, that leverages trait-specific functional priors to increase prediction accuracy. We fit priors using the recently developed baseline-LD model, including coding, conserved, regulatory, and LD-related annotations. We analytically estimate posterior mean causal effect sizes and then use cross-validation to regularize these estimates, improving prediction accuracy for sparse architectures. We applied LDpred-funct to predict 21 highly heritable traits in the UK Biobank (avg N = 373K as training data). LDpred-funct attained a +4.6% relative improvement in average prediction accuracy (avg prediction R² = 0.144; highest R² = 0.413 for height) compared to SBayesR (the best method that does not incorporate functional information). For height, meta-analyzing training data from UK Biobank and 23andMe cohorts (N = 1107K) increased prediction R² to 0.431. Our results show that incorporating functional priors improves polygenic prediction accuracy, consistent with the functional architecture of complex traits.

GIGYF1 loss of function is associated with clonal mosaicism and adverse metabolic health

Zhao Y, Stankovic S, Koprulu M, Wheeler E, Day FR, Lango Allen H, Kerrison ND, Pietzner M, Loh P-R, Wareham NJ, Langenberg C, Ong KK, Perry JRB. GIGYF1 loss of function is associated with clonal mosaicism and adverse metabolic health. Nat Commun 2021;12(1):4178.
Mosaic loss of chromosome Y (LOY) in leukocytes is the most common form of clonal mosaicism, caused by dysregulation in cell-cycle and DNA damage response pathways. Previous genetic studies have focussed on identifying common variants associated with LOY, which we now extend to rarer, protein-coding variation using exome sequences from 82,277 male UK Biobank participants. We find that loss of function of two genes, CHEK2 and GIGYF1, reaches exome-wide significance. Rare alleles in GIGYF1 have not previously been implicated in any complex trait, but here loss-of-function carriers exhibit six-fold higher susceptibility to LOY (OR = 5.99 [3.04-11.81], p = 1.3 × 10⁻¹⁰). These same alleles are also associated with adverse metabolic health, including higher susceptibility to Type 2 Diabetes (OR = 6.10 [3.51-10.61], p = 1.8 × 10⁻¹²), 4 kg higher fat mass (p = 1.3 × 10⁻⁴), 2.32 nmol/L lower serum IGF1 levels (p = 1.5 × 10⁻⁴) and 4.5 kg lower handgrip strength (p = 4.7 × 10⁻⁷), consistent with proposed GIGYF1 enhancement of insulin and IGF-1 receptor signalling. These associations are mirrored by a common variant nearby associated with the expression of GIGYF1. Our observations highlight a potential direct connection between clonal mosaicism and metabolic health.

Whole-exome imputation within UK Biobank powers rare coding variant association and fine-mapping analyses

Barton AR, Sherman MA, Mukamel RE, Loh P-R. Whole-exome imputation within UK Biobank powers rare coding variant association and fine-mapping analyses. Nat Genet 2021;53(8):1260-1269.
Exome association studies to date have generally been underpowered to systematically evaluate the phenotypic impact of very rare coding variants. We leveraged extensive haplotype sharing between 49,960 exome-sequenced UK Biobank participants and the remainder of the cohort (total n ≈ 500,000) to impute exome-wide variants with accuracy R² > 0.5 down to minor allele frequency (MAF) ~0.00005. Association and fine-mapping analyses of 54 quantitative traits identified 1,189 significant associations (P < 5 × 10⁻⁸) involving 675 distinct rare protein-altering variants (MAF < 0.01) that passed stringent filters for likely causality. Across all traits, 49% of associations (578/1,189) occurred in genes with two or more hits; follow-up analyses of these genes identified allelic series containing up to 45 distinct 'likely-causal' variants. Our results demonstrate the utility of within-cohort imputation in population-scale genome-wide association studies, provide a catalog of likely-causal, large-effect coding variant associations and foreshadow the insights that will be revealed as genetic biobank studies continue to grow.

Hematopoietic mosaic chromosomal alterations increase the risk for diverse types of infection

Zekavat SM, Lin S-H, Bick AG, Liu A, Paruchuri K, Wang C, Uddin MM, Ye Y, Yu Z, Liu X, Kamatani Y, Bhattacharya R, Pirruccello JP, Pampana A, Loh P-R, Kohli P, McCarroll SA, Kiryluk K, Neale B, Ionita-Laza I, Engels EA, Brown DW, Smoller JW, Green R, Karlson EW, Lebo M, Ellinor PT, Weiss ST, Daly MJ, Terao C, Zhao H, Ebert BL, Reilly MP, Ganna A, Machiela MJ, Genovese G, Natarajan P. Hematopoietic mosaic chromosomal alterations increase the risk for diverse types of infection. Nat Med 2021;27(6):1012-1024.
Age is the dominant risk factor for infectious diseases, but the mechanisms linking age to infectious disease risk are incompletely understood. Age-related mosaic chromosomal alterations (mCAs) detected from genotyping of blood-derived DNA are structural somatic variants indicative of clonal hematopoiesis, and are associated with aberrant leukocyte cell counts, hematological malignancy, and mortality. Here, we show that mCAs predispose to diverse types of infections. We analyzed mCAs from 768,762 individuals without hematological cancer at the time of DNA acquisition across five biobanks. Expanded autosomal mCAs were associated with diverse incident infections (hazard ratio (HR) 1.25; 95% confidence interval (CI) = 1.15-1.36; P = 1.8 × 10⁻⁷), including sepsis (HR 2.68; 95% CI = 2.25-3.19; P = 3.1 × 10⁻²⁸), pneumonia (HR 1.76; 95% CI = 1.53-2.03; P = 2.3 × 10⁻¹⁵), digestive system infections (HR 1.51; 95% CI = 1.32-1.73; P = 2.2 × 10⁻⁹) and genitourinary infections (HR 1.25; 95% CI = 1.11-1.41; P = 3.7 × 10⁻⁴). A genome-wide association study of expanded mCAs identified 63 loci, which were enriched at transcriptional regulatory sites for immune cells. These results suggest that mCAs are a marker of impaired immunity and confer increased predisposition to infections.

A model and test for coordinated polygenic epistasis in complex traits

Sheppard B, Rappoport N, Loh P-R, Sanders SJ, Zaitlen N, Dahl A. A model and test for coordinated polygenic epistasis in complex traits. Proc Natl Acad Sci U S A 2021;118(15):e1922305118.
Interactions between genetic variants (epistasis) are pervasive in model systems and can profoundly impact evolutionary adaptation, population disease dynamics, genetic mapping, and precision medicine efforts. In this work, we develop a model for structured polygenic epistasis, called coordinated epistasis (CE), and prove that several recent theories of genetic architecture fall under the formal umbrella of CE. Unlike standard epistasis models that assume epistasis and main effects are independent, CE captures systematic correlations between epistasis and main effects that result from pathway-level epistasis, on balance skewing the penetrance of genetic effects. To test for the existence of CE, we propose the even-odd (EO) test and prove it is calibrated in a range of realistic biological models. Applying the EO test in the UK Biobank, we find evidence of CE in 18 of 26 traits spanning disease, anthropometric, and blood categories. Finally, we extend the EO test to tissue-specific enrichment and identify several plausible tissue-trait pairs. Overall, CE is a dimension of genetic architecture that can capture structured, systemic forms of epistasis in complex human traits.