Haplotype phasing

We develop computational tools to solve statistical and algorithmic challenges in quantitative genetics.

We are based in the Division of Genetics and Center for Data Sciences at Brigham and Women's Hospital / Harvard Medical School. We are affiliated with the Program in Medical and Population Genetics at the Broad Institute.

Our work is generously supported by an NIH Director's New Innovator Award, a Burroughs Wellcome Fund Career Award at the Scientific Interfaces, and a Broad Institute Next Generation Fund award, and we are grateful for past support from a Glenn Foundation for Medical Research and AFAR Grant for Junior Faculty and a Sloan Research Fellowship.

Latest News

New preprint on exome imputation and fine-mapping in UK Biobank

September 1, 2020
We are excited to share a new preprint, "Whole-exome imputation within UK Biobank powers rare coding variant association and fine-mapping analyses" (Barton et al.), which reports an atlas of 1,189 rare coding variants likely to causally influence 54 quantitative traits in UK Biobank. This analysis leveraged the initial release of N=49,960 UKB exomes to accurately impute variants with MAF down to ~0.00005 into the full cohort (N~500K). Association and...

New preprint on mosaic copy number variants in autism

January 28, 2020
We are excited to share a new preprint, "Large mosaic copy number variations confer autism risk" (Sherman et al.), reporting mosaic CNVs we identified in genotyping data from the Simons Simplex Collection. We demonstrate a significant burden of large (>4 Mb) mosaic CNVs in ASD probands compared to their siblings; several probands exhibited clinical symptoms known to arise from disruption of the affected genomic regions.

Talks and posters from our group at ASHG 2019

October 16, 2019
At ASHG this year, Max Sherman will be speaking about his work on mosaic CNVs in autism (Thu 10/17 at 9:45am, #105) and Po-Ru Loh will be speaking about his work on clonal hematopoiesis in UK Biobank (Sat 10/19 at 10:45am, #343). Ronen Mukamel and Alison Barton will be presenting posters exploring the phenotypic effects of variable number tandem repeats (#2994W) and rare coding variants (#3029F).

Recent Publications

Monogenic and polygenic inheritance become instruments for clonal selection

Loh P-R, Genovese G, McCarroll SA. Monogenic and polygenic inheritance become instruments for clonal selection. Nature 2020;584(7819):136-141.
Clonally expanded blood cells that contain somatic mutations (clonal haematopoiesis) are commonly acquired with age and increase the risk of blood cancer. The blood clones identified so far contain diverse large-scale mosaic chromosomal alterations (deletions, duplications and copy-neutral loss of heterozygosity (CN-LOH)) on all chromosomes, but the sources of selective advantage that drive the expansion of most clones remain unknown. Here, to identify genes, mutations and biological processes that give selective advantage to mutant clones, we analysed genotyping data from the blood-derived DNA of 482,789 participants from the UK Biobank. We identified 19,632 autosomal mosaic chromosomal alterations and analysed these for relationships to inherited genetic variation. We found 52 inherited, rare, large-effect coding or splice variants in 7 genes that were associated with greatly increased vulnerability to clonal haematopoiesis with specific acquired CN-LOH mutations. Acquired mutations systematically replaced the inherited risk alleles (at MPL) or duplicated them to the homologous chromosome (at FH, NBN, MRE11, ATM, SH2B3 and TM2D3). Three of the genes (MRE11, NBN and ATM) encode components of the MRN-ATM pathway, which limits cell division after DNA damage and telomere attrition; another two (MPL and SH2B3) encode proteins that regulate the self-renewal of stem cells. In addition, we found that CN-LOH mutations across the genome tended to cause chromosomal segments with alleles that promote the expansion of haematopoietic cells to replace their homologous (allelic) counterparts, increasing polygenic drive for blood-cell proliferation traits. Readily acquired mutations that replace chromosomal segments with their homologous counterparts seem to interact with pervasive inherited variation to create a challenge for lifelong cytopoiesis.

Chromosomal alterations among age-related haematopoietic clones in Japan

Terao C, Suzuki A, Momozawa Y, Akiyama M, Ishigaki K, Yamamoto K, Matsuda K, Murakami Y, McCarroll SA, Kubo M, Loh P-R, Kamatani Y. Chromosomal alterations among age-related haematopoietic clones in Japan. Nature 2020;584(7819):130-135.
The extent to which the biology of oncogenesis and ageing are shaped by factors that distinguish human populations is unknown. Haematopoietic clones with acquired mutations become common with advancing age and can lead to blood cancers. Here we describe shared and population-specific patterns of genomic mutations and clonal selection in haematopoietic cells on the basis of 33,250 autosomal mosaic chromosomal alterations that we detected in 179,417 Japanese participants in the BioBank Japan cohort and compared with analogous data from the UK Biobank. In this long-lived Japanese population, mosaic chromosomal alterations were detected in more than 35.0% (s.e.m., 1.4%) of individuals older than 90 years, which suggests that such clones trend towards inevitability with advancing age. Japanese and European individuals exhibited key differences in the genomic locations of mutations in their respective haematopoietic clones; these differences predicted the relative rates of chronic lymphocytic leukaemia (which is more common among European individuals) and T cell leukaemia (which is more common among Japanese individuals) in these populations. Three different mutational precursors of chronic lymphocytic leukaemia (including trisomy 12, loss of chromosomes 13q and 13q, and copy-neutral loss of heterozygosity) were between two and six times less common among Japanese individuals, which suggests that the Japanese and European populations differ in selective pressures on clones long before the development of clinically apparent chronic lymphocytic leukaemia. Japanese and British populations also exhibited very different rates of clones that arose from B and T cell lineages, which predicted the relative rates of B and T cell cancers in these populations. 
We identified six previously undescribed loci at which inherited variants predispose to mosaic chromosomal alterations that duplicate or remove the inherited risk alleles, including large-effect rare variants at NBN, MRE11 and CTU2 (odds ratio, 28-91). We suggest that selective pressures on clones are modulated by factors that are specific to human populations. Further genomic characterization of clonal selection and cancer in populations from around the world is therefore warranted.

Liability threshold modeling of case-control status and family history of disease increases association power

Hujoel MLA, Gazal S, Loh P-R, Patterson N, Price AL. Liability threshold modeling of case-control status and family history of disease increases association power. Nat Genet 2020;52(5):541-547.
Family history of disease can provide valuable information in case-control association studies, but it is currently unclear how to best combine case-control status and family history of disease. We developed an association method based on posterior mean genetic liabilities under a liability threshold model, conditional on case-control status and family history (LT-FH). Analyzing 12 diseases from the UK Biobank (average N = 350,000) we compared LT-FH to genome-wide association without using family history (GWAS) and a previous proxy-based method incorporating family history (GWAX). LT-FH was 63% (standard error (s.e.) 6%) more powerful than GWAS and 36% (s.e. 4%) more powerful than the trait-specific maximum of GWAS and GWAX, based on the number of independent genome-wide-significant loci across all diseases (for example, 690 loci for LT-FH versus 423 for GWAS); relative improvements were similar when applying BOLT-LMM to GWAS, GWAX and LT-FH phenotypes. Thus, LT-FH greatly increases association power when family history of disease is available.

Genetic predisposition to mosaic Y chromosome loss in blood

Thompson DJ, Genovese G, Halvardson J, Ulirsch JC, Wright DJ, Terao C, Davidsson OB, Day FR, Sulem P, Jiang Y, Danielsson M, Davies H, Dennis J, Dunlop MG, Easton DF, Fisher VA, Zink F, Houlston RS, Ingelsson M, Kar S, Kerrison ND, Kinnersley B, Kristjansson RP, Law PJ, Li R, Loveday C, Mattisson J, McCarroll SA, Murakami Y, Murray A, Olszewski P, Rychlicka-Buniowska E, Scott RA, Thorsteinsdottir U, Tomlinson I, Moghadam BT, Turnbull C, Wareham NJ, Gudbjartsson DF, Kamatani Y, Hoffmann ER, Jackson SP, Stefansson K, Auton A, Ong KK, Machiela MJ, Loh P-R, Dumanski JP, Chanock SJ, Forsberg LA, Perry JRB. Genetic predisposition to mosaic Y chromosome loss in blood. Nature 2019;575(7784):652-657.
Mosaic loss of chromosome Y (LOY) in circulating white blood cells is the most common form of clonal mosaicism, yet our knowledge of the causes and consequences of this is limited. Here, using a computational approach, we estimate that 20% of the male population represented in the UK Biobank study (n = 205,011) has detectable LOY. We identify 156 autosomal genetic determinants of LOY, which we replicate in 757,114 men of European and Japanese ancestry. These loci highlight genes that are involved in cell-cycle regulation and cancer susceptibility, as well as somatic drivers of tumour growth and targets of cancer therapy. We demonstrate that genetic susceptibility to LOY is associated with non-haematological effects on health in both men and women, which supports the hypothesis that clonal haematopoiesis is a biomarker of genomic instability in other tissues. Single-cell RNA sequencing identifies dysregulated expression of autosomal genes in leukocytes with LOY and provides insights into why clonal expansion of these cells may occur. Collectively, these data highlight the value of studying clonal mosaicism to uncover fundamental mechanisms that underlie cancer and other ageing-related diseases.

Fast, sensitive and accurate integration of single-cell data with Harmony

Korsunsky I, Millard N, Fan J, Slowikowski K, Zhang F, Wei K, Baglaenko Y, Brenner M, Loh P-R, Raychaudhuri S. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat Methods 2019;16(12):1289-1296.
The emerging diversity of single-cell RNA-seq datasets allows for the full transcriptional characterization of cell types across a wide variety of biological and clinical conditions. However, it is challenging to analyze them together, particularly when datasets are assayed with different technologies, because biological and technical differences are interspersed. We present Harmony (https://github.com/immunogenomics/harmony), an algorithm that projects cells into a shared embedding in which cells group by cell type rather than dataset-specific conditions. Harmony simultaneously accounts for multiple experimental and biological factors. In six analyses, we demonstrate the superior performance of Harmony to previously published algorithms while requiring fewer computational resources. Harmony enables the integration of ~10^6 cells on a personal computer. We apply Harmony to peripheral blood mononuclear cells from datasets with large experimental differences, five studies of pancreatic islet cells, mouse embryogenesis datasets and the integration of scRNA-seq with spatial transcriptomics data.

GWAS of mosaic loss of chromosome Y highlights genetic effects on blood cell differentiation

Terao C, Momozawa Y, Ishigaki K, Kawakami E, Akiyama M, Loh P-R, Genovese G, Sugishita H, Ohta T, Hirata M, Perry JRB, Matsuda K, Murakami Y, Kubo M, Kamatani Y. GWAS of mosaic loss of chromosome Y highlights genetic effects on blood cell differentiation. Nat Commun 2019;10(1):4719.
Mosaic loss of chromosome Y (mLOY) is frequently observed in the leukocytes of ageing men. However, the genetic architecture and biological mechanisms underlying mLOY are not fully understood. In a cohort of 95,380 Japanese men, we identify 50 independent genetic markers in 46 loci associated with mLOY at a genome-wide significant level, 35 of which are unreported. Lead markers overlap enhancer marks in hematopoietic stem cells (HSCs, P ≤ 1.0 × 10). mLOY genome-wide association study signals exhibit polygenic architecture and demonstrate strong heritability enrichment in regions surrounding genes specifically expressed in multipotent progenitor (MPP) cells and HSCs (P ≤ 3.5 × 10). ChIP-seq data demonstrate that binding sites of FLI1, a fate-determining factor promoting HSC differentiation into platelets rather than red blood cells (RBCs), show a strong heritability enrichment (P = 1.5 × 10). Consistent with these findings, platelet and RBC counts are positively and negatively associated with mLOY, respectively. Collectively, our observations improve our understanding of the mechanisms underlying mLOY.