CAREER: Computational and statistical methods for allele-specific chromatin structure analysis

NSF-DBI-ABI 1751317

News

  • 2021-May Wenxiu presents “Statistical and computational methods for analyzing chromatin spatial organization data” at the Women In Data Science (WiDS) Riverside event.

  • 2020-Sep Our ASHIC paper is accepted by Nucleic Acids Research.

  • 2019-Dec Wenxiu presents “Statistical and computational methods for analyzing chromatin spatial organization data” at the 11th ICSA International Conference in Hangzhou, China.

  • 2019-Nov Wenxiu presents “Statistical and computational methods for analyzing chromatin spatial organization data” at the Special Session on Data Science of the AMS Fall Western Sectional Meeting at UCR.

  • 2019-Jan Wenxiu presents “Statistical and computational methods for analyzing chromatin spatial organization data” at the Statistical Genomics Workshop at the Plant & Animal Genome Conference XXVII in San Diego, CA.

  • 2018-Apr Wenxiu presents “Statistical and computational methods for analyzing chromatin spatial organization data” at the 2018 NSF Project / Bioinformatics Workshop organized by Noble Research Institute, Michigan Technological University, and UCR.

  • 2018-Apr Official start of the project

Project Goals

Three-dimensional (3D) genome organization plays an important role in gene regulation. One level of this organization consists of DNA wrapped around histone proteins and is called chromatin. High-throughput chromatin conformation capture methods, such as the Hi-C assay, have been developed and yield an immense amount of information about 3D genome organization. However, most current analysis tools cannot distinguish the Hi-C (or equivalent) information that comes from the paired (homologous) maternal and paternal chromosomes in diploid organisms such as humans and other mammals. As a result, it is not possible to tell whether the maternal and paternal copies of a gene (the alleles) have different effects. This project will address this problem and enable the reconstruction of fine-scale, allele-specific chromatin structures, thereby shedding light on the role(s) of chromatin interactions in allelic gene regulation as well as on larger principles of genome organization.

The goals of this project are to (i) establish a new computational and statistical framework for modeling 3D chromatin structures in an allele-specific manner; (ii) identify structural differences between homologous chromosome pairs; (iii) investigate the impact of chromatin organization on allelic gene regulation; and (iv) understand the interplay between genome architecture and gene function. The project will integrate fine-scale allele-specific chromatin structures with the vast amount of existing one-dimensional functional genomics data to discover new allele-specific regulatory elements and features. The project will elucidate gene regulation principles at an unprecedented resolution and enhance our understanding of the interplay between genome architecture and gene expression. These findings will have fundamental significance in molecular cell biology, personal genomics, and medicine.

Broader Impacts

This research will result in novel computational and statistical methods that combine the analysis of allele-specific chromatin structure with gene expression regulation. The products will include open-source software tools for 3D genome modeling, comparison, visualization, and exploration, which will be made publicly accessible to scientists worldwide. The integrated research and educational activities include curriculum development for both undergraduate and graduate courses in subjects such as data science and statistical and computational genomics. These activities will allow undergraduate students to participate in the research project and will train graduate student researchers to acquire interdisciplinary expertise. The project will reach out particularly to middle school students, with the goal of engaging young women and underrepresented minority groups in STEM disciplines.

People

Wenxiu Ma, PI

Christine Disteche, Collaborator (University of Washington)

Joel Berletch, Collaborator (University of Washington)

Xin Gao, Collaborator (King Abdullah University of Science and Technology)

Tiantian Ye, PhD student (Genetics, Genomics & Bioinformatics)

Yangyang Hu, PhD student (Computer Science)

Huiling Liu, PhD student (Applied Statistics)

Jinli Zhang, PhD student (Genetics, Genomics & Bioinformatics)

Li Ma, Postdoctoral scholar

Sydney Pun, Undergraduate student (Computer Science)

Publications

Software

We are developing and implementing a software suite for allele-specific genome modeling, comparison, visualization, and exploration. Each software module and component will be made freely available upon publication.

Education and Outreach

Curriculum development

  • STAT 167 (Introduction to Data Science), Spring 2017–2022. [Syllabus]

  • STAT 209 (Software Tools for Big Data Analysis), Spring 2021

Outreach activities

  • Wenxiu gave a data wrangling workshop at the 1st annual Datathon event, organized by Mass Initiative in Data Science (MINDS), a student-run data science organization for high school students supported by the Orange County/Long Beach chapter of the American Statistical Association.

Acknowledgments

This project is supported by the National Science Foundation under Grant No. 1751317.

Point of Contact

Wenxiu Ma