Our research group is dedicated to unraveling the complexities of the human genome and cellular organization through advanced AI and machine learning (AI/ML) methods, with profound implications for understanding health and disease. Despite significant advancements in high-throughput data acquisition in genomics and cell biology, critical questions about cellular diversity, intracellular molecular organization, and spatial interactions within complex tissues remain unanswered. Addressing these challenges requires novel computational approaches capable of integrating heterogeneous, multiscale data spanning molecular, cellular, and tissue levels.
Our vision is to develop a comprehensive multiscale cellular model that elucidates intricate cellular behaviors, from normal development to disease processes. By uncovering the principles underlying genome structure, gene regulation, and cellular communication, we aim to advance understanding in critical areas such as immunology, cancer, and neurodegenerative diseases. Working at the forefront of AI/ML development, including LLMs, we focus on creating biologically informed models that prioritize interpretability, scalability, and adaptability, with the ultimate goal of decoding the language of cells to drive discoveries in biology and medicine.
Understanding the cellular composition and spatial organization of tissues is fundamental to decoding tissue-level function.
Emerging technologies such as single-cell and spatial multiomics offer unprecedented insights into cellular heterogeneity in tissues.
Our AI/ML work focuses on:
(1) Modeling spatial transcriptomics data to reveal the interplay between intrinsic and spatial factors shaping cell identity.
(2) Integrating multimodal single-cell epigenomic data to uncover cellular heterogeneity and its functional implications.
(3) Developing cellular foundation models.
[See selected publications in single-cell biology]
Interphase chromosomes in higher eukaryotic cells are organized in a complex 3D structure in the nucleus, yet the principles underlying this organization and its functional impact remain elusive. Leveraging advanced genome-wide mapping technologies and AI/ML methods, we explore the regulatory roles of 3D genome organization. Our goals include:
(1) Deciphering genome-wide compartmentalization patterns relative to nuclear bodies.
(2) Uncovering the principles of spatial genome organization and its effects on gene regulation.
(3) Examining 3D chromatin organization at single-cell resolution to reveal cell-type-specific functions.
[See selected publications in 3D epigenome]
Ultimately, human biology must be understood in the context of evolution.
We use genomic data from diverse species to study genome evolution, gene regulation, and their roles in phenotypic diversity and disease. Our goals include:
(1) Exploring the evolution of the epigenome and gene regulation in mammals.
(2) Uncovering disruptions in transcriptional regulation and epigenomic landscapes that contribute to the progression of diseases such as cancer.
[See selected publications in comparative genomics]
We develop biologically informed AI/ML models, including LLMs, for multiscale and multimodal modeling that generalize to a range of problems in biomedicine. We aim to develop new frameworks that embrace both AI-in-the-loop and lab-in-the-loop approaches. These approaches enable iterative refinement of AI/ML models while ensuring that computational outcomes guide the design of the most informative experiments, fostering a new paradigm for integrating computation and experimentation.
[See selected publications in machine learning]