Mengyuan Kan

A clear mind brings blessings~

Multiomics Integration to Identify Novel Disease Mechanisms

Omics approaches have advanced our understanding of complex respiratory diseases. However, our ability to generate omics data far exceeds our ability to interpret and validate findings that are biologically informed. Most omics studies characterize biological information from single modalities. Integrative analysis of multiomics data provides insights into disease mechanisms beyond those from single-layered omics data.

We used an asthma-related phenotype, glucocorticoid response, as a study model. Glucocorticoids, commonly used drugs for the treatment of asthma, are known to exert anti-inflammatory effects via binding to glucocorticoid receptors (GRs), a type of transcription factors, and modulating gene transcription. Some asthma patients respond pooly to glucocorticoids, in part due to genetic differences. Yet genome-wide association studies (GWAS) of glucocorticoid response did not find reproducible genetic associations that reach genome-wide levels of statistical significance.

Our previous transcriptomic integration study identified airway smooth muscle-specific gene expression signature of glucocorticoid response. To leverage nominal genetic associations, we developed multiomics integrative scores to rank these variants based on their functional annotations inferred from transcripotmic, ChIP-Seq, DNA motif, and eQTL data. This enabled us identify variants near the gene BIRC3 as a novel genetic locus that might influence patients’ response to glucocorticoids via modulation of GR signaling in airway cells.


This multiomics integrative score can be extended to prioritize norminal genetic associations for other complex phenotypes for further mechanistic studies.

Leverage Public Omics Datasets to Test Novel Hypotheses

With the advent of high throughput omics technologies, the volume of publicly available omics data has increased ever since. These data include experiments that compared disease versus healthy individuals as well as cells exposed to drugs versus vehicle control. Leveraging existing datasets offers experimental researchers a cost-effective avenue to test their novel hypotheses on disease mechanisms. Although these datasets are valuable, in their raw form, they are not helpful to experimental researchers who lack adequate computational resources or experience analyzing omics data.

To facilitate the reproducible analysis of publicly available omics data, we developed pipelines RAVED for transcriptomic data analysis and brocade for ChIP-Seq data analysis, and and an web app REALGAR that integrates analysis-ready omics data and allows end-users to visualize integrated results on-the-fly.

These open-source tools enabled identification of tissue-specific differentially expressed genes and differential transcription factor binding sites related to asthma and drug response, and prioritization of asthma-associated variants that might contribute to asthma via their influence on glucocorticoid receptor-modulated glucocorticoid response. Here is a nice illustration by Yoson when I presented RAVED and REALGAR in IBI retreat.


I collaborate with molecular biologists to analyze and interpret their omics data using these open-source tools and pipelines and have made significant scientific discoveries.





google scholar twitter

github github