TADfit is a multivariate linear regression model for profiling hierarchical chromatin domains on replicate Hi-C data

Erhu Liu, Hongqiang Lyu*, Qinke Peng, Yuan Liu, Tian Wang, Jiuqiang Han

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Topologically associating domains (TADs) are fundamental building blocks of the three-dimensional genome and are organized into complex hierarchies. Identifying hierarchical TADs in Hi-C data helps to clarify the relationship between genome architecture and gene regulation. Herein we propose TADfit, a multivariate linear regression model for profiling hierarchical chromatin domains. TADfit fits the interaction frequencies in a Hi-C contact matrix, with or without replicates, using all possible hierarchical TADs, and determines the significant ones from the regression coefficients, which are obtained with an online learning solver called Follow-The-Regularized-Leader (FTRL). Unlike existing methods, TADfit can handle multiple contact matrix replicates and find partially overlapping TADs in them, which helps to recover the comprehensive set of underlying TADs across replicates from different experiments. Comparative results show that TADfit achieves better accuracy and reproducibility, and that the hierarchical TADs it calls exhibit reasonable biological relevance.
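The regression idea in the abstract can be illustrated with a small toy sketch. This is not the TADfit implementation: the function names (`candidate_domains`, `design_matrix`, `ftrl_fit`), the indicator-feature encoding (a feature is 1 when both bins of a contact fall inside a candidate domain), the stacking of replicates as extra samples, and all hyperparameters are assumptions made here for illustration. The solver follows the commonly described per-coordinate FTRL-Proximal update, applied to squared loss.

```python
import numpy as np

def candidate_domains(n_bins, min_size=2):
    """Enumerate every contiguous interval [start, end) of at least min_size bins."""
    return [(s, e) for s in range(n_bins) for e in range(s + min_size, n_bins + 1)]

def design_matrix(n_bins, domains):
    """One row per upper-triangle pair (a, b); column k is 1 if domain k contains both bins."""
    pairs = [(a, b) for a in range(n_bins) for b in range(a, n_bins)]
    X = np.array([[1.0 if s <= a and b < e else 0.0 for (s, e) in domains]
                  for (a, b) in pairs])
    return pairs, X

def ftrl_fit(X, y, alpha=0.05, beta=1.0, l1=0.01, l2=0.0, epochs=50, seed=0):
    """FTRL-Proximal for squared loss: per-coordinate step sizes, L1 term for sparsity."""
    rng = np.random.default_rng(seed)
    z = np.zeros(X.shape[1])  # accumulated (adjusted) gradients
    n = np.zeros(X.shape[1])  # accumulated squared gradients

    def weights():
        # Closed-form weights; coordinates with |z| <= l1 are exactly zero.
        w = -(z - np.sign(z) * l1) / ((beta + np.sqrt(n)) / alpha + l2)
        w[np.abs(z) <= l1] = 0.0
        return w

    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            w = weights()
            g = (X[i] @ w - y[i]) * X[i]  # gradient of 0.5 * (prediction - y)^2
            sigma = (np.sqrt(n + g * g) - np.sqrt(n)) / alpha
            z += g - sigma * w
            n += g * g
    return weights()

# Toy data (hypothetical): two adjacent domains plus one nested sub-domain.
n_bins = 8
domains = candidate_domains(n_bins)
pairs, X = design_matrix(n_bins, domains)
truth = {(0, 4): 2.0, (4, 8): 2.0, (0, 2): 1.0}
w_true = np.array([truth.get(d, 0.0) for d in domains])

# Two noisy replicates, stacked as additional samples (an assumption of this sketch).
rng = np.random.default_rng(1)
replicates = [X @ w_true + rng.normal(0.0, 0.1, len(pairs)) for _ in range(2)]
X_all = np.vstack([X] * len(replicates))
y_all = np.concatenate(replicates)

w = ftrl_fit(X_all, y_all)
mse_fit = float(np.mean((X_all @ w - y_all) ** 2))
mse_zero = float(np.mean(y_all ** 2))

# Candidate domains ranked by absolute coefficient.
for k in np.argsort(-np.abs(w))[:5]:
    print(domains[k], round(float(w[k]), 3))
```

In this sketch, candidate domains with larger coefficients are read as better supported by the contact frequencies. The L1 term is what drives unsupported candidates to exactly zero. How the published method scores significance and combines replicates may differ.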

Original language: English
Article number: 608
Journal: Communications Biology
Volume: 5
Issue number: 1
State: Published - Dec 2022
