
RRCA: Ultra-fast multiple in-species genome alignments

  • Humboldt University of Berlin

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Multiple sequence alignment (MSA) is an important method in bioinformatics, used, for instance, to reconstruct phylogenetic trees or to identify functional domains within genes. Finding an optimal MSA is computationally intractable, and therefore many alignment heuristics have been proposed. However, computing MSAs for sequences at chromosome or genome scale in reasonable time and with good alignment quality remains an open challenge. In this paper we propose RRCA, a very fast method for computing high-quality in-species MSAs at genome scale. RRCA uses referential compression to efficiently find long common subsequences in the sequences to be aligned. A collinear subcollection of these subsequences is used for an initial alignment, and the subsequences not yet covered are aligned recursively following the same approach. Our evaluation shows that RRCA achieves MSAs of similar quality to current state-of-the-art methods, while often being orders of magnitude faster for all our datasets. For instance, RRCA aligns eight human Chromosome 22 sequences (around 50 MB each) within one minute on a consumer computer, a task that takes competitors hours to days.
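The recursive anchor-and-fill idea described in the abstract can be illustrated with a small pairwise sketch. This is not the authors' implementation: it aligns only two sequences, it uses a k-mer index with greedy extension as a stand-in for referential compression, and it pads uncovered regions with gaps instead of aligning them properly. The function names `find_matches`, `collinear_chain`, and `align`, and the seed length `k`, are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def find_matches(ref, tgt, k):
    """Greedy scan of tgt: at each position take the longest exact match
    against ref, seeded by a k-mer index (a stand-in for referential
    compression, assumed here for illustration)."""
    index = defaultdict(list)
    for i in range(len(ref) - k + 1):
        index[ref[i:i + k]].append(i)
    matches, j = [], 0
    while j + k <= len(tgt):
        best_i, best_len = -1, 0
        for i in index.get(tgt[j:j + k], ()):
            n = k
            while i + n < len(ref) and j + n < len(tgt) and ref[i + n] == tgt[j + n]:
                n += 1
            if n > best_len:
                best_i, best_len = i, n
        if best_len:
            matches.append((best_i, j, best_len))  # (ref start, tgt start, length)
            j += best_len
        else:
            j += 1
    return matches


def collinear_chain(matches):
    """Heaviest subset of matches that is increasing and non-overlapping
    in both sequences (quadratic dynamic programming)."""
    if not matches:
        return []
    score = [m[2] for m in matches]
    prev = [-1] * len(matches)
    for b, (ib, jb, lb) in enumerate(matches):
        for a in range(b):
            ia, ja, la = matches[a]
            if ia + la <= ib and ja + la <= jb and score[a] + lb > score[b]:
                score[b], prev[b] = score[a] + lb, a
    b = max(range(len(matches)), key=score.__getitem__)
    chain = []
    while b != -1:
        chain.append(matches[b])
        b = prev[b]
    return chain[::-1]


def align(ref, tgt, k=8):
    """Anchor on the collinear chain, then recurse into uncovered regions."""
    if not ref or not tgt or k < 2:
        # Base case: naive gap padding, not a real alignment of the residue.
        width = max(len(ref), len(tgt))
        return ref.ljust(width, "-"), tgt.ljust(width, "-")
    chain = collinear_chain(find_matches(ref, tgt, k))
    if not chain:
        return align(ref, tgt, k // 2)  # retry with shorter seeds
    out_r, out_t, pi, pj = [], [], 0, 0
    for i, j, n in chain:
        gap_r, gap_t = align(ref[pi:i], tgt[pj:j], k)
        out_r += [gap_r, ref[i:i + n]]
        out_t += [gap_t, tgt[j:j + n]]
        pi, pj = i + n, j + n
    gap_r, gap_t = align(ref[pi:], tgt[pj:], k)
    return "".join(out_r + [gap_r]), "".join(out_t + [gap_t])


a, b = align("ACGTACGTTTGACCA", "ACGTACGTGACCA", k=4)
print(a)  # ACGTACGTTTGACCA
print(b)  # ACGTACGT--GACCA
```

In this toy run the two long exact matches become anchors and the two unmatched reference bases are padded with gaps in the target. A genome-scale tool would need a far more compact index and a proper strategy for the uncovered regions; those details are not given in the abstract and are not modelled here.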

Original language: English
Title of host publication: Algorithms for Computational Biology - First International Conference, AlCoB 2014, Proceedings
Publisher: Springer Verlag
Pages: 247-261
Number of pages: 15
ISBN (Print): 9783319079523
DOI
Publication status: Published - 2014
Externally published
Event: 1st International Conference on Algorithms for Computational Biology, AlCoB 2014 - Tarragona, Spain
Duration: 1 Jul 2014 to 3 Jul 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8542 LNBI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 1st International Conference on Algorithms for Computational Biology, AlCoB 2014
Country/Territory: Spain
City: Tarragona
Period: 1/07/14 to 3/07/14
