TY - JOUR
T1 - Gene2DGE
T2 - A Perl Package for Gene Model Renewal with Digital Gene Expression Data
AU - Tang, Xiaoli
AU - Deng, Libin
AU - Zhang, Dake
AU - Lin, Jiari
AU - Wei, Yi
AU - Zhou, Qinqin
AU - Li, Xiang
AU - Li, Guilin
AU - Liang, Shangdong
PY - 2012/2
Y1 - 2012/2
N2 - For transcriptome analysis, it is critical to precisely define all the transcripts across the whole genome. More and more digital gene expression (DGE) scannings have indicated the presence of huge amount of novel transcripts in addition to the known gene models. However, almost all these studies still depend crucially on existing annotation. Here, we present Gene2DGE, a Perl software package for gene model renewal with DGE data. We applied Gene2DGE to the mouse blastomere transcriptome, and defined 98,532 read-enriched regions (RERs) by read clustering supported by more than four reads for each base pair. Taking advantage of this ab initio method, we refined 2,104 exonic regions (4% of a total of 48,501 annotated transcribed regions) with remarkable extension into un-annotated regions (>50 bp). For 5% of uniquely mapped reads falling within intron regions, we identified 13,291 additional possible exons. As a result, we renewed 4,788 gene models, which account for 39% of a total of 12,277 transcribed genes. Furthermore, we identified 12,613 intergenic RERs, suggesting the possible presence of novel genes outside the existing gene models. In this study, therefore, we have developed a suitable tool for renewal of known gene models by ab initio prediction in transcriptome dissection. The Gene2DGE package is freely available at http://bighapmap.big.ac.cn/.
AB - For transcriptome analysis, it is critical to precisely define all the transcripts across the whole genome. More and more digital gene expression (DGE) scannings have indicated the presence of huge amount of novel transcripts in addition to the known gene models. However, almost all these studies still depend crucially on existing annotation. Here, we present Gene2DGE, a Perl software package for gene model renewal with DGE data. We applied Gene2DGE to the mouse blastomere transcriptome, and defined 98,532 read-enriched regions (RERs) by read clustering supported by more than four reads for each base pair. Taking advantage of this ab initio method, we refined 2,104 exonic regions (4% of a total of 48,501 annotated transcribed regions) with remarkable extension into un-annotated regions (>50 bp). For 5% of uniquely mapped reads falling within intron regions, we identified 13,291 additional possible exons. As a result, we renewed 4,788 gene models, which account for 39% of a total of 12,277 transcribed genes. Furthermore, we identified 12,613 intergenic RERs, suggesting the possible presence of novel genes outside the existing gene models. In this study, therefore, we have developed a suitable tool for renewal of known gene models by ab initio prediction in transcriptome dissection. The Gene2DGE package is freely available at http://bighapmap.big.ac.cn/.
KW - Ab initio prediction
KW - Annotation
KW - Transcriptome
UR - https://www.scopus.com/pages/publications/84863351894
U2 - 10.1016/S1672-0229(11)60033-8
DO - 10.1016/S1672-0229(11)60033-8
M3 - Article
C2 - 22449401
AN - SCOPUS:84863351894
SN - 1672-0229
VL - 10
SP - 51
EP - 54
JO - Genomics, Proteomics and Bioinformatics
JF - Genomics, Proteomics and Bioinformatics
IS - 1
ER -