A Storage-Efficient Suffix Tree Construction Algorithm for Human Genome Sequences

Woong-Kee LOH  Heejune AHN  

IEICE TRANSACTIONS on Information and Systems   Vol.E94-D   No.12   pp.2557-2560
Publication Date: 2011/12/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E94.D.2557
Print ISSN: 0916-8532
Type of Manuscript: LETTER
Category: Biological Engineering
storage-efficient suffix tree,  human genome sequences,  divide-and-conquer,  

Full Text: PDF(404.3KB)>>
Buy this Article

The suffix tree is one of most widely adopted indexes in the application of genome sequence alignment. Although it supports very fast alignment, it has a couple of shortcomings, such as a very long construction time and a very large volume size. Loh et al. [7] proposed a suffix tree construction algorithm with dramatically improved performance; however, the size still remains as a challenging problem. We propose an algorithm by extending the one by Loh et al. to reduce the suffix tree size. As a result of our experiments, our algorithm constructed a suffix tree of approximately 60% of the size within almost the same time period.