|
For Full-Text PDF, please login, if you are a member of IEICE,
or go to Pay Per View on menu list, if you are a nonmember of IEICE.
|
A Fast Parallel Algorithm for Indexing Human Genome Sequences
Woong-Kee LOH Kyoung-Soo HAN
Publication
IEICE TRANSACTIONS on Information and Systems
Vol.E97-D
No.5
pp.1345-1348 Publication Date: 2014/05/01 Online ISSN: 1745-1361
DOI: 10.1587/transinf.E97.D.1345 Type of Manuscript: LETTER Category: Data Engineering, Web Information Systems Keyword: human genome sequences, suffix tree, parallel algorithm, suffix array, disk-based index,
Full Text: PDF>>
Summary:
A suffix tree is widely adopted for indexing genome sequences. While supporting highly efficient search, the suffix tree has a few shortcomings such as very large size and very long construction time. In this paper, we propose a very fast parallel algorithm to construct a disk-based suffix tree for human genome sequences. Our algorithm constructs a suffix array for part of the suffixes in the human genome sequence and then converts it into a suffix tree very quickly. It outperformed the previous algorithms by Loh et al. and Barsky et al. by up to 2.09 and 3.04 times, respectively.
|
|