A Linear Time Pattern Matching Algorithm between a String and a Tree

Tatsuya AKUTSU  

IEICE TRANSACTIONS on Information and Systems   Vol.E77-D   No.3   pp.281-287
Publication Date: 1994/03/25
Online ISSN: 
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Algorithm and Computational Complexity
subtree,  subgraph isomorphism,  string matching,  suffix tree,  graph algorithms,  

Full Text: PDF>>
Buy this Article

This paper presents a linear time algorithm for testing whether or not there is a path <v1,・・・,vm> of an undiercted tree T (|V(T)|n) that coincides with a string ss1・・・sm (i.e., label(v1)・・・label(vm)s1・・・sm). Since any path of the tree is allowed, linear time substring matching algorithms can not be directly applied and a new method is developed. In the algorithm, O(n/m) vertices are selected from V(T) such that any path pf length more than m 2 must contain at least one of the selected vertices. A search is performed using the selected vertices as 'bases' and two tables of size O(m) are constructed for each of the selected vertices. A suffix tree, which is a well-known-data structure in string matching, is used effectively in the algorithm. From each of the selected vertices, a search is performed with traversing the suffix tree associated with s. Although the size of the alphabet is assumed to be bounded by a constant in this paper, the algorithm can be applied to the case of unbounded alphabets by increasing the time complexity to O(n log m).