Querying Web Pages with Lattice Expressions

Ping-Yu HSU  

IEICE TRANSACTIONS on Information and Systems   Vol.E82-D   No.1   pp.156-164
Publication Date: 1999/01/25
Print ISSN: 0916-8532
Type of Manuscript: Special Section PAPER (Special Issue on New Generation Database Technologies)
Category: Web and Document Databases
Keyword: WWW, internet, information retrieval, query language, lattices, model

To provide users with database-like query interfaces on HTML data, several systems have been developed to extract structures from HTML pages. Among them, tree-like structures and path expressions are the most popular modeling and navigating tools, respectively. Although path expressions are straightforward in representing top-down search patterns, they provide very limited help in representing bottom-up and in-breadth search patterns. In this paper, a lattice model is proposed to store Web data. The model provides an integrated mechanism to store the text, linking information, HTML hierarchy, and sequence order of HTML data. By incorporating lattice operators into comprehension syntax, we show that the query language can represent top-down, bottom-up, and in-breadth search patterns with uniform operators. It is also shown that lattice comprehensions can represent all operators of path expressions except Kleene closure.
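
The three search patterns named in the abstract can be illustrated with a small sketch. It is not the paper's lattice model or its operators: it assumes a plain element tree (a simplification of a lattice) and uses Python comprehensions as a stand-in for the comprehension syntax. The class `Node` and the helpers `descendants`, `ancestors`, and `siblings_after` are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass(eq=False)  # identity comparison; avoids recursing through parent links
class Node:
    """Hypothetical element node: tag, text, hierarchy, and child order."""
    tag: str
    text: str = ""
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)  # list position records sequence order
        return child


def descendants(n: Node) -> Iterator[Node]:
    """Top-down: every node below n, in document order."""
    for c in n.children:
        yield c
        yield from descendants(c)


def ancestors(n: Node) -> Iterator[Node]:
    """Bottom-up: every node above n, nearest first."""
    while n.parent is not None:
        n = n.parent
        yield n


def siblings_after(n: Node) -> List[Node]:
    """In-breadth: nodes that follow n under the same parent."""
    if n.parent is None:
        return []
    sibs = n.parent.children
    i = next(k for k, s in enumerate(sibs) if s is n)
    return sibs[i + 1:]


# A tiny made-up page: html > body > (h1, ul > (li, li), p)
html = Node("html")
body = html.add(Node("body"))
h1 = body.add(Node("h1", "Products"))
ul = body.add(Node("ul"))
widget = ul.add(Node("li", "Widget"))
gadget = ul.add(Node("li", "Gadget"))
p = body.add(Node("p", "Contact us"))

# Each query is a comprehension over one traversal helper.
top_down = [n.text for n in descendants(body) if n.tag == "li"]
bottom_up = [a.tag for a in ancestors(widget)]
in_breadth = [s.tag for s in siblings_after(h1)]
```

Here the three queries have the same shape and differ only in the traversal helper they draw from, which is the kind of uniformity the abstract attributes to lattice comprehensions; a path expression would express the first query directly but not the other two.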