Logical Structure Analysis of Document Images Based on Emergent Computation


IEICE TRANSACTIONS on Information and Systems   Vol.E88-D   No.8   pp.1831-1842
Publication Date: 2005/08/01
DOI: 10.1093/ietisy/e88-d.8.1831
Print ISSN: 0916-8532
Type of Manuscript: Special Section PAPER (Special Section on Document Image Understanding and Digital Documents)
Category: Document Structure
Keywords: document image analysis, logical structure analysis, layout analysis, artificial life, emergent computation

This paper proposes a new method for logical structure analysis of document images, intended as the basis for a document reader that can extract logical information from a variety of printed documents. The proposed system consists of five basic modules: text line classification, object recognition, object segmentation, object grouping, and object modification. Emergent computation, a key concept in artificial life, governs the cooperative interaction among these modules so that the system as a whole behaves effectively and flexibly. The method has three principal advantages over other methods: adaptive system configuration for varied and complex logical structures, robust document analysis that tolerates erroneous feature detection, and feedback of high-level logical information to the low-level physical process for accurate analysis. Experimental results on 150 documents show that the method is adaptable, robust, and effective across various document structures.